Avoid grabbing nothing from string stream
I'm working on an assembler for a very basic ISA. Currently I'm implementing a parser function, and I'm using a string stream to grab words from lines. Here's an example of the assembly code:
; This program counts from 10 to 0
.ORIG x3000
LEA R0, TEN ; This instruction will be loaded into memory location x3000
LDW R1, R0, #0
START ADD R1, R1, #-1
BRZ DONE
BR START
; blank line
DONE TRAP x25 ; The last executable instruction
TEN .FILL x000A ; This is 10 in 2's comp, hexadecimal
.END
Don't worry about the nature of the assembly code, simply look at line 3, the one with the comment to the right. My parser functions aren't complete, but here's what I have:
// Define three conditions to code
enum {DONE, OK, EMPTY_LINE};
// Tuple containing a condition and a string vector
typedef tuple<int,vector<string>> Code;
// Passed an alias to a string
// Parses the line passed to it
Code ReadAndParse(string& line)
{
/***********************************************/
/****************REMOVE COMMENTS****************/
/***********************************************/
// Sentinel to flag down position of first
// semicolon and the index position itself
bool found = false;
size_t semicolonIndex = -1;
// Convert the line to lowercase
for(int i = 0; i < line.length(); i++)
{
line[i] = tolower(line[i]);
// Find first semicolon
if(line[i] == ';' && !found)
{
semicolonIndex = i;
// Throw the flag
found = true;
}
}
// Erase anything to and from semicolon to ignore comments
if(found != false)
line.erase(semicolonIndex);
/***********************************************/
/*****TEST AND SEE IF THERE'S ANYTHING LEFT*****/
/***********************************************/
// To snatch and store words
Code code;
string token;
stringstream ss(line);
vector<string> words;
// While the string stream is still of use
while(ss.good())
{
// Send the next string to the token
ss >> token;
// Push it onto the words vector
words.push_back(token);
// If all we got was nothing, it's an empty line
if(token == "")
{
code = make_tuple(EMPTY_LINE, words);
return code;
}
}
/***********************************************/
/***********DETERMINE OUR TYPE OF CODE**********/
/***********************************************/
// At this point it should be fine
code = make_tuple(OK, words);
return code;
}
As you can see, the Code tuple contains a condition represented in the enum declaration and a vector containing all words in the line. What I want is to have every word in a line pushed into the vector and then returned.
The issue arises on the third call of the function (the third line of the assembly code). I use the ss.good() function to determine if I have any words in the string stream. For some reason the ss.good() function returns true even though there is no fourth word in the third line and I end up having the words [lea] [r0,] [ten] and [ten] pushed into the vector. ss.good() is true on the fourth call and token receives nothing, thus I have [ten] pushed into the vector twice.
I notice if I remove the spaces between the semicolon and the last word, this error doesn't occur. I want to know how to get the right number of words pushed into the vector.
Please don't recommend Boost library. I love the library, but I want to keep this project simple. This is nothing big, there's only a dozen instructions for this processor. Also, bear in mind that this function is only half-baked, I'm testing and debugging it incrementally.
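The stream's state flags only get set after a read has actually run into the condition (such as reaching the end of the stream). So ss.good() can still be true just before an extraction that fails, and a failed extraction leaves token holding its previous value.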
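To see this in isolation, here is a small sketch that records what good() reports before each read. The helper name goodBeforeEachRead is made up for illustration and is not part of the question's code.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: records the value of good() immediately
// before every extraction attempt on the given line.
std::vector<bool> goodBeforeEachRead(const std::string& line)
{
    std::stringstream ss(line);
    std::string token;
    std::vector<bool> states;
    do
    {
        states.push_back(ss.good()); // state *before* the read
        ss >> token;                 // may fail; token is then left unchanged
    } while (!ss.fail());            // stop once a read has failed
    return states;
}
```

For "lea r0, ten " (with a trailing space) the fourth entry is true: the stream only discovers the end while attempting the fourth read. Without the trailing space, reading "ten" itself runs into the end of the stream, so good() is already false before a fourth read. That matches the observation that removing the spaces before the semicolon hides the problem.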
Try replacing your loop condition with:
while(ss >> token)
{
// Push it onto the words vector
words.push_back(token);
}
// A successful extraction never yields an empty token, so test
// for an empty line after the loop instead
if(words.empty())
{
code = make_tuple(EMPTY_LINE, words);
return code;
}
With this code, I get the following tokens for line 3:
"LEA"
"R0,"
"TEN"
";"
"This"
"instruction"
"will"
"be"
"loaded"
"into"
"memory"
"location"
"x3000"
I know the language you're trying to parse is a simple one. Nonetheless you would do yourself a favour if you would consider using a specialized tool for the job such as, for example, flex.
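If flex feels like too much for a dozen instructions, the string stream approach can be kept. Below is a minimal sketch of the whole function, assuming the Code tuple and enum from the question. The use of find and std::transform is a substitution for the manual loop, not the asker's code.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <tuple>
#include <vector>

// Same conditions and tuple type as in the question
enum { DONE, OK, EMPTY_LINE };
typedef std::tuple<int, std::vector<std::string>> Code;

Code ReadAndParse(std::string& line)
{
    // Drop the comment: everything from the first semicolon onward
    const std::string::size_type semicolon = line.find(';');
    if (semicolon != std::string::npos)
        line.erase(semicolon);

    // Convert what is left to lowercase
    std::transform(line.begin(), line.end(), line.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    // The extraction itself is the loop condition, so a failed
    // read never pushes a stale token
    std::stringstream ss(line);
    std::string token;
    std::vector<std::string> words;
    while (ss >> token)
        words.push_back(token);

    // No words at all means a blank or comment-only line
    const int status = words.empty() ? EMPTY_LINE : OK;
    return Code(status, words);
}
```

Called on the third line of the sample program, this should return OK with the three words lea, r0, and ten. A comment-only line returns EMPTY_LINE with an empty vector.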