How to check if the first char in the line is # (beginning of a comment)
I have been following this convention thus far:
std::string line;
while (std::getline(in, line))
{
    if (line.size() && line[0] == '#')
        continue;
    /* parse text */
}
The obvious drawback is that a comment may not begin at the first character when the line has leading whitespace.
What is a good way to deal with this sort of thing?
Simple enhancement: use line.find_first_not_of(" \t") to find the first non-whitespace character, then check whether it is a '#'. That also covers the zero-length case. Something like this fragment:
std::string::size_type found = line.find_first_not_of(" \t");
if (found != std::string::npos)
{
    if (line[found] == '#')
        continue;
}
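To make the fragment above reusable, it could be wrapped in small helper functions. This is only a sketch: the names is_comment_line and is_blank_line are made up for illustration, and it assumes that only spaces and tabs count as leading whitespace.

```cpp
#include <string>

// Returns true when the first non-blank character of `line` is '#'.
// "Blank" means space or tab here; extend the set if the format needs it.
bool is_comment_line(const std::string& line)
{
    const std::string::size_type first = line.find_first_not_of(" \t");
    return first != std::string::npos && line[first] == '#';
}

// Returns true when `line` is empty or holds only spaces and tabs.
bool is_blank_line(const std::string& line)
{
    return line.find_first_not_of(" \t") == std::string::npos;
}
```

Inside the getline loop, the check then becomes if (is_blank_line(line) || is_comment_line(line)) continue;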
Use the operator >>. It ignores whitespace.
std::string line;
while (std::getline(in, line))
{
    std::stringstream linestr(line);
    char firstNoWhiteSpaceChar;
    linestr >> firstNoWhiteSpaceChar;
    if ((!linestr) || (firstNoWhiteSpaceChar == '#'))
    {
        // If line contains only white space then linestr will become invalid.
        // As the equivalent of EOF is set. This is the same as a comment so
        // we can ignore the line like a comment.
        continue;
    }
    // Do Stuff with line.
}
You should make sure to check the string length before testing character zero:
if (line.length() > 0 && line[0] == '#')
From the sound of things, your file format specifies that everything from '#' to the end of a line is a comment. If that's the case, you can find the beginning of the comment with:
// Warning: untested code.
std::string::size_type pos = line.find('#');
Then, you presumably want to ignore the rest of the line, most easily managed by deleting it:
if (pos != std::string::npos)
    line.erase(pos);
This should deal quite easily with things like:
tax = rate * price # figure tax on item
Of course, this assumes that a '#' always signals the beginning of a comment. If you allow '#' inside character strings, or for any other purpose, you'll need to take that into account (but it's hard to guess what that would be, since you've told us very little about the file format).
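Put together, the find-and-erase steps could look roughly like the function below. The name strip_comment is hypothetical, and trimming the trailing blanks left in front of the comment is an extra step that is not strictly required. It keeps the same assumption as above: a '#' never appears inside a string literal.

```cpp
#include <string>

// Removes everything from the first '#' to the end of the line, then drops
// trailing spaces and tabs. Assumes '#' never appears inside a string literal.
std::string strip_comment(std::string line)
{
    const std::string::size_type hash = line.find('#');
    if (hash != std::string::npos)
        line.erase(hash);              // erase from the '#' to the end

    const std::string::size_type last = line.find_last_not_of(" \t");
    if (last == std::string::npos)
        line.clear();                  // nothing but blanks is left
    else
        line.erase(last + 1);          // cut the blanks after the last real character
    return line;
}
```

A line that was only a comment comes back empty, so the caller can skip it with a plain empty() check.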
Use the stream's facility to skip whitespace, std::ws:
inline std::istream& get_line(std::istream& in, std::string& line)
{
    in >> std::ws;
    std::getline(in, line);
    return in;
}
std::string line;
while (get_line(in, line))
{
    if (!line.empty() && line[0] == '#')
        continue;
    /* parse text */
}
You might like the Boost String Library, specifically trim_left and starts_with.
Parsing is tricky and difficult.
In general, I would not recommend trying to parse without a state machine. For example, what if the '#' is part of a multiline string ("""...""" in Python)?
There are libraries that may simplify parsing (they are supposed to, at least, though understanding them can be challenging if you have no prior experience). In C++, for example, I can only recommend Spirit.
Some pointers have already been suggested that use string methods, though they only deal with detecting whether the first meaningful character is a '#'.
If you do not 'fear' multiline strings (that is, if what you are trying to parse has no such feature), you will still have to manage 'simple' lines, which can be done by counting quotes and taking escapes into account:
print "My \"#\" is: ", phoneNumber # This is my phone number
If you parse this line badly, you'll end up with an error (for example, by treating the '#' inside the string as the start of a comment).
If you cannot use a library, a state machine is the way to go. Writing a parser is quite fun in general, and it gives you insight into why a notation was designed the way it was.
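As a rough illustration of the state-machine idea, here is a minimal sketch for single lines. The function name find_comment_start and the quoting rules (double-quoted strings, with a backslash escaping the next character) are assumptions, since the actual file format is unknown. Multiline strings are not handled.

```cpp
#include <string>

// Returns the index of the '#' that starts a comment, or std::string::npos.
// A '#' inside a double-quoted string is ignored, and a backslash inside a
// string escapes the next character. These quoting rules are an assumption.
std::string::size_type find_comment_start(const std::string& line)
{
    enum State { Code, InString, Escaped };
    State state = Code;
    for (std::string::size_type i = 0; i < line.size(); ++i)
    {
        const char c = line[i];
        switch (state)
        {
        case Code:
            if (c == '#') return i;          // comment starts here
            if (c == '"') state = InString;  // opening quote
            break;
        case InString:
            if (c == '\\') state = Escaped;  // next character is escaped
            else if (c == '"') state = Code; // closing quote
            break;
        case Escaped:
            state = InString;                // consume the escaped character
            break;
        }
    }
    return std::string::npos;
}
```

For the print line above, this skips the '#' inside the quoted string and reports the one in front of "This is my phone number". The caller can then erase from that position, as in the simpler find-based approach.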