
std::getline for a comma delimited table file with quotations around certain fields

I'm basically running the following code. It goes through the file line by line and grabs the different fields of a generic comma-delimited table file. My problem is that the "title" field sometimes has commas in it. When it does, it's surrounded by quotation marks, like so: "this, this is my title". But when my code sees the comma, it treats everything after it as the next field. Not all titles have quotes around them, only the ones with commas in them. How can I get my code to check for this?

Thanks a lot, y'all. It kind of means a lot to my gainful employment!

while (getline(BookLine, ImpLine, '\n'))  // Get each line
{
   // create a string stream from the standard string
   std::istringstream StrLine(ImpLine);

   std::string
   bookNumber,
   chk,
   author,
   title,
   edition;

   // Parse lines
   std::getline(StrLine,bookNumber,',');
   std::getline(StrLine,chk,',');
   std::getline(StrLine,author,',');
   std::getline(StrLine,title,',');            
   std::getline(StrLine,edition,',');
}


Doing this well is kind of complex. Basically, you read the first character of the field. If it's not a quote, you read to the next comma. If it is a quote, you read to the next quote. Then you peek at the next character and see if it's another quote. If it is, you read to the next quote again and append what you read to what you read the first time, keeping only one of the two quotes (i.e., a quote inside a quoted string is represented by two consecutive quote marks). When you get to a quote that's followed by something other than a quote (normally a comma), you've reached the end of that field.
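The steps above could be sketched roughly as follows. This is only an illustration, not a full CSV parser: the names readCsvField and splitCsvLine are made up for this example, and it assumes each record sits on a single line (no newlines inside quoted fields), as in the question's loop.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads one field from `in` and consumes the comma
// that follows it, if there is one.
std::string readCsvField(std::istream& in)
{
    std::string field;
    if (in.peek() != '"') {
        // Unquoted field: read up to the next comma (or the end of the line).
        std::getline(in, field, ',');
        return field;
    }
    in.get();  // discard the opening quote
    std::string chunk;
    while (std::getline(in, chunk, '"')) {  // read up to the next quote
        field += chunk;
        if (in.peek() == '"') {
            in.get();      // doubled quote: keep one literal quote...
            field += '"';  // ...and keep reading the same field
        } else {
            break;         // a lone quote closes the field
        }
    }
    // Skip whatever follows the closing quote, up to and including the comma.
    std::string rest;
    std::getline(in, rest, ',');
    return field;
}

// Hypothetical helper: splits one line into its fields.
std::vector<std::string> splitCsvLine(const std::string& line)
{
    std::istringstream in(line);
    std::vector<std::string> fields;
    do {
        fields.push_back(readCsvField(in));
    } while (!in.eof());  // eof is set once a field ran to the end of the line
    return fields;
}
```

Inside the question's while loop you would then call splitCsvLine(ImpLine) and pick bookNumber, chk, author, title and edition out of the returned vector by index, after checking that it has at least five elements.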


Haven't tested it, but roughly you want...

std::vector<std::string> values;
std::string value;
bool in_quoted = false;

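for (const char* p = ImpLine.c_str(); *p; ++p)
{
    if (*p == ',' && !in_quoted)
    {
        // an unquoted comma ends the current field
        values.push_back(value);
        value.clear();
    }
    else if (*p == '"')
    {
        if (!in_quoted)
            in_quoted = true;      // opening quote
        else if (p[1] == '"')
            value += *++p;         // doubled quote: keep one literal quote
        else
            in_quoted = false;     // closing quote
    }
    else
    {
        value += *p;
    }
}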

values.push_back(value);

(You may want to tweak it to trim the fields of surrounding whitespace.)
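One possible way to do that trimming is a small helper applied to each field after parsing. The name trim and the set of characters treated as whitespace are assumptions made for this sketch. Note that trimming after parsing also strips whitespace that was inside the quotes, which may or may not be what you want.

```cpp
#include <string>

// Hypothetical helper: returns `s` without leading and trailing
// spaces, tabs, carriage returns and newlines.
std::string trim(const std::string& s)
{
    const char* whitespace = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(whitespace);
    if (first == std::string::npos)
        return "";  // the field is empty or all whitespace
    const std::string::size_type last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}
```

For example, values.push_back(trim(value)) in place of values.push_back(value). Including '\r' in the set also drops the stray carriage return left at the end of each line when a file with Windows line endings is read with '\n' as the delimiter.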
