ATL regex to parse CSV files
Can someone tell me what is wrong with the code below? I am trying to parse CSV files with this program, but it returns zero in the m_uNumGroups field.
    int _tmain(int argc, _TCHAR* argv[])
    {
        CAtlRegExp<> reUrl;
        // Five match groups: scheme, authority, path, query, fragment
        REParseError status = reUrl.Parse(L"[^\",]+|(?:[^\"])|\"\")");
        if (REPARSE_ERROR_OK != status)
        {
            // Unexpected error.
            return 0;
        }
        TCHAR testing[ ] = L"It’ s \" 10 Grand\" , baby";
        CAtlREMatchContext<> mcUrl;
        if (!reUrl.Match(testing, &mcUrl))
        {
            // Unexpected error.
            return 0;
        }
        for (UINT nGroupIndex = 0; nGroupIndex < mcUrl.m_uNumGroups; ++nGroupIndex)
        {
            const CAtlREMatchContext<>::RECHAR* szStart = 0;
            const CAtlREMatchContext<>::RECHAR* szEnd = 0;
            mcUrl.GetMatch(nGroupIndex, &szStart, &szEnd);
            ptrdiff_t nLength = szEnd - szStart;
            printf_s("%d: \"%.*s\"\n", nGroupIndex, nLength, szStart);
        }
        return 0;
    }
With ATL regular expression syntax, you need to put curly braces around each expression you want to capture. Your expression does not have any, so you are only doing a plain match without sub-expressions, which is why m_uNumGroups is zero.
Check this out: http://msdn.microsoft.com/en-us/library/k3zs4axe%28v=vs.80%29.aspx
{ } Indicates a match group. The actual text in the input that matches the expression inside the braces can be retrieved through the CAtlREMatchContext object.
I don't know C++, but if you're trying to parse "It’ s \" 10 Grand\" , baby" into It’ s \" 10 Grand\" and baby, then this fails for several reasons:
- Because that string is not valid CSV syntax. In CSV, quotes within fields need to be escaped by doubling (yours aren't escaped at all, only at the string level), and fields that contain quotes must be surrounded by quotes. A valid CSV string would be "\"It’ s \"\" 10 Grand\"\"\", baby".
- Because your regex is wrong. Parsing CSV with regexes is hard, if not impossible, because of all the gotchas involved. Search Stack Overflow for "csv regex" and you will find that you should use a CSV parser instead.
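To illustrate what "use a CSV parser" can mean in practice, here is a minimal sketch of a hand-written field splitter in standard C++. It is not part of ATL; the name splitCsvLine is made up for this example. It assumes one record per line (no line breaks inside quoted fields), commas as separators, and quotes escaped by doubling, as described above. A real CSV library will handle more edge cases.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not an ATL or standard library function):
// splits a single CSV record into its fields.
// Assumptions: comma separator, quotes escaped by doubling,
// no line breaks inside quoted fields.
std::vector<std::string> splitCsvLine(const std::string& line)
{
    std::vector<std::string> fields;
    std::string current;
    bool inQuotes = false;

    for (std::size_t i = 0; i < line.size(); ++i)
    {
        const char c = line[i];
        if (inQuotes)
        {
            if (c == '"')
            {
                if (i + 1 < line.size() && line[i + 1] == '"')
                {
                    current += '"';   // doubled quote: one literal quote
                    ++i;              // skip the second quote of the pair
                }
                else
                {
                    inQuotes = false; // closing quote of the field
                }
            }
            else
            {
                current += c;         // commas are literal inside quotes
            }
        }
        else if (c == '"' && current.empty())
        {
            inQuotes = true;          // opening quote at the start of a field
        }
        else if (c == ',')
        {
            fields.push_back(current); // separator: finish the current field
            current.clear();
        }
        else
        {
            current += c;
        }
    }
    fields.push_back(current);        // last field (may be empty)
    return fields;
}
```

Called on the valid CSV record from above (written with a plain apostrophe), splitCsvLine("\"It's \"\" 10 Grand\"\"\", baby") gives two fields: It's " 10 Grand" and  baby. Note that the space after the comma is kept as part of the second field; trimming it is a separate decision.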