Regexp & back references
Guys, I need help extracting from the string
"AAA, BBB", "CCC", DDDD
the following groups:
- AAA, BBB
- CCC
- DDDD
Is it possible to extract these three groups with a regex, and if so, how?
Thanks.
The function
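// Requires: using System; using System.Linq; using System.Text.RegularExpressions;
public void RunTest()
{
    const string toTest = "\"AAA, BBB\", \"CCC\", \"DDDD\"";
    // \G anchors each match to the end of the previous one; group 1 is the text inside the quotes.
    var exp = new Regex("\\G(?:^|,)\\s*\"([^\"]*)\"");
    var matches = exp.Matches(toTest);
    foreach (var match in matches.Cast<Match>())
    {
        Console.WriteLine(@"Matched expression: {0}", match);
        // Groups[0] is the whole match, Groups[1] the captured content.
        foreach (var group in match.Groups.Cast<Group>())
        {
            Console.WriteLine(@"Matched group: {0}", group);
        }
    }
}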
will return
Matched expression: "AAA, BBB"
Matched group: "AAA, BBB"
Matched group: AAA, BBB
Matched expression: , "CCC"
Matched group: , "CCC"
Matched group: CCC
Matched expression: , "DDDD"
Matched group: , "DDDD"
Matched group: DDDD
So if you take the second group of each match (match.Groups[1]), you get what I suppose you wanted.
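Taking only the second group of each match, as described above, could look like the following sketch. The class name QuotedFieldDemo and the helper ExtractQuoted are made up for illustration, and the pattern assumes every field is wrapped in double quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedFieldDemo
{
    // Each match must start where the previous one ended (\G), may consume a
    // separating comma plus whitespace, and captures the quoted text in group 1.
    private static readonly Regex QuotedField =
        new Regex("\\G(?:^|,)\\s*\"([^\"]*)\"");

    // Hypothetical helper: keeps only Groups[1] (the text between the quotes).
    public static List<string> ExtractQuoted(string input)
    {
        return QuotedField.Matches(input)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .ToList();
    }

    public static void Main()
    {
        // Prints AAA, BBB / CCC / DDDD, one per line.
        foreach (var field in ExtractQuoted("\"AAA, BBB\", \"CCC\", \"DDDD\""))
        {
            Console.WriteLine(field);
        }
    }
}
```

Because of \G, matching stops at the first field that does not fit the pattern, so malformed input yields fewer results rather than skipping ahead.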
Note that I added double quotes around your DDDD. I thought that was a typo.
If it's not a typo, you can try this regular expression, which captures a quoted field in group 1 and an unquoted one in group 2:
var exp = new Regex("\\G(?:^|,)(?:\\s*(?:\"([^\"]*)\")|([^\",]*))");
Explanations:
\G The match must occur at the point where the previous match ended.
[^"] Any character except the double quote
\s Any whitespace
* zero or more occurrences of the preceding element
( and ) define a group
(?: defines a noncapturing group
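To see these pieces working together on the original input (with DDDD left unquoted), here is a minimal sketch. It uses a slight variation of the pattern in which \s* sits in front of both alternatives, so an unquoted field does not pick up the leading space; that rearrangement, the class name MixedFieldDemo and the helper ExtractFields are my assumptions, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class MixedFieldDemo
{
    // Group 1: content of a quoted field. Group 2: an unquoted field
    // (any run of characters that are neither a quote nor a comma).
    private static readonly Regex Field =
        new Regex("\\G(?:^|,)\\s*(?:\"([^\"]*)\"|([^\",]*))");

    // Hypothetical helper: returns whichever group took part in each match.
    public static List<string> ExtractFields(string input)
    {
        return Field.Matches(input)
            .Cast<Match>()
            .Select(m => m.Groups[1].Success
                ? m.Groups[1].Value          // quoted: keep as is
                : m.Groups[2].Value.Trim())  // unquoted: drop trailing whitespace
            .ToList();
    }

    public static void Main()
    {
        // Prints AAA, BBB / CCC / DDDD, one per line.
        foreach (var field in ExtractFields("\"AAA, BBB\", \"CCC\", DDDD"))
        {
            Console.WriteLine(field);
        }
    }
}
```

Checking Groups[1].Success is what distinguishes the two alternatives; a group that did not take part in the match has an empty Value, so testing the value alone would confuse it with an empty quoted field.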
Hope that helps :)
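You can also split on every comma that is followed by an even number of double quotes, i.e. a comma outside any quoted field:
// input is the string to split; requires using System.Text.RegularExpressions;
var delimiterPattern = @",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))";
var parts = Regex.Split(input, delimiterPattern);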
For the string
"toys "r" us", "AAAA", "toys "r" us","toys ,r,","toys ,"r",",test
it will return (apart from leading whitespace, which the split keeps):
1. "toys "r" us"
2. "AAAA"
3. "toys "r" us"
4. "toys ,r,"
5. "toys ,"r","
6. test
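Applied to the string from the question, the split approach could be sketched as follows. The class name SplitOnOuterCommasDemo and the helper SplitFields are hypothetical, and trimming whitespace and outer quotes afterwards is my addition to get the exact groups asked for.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitOnOuterCommasDemo
{
    // A comma counts as a delimiter only if an even number of double quotes
    // follows it, i.e. the comma is not inside a quoted field.
    private const string DelimiterPattern =
        @",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))";

    // Hypothetical helper: splits, then strips surrounding whitespace and quotes.
    public static string[] SplitFields(string input)
    {
        return Regex.Split(input, DelimiterPattern)
            .Select(part => part.Trim().Trim('"'))
            .ToArray();
    }

    public static void Main()
    {
        // Prints AAA, BBB / CCC / DDDD, one per line.
        foreach (var part in SplitFields("\"AAA, BBB\", \"CCC\", DDDD"))
        {
            Console.WriteLine(part);
        }
    }
}
```

The pattern contains only non-capturing groups and lookaheads, which matters here: Regex.Split would otherwise add the text of any capturing group to the result array.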