Grouping string by comma between brackets
Response to: Regular Expression to find a string included between two characters while EXCLUDING the delimiters
Hi, I'm looking for a regex pattern that applies to a string containing bracketed groups: [1,2,3,4,5] [abc,ef,g] [0,2,4b,y7]. The items could be anything: word, digit and non-word characters, together or separated.
I can get the group between brackets with \[(.*?)\]
but what regex pattern will give me both the group between brackets and the sub-group strings separated by commas, so that the result looks like the following?
Group1 : 1,2,3,4,5
  Group1: 1
  Group2: 2
  Group3: 3
  Group4: 4
  Group5: 5
Group2 : abc,ef,g
  Group1: abc
  Group2: ef
  Group3: g
etc ..
Thank you for your help.
I agree with @Dav that you would be best using String.Split on each square-bracketed group.
However, you can extract all the data using a single regular expression:
(?:\s*\[((.*?)(?:,(.+?))*)\])+
Using this expression, you will have to process all the captures of each group to get all the data. As an example, run the following code on your string:
using System;
using System.Text.RegularExpressions;

var regex = new Regex(@"(?:\s*\[((.*?)(?:,(.+?))*)\])+");
var match = regex.Match(@"[1,2,3,4,5] [abc,ef,g] [0,2,4b,y7]");
// Group 0 is the whole match, so start at group 1.
for (var i = 1; i < match.Groups.Count; i++)
{
    var group = match.Groups[i];
    Console.WriteLine("Group " + i);
    // Each group keeps every capture it made while the outer (...)+ repeated.
    for (var j = 0; j < group.Captures.Count; j++)
    {
        var capture = group.Captures[j];
        Console.WriteLine(" Capture " + j + ": " + capture.Value
            + " at " + capture.Index);
    }
}
This produces the following output:
Group 1
 Capture 0: 1,2,3,4,5 at 1
 Capture 1: abc,ef,g at 13
 Capture 2: 0,2,4b,y7 at 24
Group 2
 Capture 0: 1 at 1
 Capture 1: abc at 13
 Capture 2: 0 at 24
Group 3
 Capture 0: 2 at 3
 Capture 1: 3 at 5
 Capture 2: 4 at 7
 Capture 3: 5 at 9
 Capture 4: ef at 17
 Capture 5: g at 20
 Capture 6: 2 at 26
 Capture 7: 4b at 28
 Capture 8: y7 at 31
Group 1 gives you the value of each square-bracketed group, group 2 gives you the first item matched in each square-bracketed group and group 3 gives you all the subsequent items. You will have to look at the indexes of the captures to determine which item belongs to each square-bracketed group.
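The index bookkeeping described above could be sketched roughly as follows. This is only an illustration, not part of the original answer: the class name CaptureGrouper and the method Group are hypothetical, and the sketch assumes that an item belongs to a bracketed group when its capture lies inside the index range of that group's capture.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class CaptureGrouper
{
    // Hypothetical helper: pairs each bracketed group with the items inside it.
    public static List<KeyValuePair<string, string[]>> Group(string input)
    {
        var regex = new Regex(@"(?:\s*\[((.*?)(?:,(.+?))*)\])+");
        var match = regex.Match(input);
        var result = new List<KeyValuePair<string, string[]>>();

        // Items come from group 2 (first item) and group 3 (subsequent items).
        var itemCaptures = match.Groups[2].Captures.Cast<Capture>()
            .Concat(match.Groups[3].Captures.Cast<Capture>())
            .ToList();

        foreach (Capture outer in match.Groups[1].Captures)
        {
            int start = outer.Index;
            int end = outer.Index + outer.Length;

            // Assumption: an item belongs to this bracket pair if it lies
            // entirely within the index range of the group 1 capture.
            string[] items = itemCaptures
                .Where(c => c.Index >= start && c.Index + c.Length <= end)
                .OrderBy(c => c.Index)
                .Select(c => c.Value)
                .ToArray();

            result.Add(new KeyValuePair<string, string[]>(outer.Value, items));
        }
        return result;
    }
}

class CaptureGrouperDemo
{
    static void Main()
    {
        foreach (var pair in CaptureGrouper.Group("[1,2,3,4,5] [abc,ef,g] [0,2,4b,y7]"))
        {
            Console.WriteLine(pair.Key + " -> " + string.Join(" | ", pair.Value));
        }
    }
}
```

For the sample string, each bracketed group should come out paired with its own items in their original order.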
Here's another option that uses CaptureCollections (the only way to do this in a single regex). Where Phil Ross's answer does it all in one match operation, this one does multiple matches. This way, all the individual-item captures are properly grouped according to the bracket pairs where they were found.
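using System;
using System.Text.RegularExpressions;

string s = @"[1,2,3,4,5] [abc,ef,g] [0,2,4b,y7] ";
Regex r = new Regex(@"\[((?:([^,\[\]]+),?)*)\]");
int matchNum = 0;
// One match per bracket pair; group 1 is the full contents.
foreach (Match m in r.Matches(s))
{
    Console.WriteLine("Match {0}, Group 1: {1}", ++matchNum, m.Groups[1]);
    int captureNum = 0;
    // Group 2 captures once per comma-separated item within this match.
    foreach (Capture c in m.Groups[2].Captures)
    {
        Console.WriteLine(" Group 2, Capture {0}: {1}", ++captureNum, c);
    }
}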
output:
Match 1, Group 1: 1,2,3,4,5
 Group 2, Capture 1: 1
 Group 2, Capture 2: 2
 Group 2, Capture 3: 3
 Group 2, Capture 4: 4
 Group 2, Capture 5: 5
Match 2, Group 1: abc,ef,g
 Group 2, Capture 1: abc
 Group 2, Capture 2: ef
 Group 2, Capture 3: g
Match 3, Group 1: 0,2,4b,y7
 Group 2, Capture 1: 0
 Group 2, Capture 2: 2
 Group 2, Capture 3: 4b
 Group 2, Capture 4: y7
You'd be better off using String.Split to split the groups once you have extracted the bracket-delimited groups.
\[(.*?)\]
will tell you what is between the brackets, but if you name the group:
\[(?<NumSequence>.*?)\]
you get a named group (NumSequence) that you can then reference by name.
EDIT: I would then use Phil's regex, as mine above shows how to assign a group.
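As a rough sketch of the String.Split suggestion combined with the named group above: the class name BracketSplitter and the method Parse are hypothetical names used only for illustration, and the sketch assumes that items never contain commas or closing brackets themselves.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class BracketSplitter
{
    // Hypothetical helper: returns one string[] of items per bracket pair.
    public static List<string[]> Parse(string input)
    {
        var result = new List<string[]>();

        // One match per bracket pair; NumSequence holds the text between the brackets.
        foreach (Match m in Regex.Matches(input, @"\[(?<NumSequence>.*?)\]"))
        {
            // Assumption: items contain no commas, so a plain Split is enough.
            result.Add(m.Groups["NumSequence"].Value.Split(','));
        }
        return result;
    }
}

class BracketSplitterDemo
{
    static void Main()
    {
        var groups = BracketSplitter.Parse("[1,2,3,4,5] [abc,ef,g] [0,2,4b,y7]");
        for (int i = 0; i < groups.Count; i++)
        {
            Console.WriteLine("Group " + (i + 1) + ": " + string.Join(" | ", groups[i]));
        }
    }
}
```

This keeps the regex trivial and leaves the per-item work to Split, at the cost of a second pass over each group's text.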
I do not think that what you ask is possible in a single regex. Your data seems to have a variable number of comma-separated entries between the brackets, and there are no regular expressions with a variable number of capturing groups.