C# Regular Expression: How to extract a collection
I have collection in text file:
(Collection
(Item "Name1" 1 2 3)
(Item "Simple name2" 1 2 3)
(Item "Just name 3" 4 5 6))
Collection also could be empty:
(Collection)
The number of items is undefined. It could be one item or one hundred. By previous extraction I already have inner text between Collection element:
(Item "Name1" 1 2 3)(Item "Simple name2" 1 2 3)(Item "Just name 3" 4 5 6)
In the case of empty collection it will be empty string.
How could I parse this collection using .Net Regu开发者_开发知识库lar Expression?
I tried this:
string pattern = @"(\(Item\s""(?<Name>.*)""\s(?<Type>.*)\s(?<Length>.*)\s(?<Number>.*))*";
But the code above doesn't produce any real results.
UPDATE:
I tried to use regex differently:
foreach (Match match in Regex.Matches(document, pattern, RegexOptions.Singleline))
{
for (int i = 0; i < match.Groups["Name"].Captures.Count; i++)
{
Console.WriteLine(match.Groups["Name"].Captures[i].Value);
}
}
or
while (m.Success)
{
m.Groups["Name"].Value.Dump();
m.NextMatch();
}
Try
\(Item (?<part1>\".*?\")\s(?<part2>\d+)\s(?<part3>\d+)\s(?<part4>\d+)\)
this will create a collection of matches:
Regex regex = new Regex(
"\\(Item (?<part1>\\\".*?\\\")\\s(?<part2>\\d+)\\s(?<part3>\\d"+
"+)\\s(?<part4>\\d+)\\)",
RegexOptions.Multiline | RegexOptions.Compiled
);
//Capture all Matches in the InputText
MatchCollection ms = regex.Matches(InputText);
//Get the names of all the named and numbered capture groups
string[] GroupNames = regex.GetGroupNames();
// Get the numbers of all the named and numbered capture groups
int[] GroupNumbers = regex.GetGroupNumbers();
I think you might need to make your captures non-greedy...
(?<Name>.*?)
instead of
(?<Name>.*)
I think you should read file and than make use of Sting.Split function to split the collection and start to read it
String s = "(Collection
(Item "Name1" 1 2 3)
(Item "Simple name2" 1 2 3)
(Item "Just name 3" 4 5 6))";
string colection[] = s.Split('(');
if(colection.Length>1)
{
for(i=1;i<colection.Length;i++)
{
//process string one by one and add ( if you need it
//from the last item remove )
}
}
this will resolve issue easily there is no need of put extra burden of regulat expression.
精彩评论