How to parse this?
I need to parse a string that has the following structure:
x:{a,b,c,}, y:{d,e,f} etc.
where all entries are numbers so it would look something like this
411:{1,2,3},241:{4,1,2} etc.
Forgot to mention: the number of comma-delimited entries between {} has no upper limit, but there must be at least one entry.
- I need to get the unique list of the numbers before the :, in the above case 411,241
Can this be done with regex, and if so, how?
Regex:
(?<1>[\d]+):{(?<2>\d+),(?<3>\d+),(?<4>\d+)}
For data:
411:{1,2,3},241:{4,1,2},314:{5,6,7}
will produce the following match/groups collections:
Match 0
Group 0: 411:{1,2,3}
Group 1: 411
Group 2: 1
Group 3: 2
Group 4: 3
Match 1
Group 0: 241:{4,1,2}
Group 1: 241
Group 2: 4
Group 3: 1
Group 4: 2
Match 2
Group 0: 314:{5,6,7}
Group 1: 314
Group 2: 5
Group 3: 6
Group 4: 7
You can use the following code:
string expression = @"(?<1>\d+):{(?<2>\d+),(?<3>\d+),(?<4>\d+)}";
string input = "411:{1,2,3},241:{4,1,2},314:{5,6,7}";
Regex re = new Regex(expression, RegexOptions.IgnoreCase);
MatchCollection matches = re.Matches(input);
for (int i = 0; i < matches.Count; i++)
{
    Match m = matches[i];
    // for i==0
    // m.Groups[0].Value == "411:{1,2,3}"
    // m.Groups[1].Value == "411"
    // m.Groups[2].Value == "1"
    // m.Groups[3].Value == "2"
    // m.Groups[4].Value == "3"
}
Update: I'm having trouble getting this to work with pure regex and a variable number of items in the list; maybe someone else can chime in here. A simple solution would be:
string expression = @"(?<1>\d+):{(?<2>[\d,]+)}";
string input = "411:{1,2,3,4,5},241:{4,1,234}";
Regex re = new Regex(expression, RegexOptions.IgnoreCase);
MatchCollection matches = re.Matches(input);
for (int i = 0; i < matches.Count; i++)
{
    Match m = matches[i];
    // for i==0
    // m.Groups[0].Value == "411:{1,2,3,4,5}"
    // m.Groups[1].Value == "411"
    // m.Groups[2].Value == "1,2,3,4,5"
    int[] list = Array.ConvertAll(m.Groups[2].Value.Split(','), int.Parse);
    // now list is an array of what was between the curly braces for this match
}
Match list for above:
Match 0
Group 0: 411:{1,2,3,4,5}
Group 1: 411
Group 2: 1,2,3,4,5
Match 1
Group 0: 241:{4,1,234}
Group 1: 241
Group 2: 4,1,234
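For what it's worth, .NET keeps every capture of a repeated group, so the variable-length list can probably be handled by the regex alone, without a separate Split. A minimal sketch, assuming the input contains no whitespace; the names CaptureDemo, ParseItems, key and item are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Returns one (key, items) pair per "key:{a,b,...}" element, in input order.
    public static List<KeyValuePair<string, List<string>>> ParseItems(string input)
    {
        // The "item" group appears once for the first number and once inside a
        // repeated group; .NET records every repetition in Group.Captures.
        var re = new Regex(@"(?<key>\d+):\{(?<item>\d+)(?:,(?<item>\d+))*\}");
        var result = new List<KeyValuePair<string, List<string>>>();
        foreach (Match m in re.Matches(input))
        {
            var items = new List<string>();
            foreach (Capture c in m.Groups["item"].Captures)
            {
                items.Add(c.Value);
            }
            result.Add(new KeyValuePair<string, List<string>>(m.Groups["key"].Value, items));
        }
        return result;
    }
}
```

Calling CaptureDemo.ParseItems(input) with the input string used above should give two pairs, one per element, each holding the individual numbers from between the braces.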
Why do you want to do this with regex? I mean, you're querying the string for IDs and, given an ID, want to retrieve its values. I'd just break the string up and create a map structure that has the ID as the key and a collection of numbers as its value.
If we consider x:{a,b,c} an element, the following would give you a list of matches with two named groups: outer and inner. outer is x; inner is a,b,c.
(?<outer>\d+):\{(?<inner>\d+(,\d+)*)\}
Update
Here is a code sample:
String input = "411:{1,2,3},241:{4,1,2},45:{1},34:{1,34,234}";
String expr = @"(?<outer>\d+):\{(?<inner>\d+(,\d+)*)\}";
MatchCollection matches = Regex.Matches(input, expr);
foreach (Match match in matches)
{
Console.WriteLine("Outer: {0} Inner: {1}", match.Groups["outer"].Value, match.Groups["inner"]);
}
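Building on the sample above, the matches could also be collected into a dictionary keyed by the outer number, whose keys then form the unique list asked for. This is only a sketch: the names OuterInnerDemo and ToDictionary are hypothetical, and it assumes that values for a repeated key should be merged, which the question does not specify.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OuterInnerDemo
{
    public static Dictionary<int, List<int>> ToDictionary(string input)
    {
        const string expr = @"(?<outer>\d+):\{(?<inner>\d+(,\d+)*)\}";
        var map = new Dictionary<int, List<int>>();
        foreach (Match match in Regex.Matches(input, expr))
        {
            int key = int.Parse(match.Groups["outer"].Value);

            // Assumption: a key seen more than once has its values appended.
            List<int> values;
            if (!map.TryGetValue(key, out values))
            {
                values = new List<int>();
                map.Add(key, values);
            }

            // "inner" is the comma-separated text between the braces.
            foreach (string part in match.Groups["inner"].Value.Split(','))
            {
                values.Add(int.Parse(part));
            }
        }
        return map;
    }
}
```

map.Keys then holds each outer number once, and map[key] holds the numbers that followed it.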
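This string resembles JSON, so you may be able to use Json.NET to parse it for you. Note that, as written, it is not strictly valid JSON: the keys are unquoted and the lists use braces rather than square brackets.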
Are you working with JSON? If so, you might want to check out the JavaScriptSerializer Class on MSDN:
http://msdn.microsoft.com/en-us/library/system.web.script.serialization.javascriptserializer.aspx
Here's an alternative without RegEx that should run faster.
This returns a Dictionary<Double, List<Double>>.
....
public Dictionary<double, List<double>> Example()
{
    // Splitting on these separators leaves each key with a trailing ':' (e.g. "411:")
    // followed by the individual list values.
    String[] aSeparators = {"{", "},", ",", "}"};
    String data = "411:{1,2,3},843:{6,5,4,3,2,1},241:{4,1,2}";
    String[] bases = data.Split(aSeparators, StringSplitOptions.RemoveEmptyEntries);
    Dictionary<double, List<double>> aDict = new Dictionary<double, List<double>>();
    List<Double> aList = null;
    foreach (var value in bases)
    {
        if (value.EndsWith(":"))
        {
            // A key starts a new list; register it in the dictionary right away.
            aList = new List<Double>();
            aDict.Add(Double.Parse(value.TrimEnd(':')), aList);
        }
        else
        {
            aList.Add(Double.Parse(value));
        }
    }
    return aDict;
}
I think this might work (pseudo-code, assuming the keys always have three digits):
foreach match in Regex.Matches(yourInputString, "[0-9]{3}:\{[0-9,]+\}")
    firstNumber = match.Value.Substring(0, 3)
    numbers() = match.Value.Substring(5, match.Value.Length - 6).Split(",")
next
The first requirement (the list of numbers before the colon) is achievable with the following regex:
\d+(?=:)
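A lookahead pattern of this kind could be combined with LINQ's Distinct to get the unique list the question asks for. A minimal sketch; UniqueKeysDemo and UniqueKeys are illustrative names, and the pattern uses \d+ so that no zero-length matches are produced directly in front of each colon:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class UniqueKeysDemo
{
    public static List<string> UniqueKeys(string input)
    {
        // Match each run of digits that is immediately followed by ':',
        // then drop repeated keys.
        return Regex.Matches(input, @"\d+(?=:)")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .Distinct()
                    .ToList();
    }
}
```

The numbers inside the braces are never followed by a colon, so only the keys are returned.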