Simple C# Tokenizer Using Regex
I'm looking to tokenize really simple strings, but I'm struggling to get the regex right.
The strings might look like this:
string1 = "{[Surname]}, some text... {[FirstName]}"
string2 = "{Item}foo.{Item2}bar"
And I want to extract the tokens in the curly braces (so string1 gets "{[Surname]}" and "{[FirstName]}", and string2 gets "{Item}" and "{Item2}").
So basically, there are two different token types I want to extract: {[Foo]} and {Bar}.
This question is quite good, but I can't get the regex right: poor mans lexer for c#. Thanks for the help!
They're both good answers guys, thanks. Here's what I settled for in the end:
// DataToken = {[foo]}
// FieldToken = {Bar}
string pattern = @"(?<DataToken>\{\[\w+\]\})|(?<FieldToken>\{\w+\})";
MatchCollection matches = Regex.Matches(expression.ExpressionString, pattern,
    RegexOptions.ExplicitCapture);
string fieldToken = string.Empty;
string dataToken = string.Empty;
foreach (Match m in matches)
{
    // Note that EITHER FieldToken OR DataToken will have a value in each loop
    fieldToken = m.Groups["FieldToken"].Value;
    dataToken = m.Groups["DataToken"].Value;
    if (!string.IsNullOrEmpty(dataToken))
    {
        // Do something
    }
    if (!string.IsNullOrEmpty(fieldToken))
    {
        // Do something else
    }
}
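For anyone who wants to try this outside my project, here is a minimal self-contained sketch of the same idea. The class name TokenDemo and the Tokenize helper are made up for illustration, and it checks Group.Success instead of testing for an empty string, which should be equivalent here because each alternative matches at least two characters.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class; the name and shape are assumptions for illustration.
public static class TokenDemo
{
    // Same pattern as above: DataToken = {[foo]}, FieldToken = {Bar}
    private const string Pattern =
        @"(?<DataToken>\{\[\w+\]\})|(?<FieldToken>\{\w+\})";

    // Returns (kind, text) pairs in the order the tokens appear in the input.
    public static List<(string Kind, string Text)> Tokenize(string input)
    {
        var tokens = new List<(string Kind, string Text)>();
        foreach (Match m in Regex.Matches(input, Pattern, RegexOptions.ExplicitCapture))
        {
            // Exactly one of the two named groups succeeds for each match.
            string kind = m.Groups["DataToken"].Success ? "DataToken" : "FieldToken";
            tokens.Add((kind, m.Value));
        }
        return tokens;
    }

    public static void Main()
    {
        foreach (var (kind, text) in Tokenize("{[Surname]}, some text... {[FirstName]}"))
            Console.WriteLine(kind + ": " + text);
        foreach (var (kind, text) in Tokenize("{Item}foo.{Item2}bar"))
            Console.WriteLine(kind + ": " + text);
    }
}
```

With the two sample strings from the question, this should report two DataToken matches for string1 and two FieldToken matches for string2.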
Unless the rules are very convoluted, that will be (?<Token>\{\[.+?\]\}) for the first string and (?<Token>\{.+?\}) for the second.
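A quick way to check those two patterns against the sample strings from the question is sketched below. The LazyTokenDemo class and its Extract helper are hypothetical names, not part of any library. The lazy quantifier (.+?) is what keeps each match from running on to the last closing brace in the string.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class; names are assumptions for illustration.
public static class LazyTokenDemo
{
    // Returns the text of every match of `pattern` in `input`, in order.
    public static string[] Extract(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        string string1 = "{[Surname]}, some text... {[FirstName]}";
        string string2 = "{Item}foo.{Item2}bar";

        // Lazy .+? stops at the first "]}" (or "}") it can, so tokens stay separate.
        Console.WriteLine(string.Join(" | ", Extract(string1, @"(?<Token>\{\[.+?\]\})")));
        Console.WriteLine(string.Join(" | ", Extract(string2, @"(?<Token>\{.+?\})")));

        // For comparison: a greedy .+ swallows everything between the first
        // opening and the last closing delimiter, giving one long match.
        Console.WriteLine(Extract(string1, @"\{\[.+\]\}").Length);
    }
}
```

Note that the second pattern, \{.+?\}, also matches the {[Foo]} style, so it does not distinguish the two token types by itself.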
What about (?<token>\{[^\}]*\})?
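As a rough sketch of how that single pattern could cover both token types: it matches anything between an opening brace and the next closing brace, and the caller can then tell the types apart by looking at the matched text. The NegatedClassDemo class, the Extract helper, and the StartsWith check are assumptions added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class; names are assumptions for illustration.
public static class NegatedClassDemo
{
    // [^\}]* consumes everything up to (not including) the next '}'.
    private const string Pattern = @"(?<token>\{[^\}]*\})";

    // Returns every brace-delimited token found in `input`, in order.
    public static List<string> Extract(string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
            result.Add(m.Groups["token"].Value);
        return result;
    }

    public static void Main()
    {
        foreach (string s in new[] { "{[Surname]}, some text... {[FirstName]}",
                                     "{Item}foo.{Item2}bar" })
        {
            foreach (string token in Extract(s))
            {
                // One possible way to separate {[Foo]} from {Bar} after matching.
                string kind = token.StartsWith("{[", StringComparison.Ordinal)
                    ? "DataToken" : "FieldToken";
                Console.WriteLine(kind + ": " + token);
            }
        }
    }
}
```

Because the quantifier is *, this pattern also matches an empty pair of braces ({}), which may or may not be wanted.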