C# regular expression question
Can anybody help me to form a regular expression to search the following string:
<b>The</b> <b>brown</b> <b>fox</b> jumped over the <b>lazy</b> <b>dog</b>.
The expression should match <b>The</b> <b>brown</b> <b>fox</b>
as one match, then proceed to match <b>lazy</b> <b>dog</b>.
In this example, the expression should return only two matches. Thanks.
Is this what you're looking for?
Regex r = new Regex(@"<b>[^<]*</b>(?:\s*<b>[^<]*</b>)*");
String input = @"<b>The</b> <b>brown</b> <b>fox</b> jumped over the <b>lazy</b> <b>dog</b>.";
foreach (Match m in r.Matches(input))
{
    Console.WriteLine(m.Value);
}
output:
<b>The</b> <b>brown</b> <b>fox</b>
<b>lazy</b> <b>dog</b>
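In case it helps to see how that pattern is put together, here is a minimal sketch that spells out the same regex with inline comments via RegexOptions.IgnorePatternWhitespace. The class name BoldRuns and the helper Find are made up for illustration; only the pattern comes from the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BoldRuns
{
    // Same pattern as in the answer, split across lines with comments.
    // IgnorePatternWhitespace ignores unescaped whitespace in the pattern
    // and treats "#" as the start of a comment; \s still matches whitespace.
    private static readonly Regex Pattern = new Regex(@"
        <b>[^<]*</b>          # one bold element whose content has no tags
        (?:                   # then, zero or more times:
            \s*               #   optional whitespace between elements
            <b>[^<]*</b>      #   another bold element
        )*",
        RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: returns each run of adjacent <b> elements as one string.
    public static string[] Find(string input) =>
        Pattern.Matches(input).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        string input = "<b>The</b> <b>brown</b> <b>fox</b> jumped over the <b>lazy</b> <b>dog</b>.";
        foreach (string run in Find(input))
        {
            Console.WriteLine(run);
        }
    }
}
```

Note that [^<]* stops at the first "<", so a bold element containing another tag (for example <b><i>x</i></b>) will not match and will break a run.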
This would work with your specific example:
@"The brown fox|lazy dog"
Furthermore, if you need to match any more simple phrases, just append |the simple phrase
to this pattern.
The brown fox|lazy dog
The above is the regex that would generate two matches from your given input.
RegEx really isn't suited to parsing HTML. A much better solution would be to use the Html Agility Pack.
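As a rough sketch of what the parser-based approach might look like, assuming the HtmlAgilityPack NuGet package is installed and that the <b> elements sit at the top level of the fragment (as in the question's string). The class BoldRunsWithParser and its Find method are hypothetical names, and runs are rejoined with a single space rather than the original whitespace.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: NuGet package "HtmlAgilityPack" is referenced

public static class BoldRunsWithParser
{
    // Hypothetical helper: groups top-level <b> elements that are
    // separated only by whitespace into a single run.
    public static List<string> Find(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var runs = new List<string>();
        var current = new List<string>();

        foreach (HtmlNode node in doc.DocumentNode.ChildNodes)
        {
            if (node.NodeType == HtmlNodeType.Element && node.Name == "b")
            {
                // Extend the current run with this bold element.
                current.Add(node.OuterHtml);
            }
            else if (node.NodeType == HtmlNodeType.Text
                     && string.IsNullOrWhiteSpace(node.InnerText))
            {
                // Whitespace between bold elements does not end a run.
                continue;
            }
            else if (current.Count > 0)
            {
                // Any other node (text, other tags) closes the current run.
                runs.Add(string.Join(" ", current));
                current.Clear();
            }
        }

        // Flush a run that reaches the end of the input.
        if (current.Count > 0)
        {
            runs.Add(string.Join(" ", current));
        }
        return runs;
    }

    public static void Main()
    {
        string input = "<b>The</b> <b>brown</b> <b>fox</b> jumped over the <b>lazy</b> <b>dog</b>.";
        foreach (string run in Find(input))
        {
            Console.WriteLine(run);
        }
    }
}
```

Unlike the regex, this keeps working when a bold element contains nested markup, since the parser handles the nesting and the loop only looks at sibling nodes.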