
C# regular expression question

Can anybody help me write a regular expression to search the following string?

<b>The</b> <b>brown</b> <b>fox</b> jumped over the <b>lazy</b> <b>dog</b>.

The expression should match <b>The</b> <b>brown</b> <b>fox</b> as one match, then go on to match <b>lazy</b> <b>dog</b>. In this example, the expression should return only two matches. Thanks.


Is this what you're looking for?

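// Requires: using System.Text.RegularExpressions;
// Matches one <b>...</b> element, then any number of further <b>...</b>
// elements separated from it only by optional whitespace.
Regex r = new Regex(@"<b>[^<]*</b>(?:\s*<b>[^<]*</b>)*");

string input = @"<b>The</b> <b>brown</b> <b>fox</b> jumped over the <b>lazy</b> <b>dog</b>.";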
foreach (Match m in r.Matches(input))
{
  Console.WriteLine(m.Value);
}

Output:

<b>The</b> <b>brown</b> <b>fox</b>
<b>lazy</b> <b>dog</b>


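This would work with your specific example, assuming the <b> tags are not literally part of the string being searched (the pattern matches the plain words only):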

@"The brown fox|lazy dog"

Furthermore, if you need to match more simple phrases, just append |another phrase to this pattern for each one.
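
For what it's worth, the "append another alternative" idea can be sketched in code. This is only an illustration: PhraseMatcher, Build, and StripBoldTags are hypothetical names, not part of any library, and the sketch assumes the <b> tags are removed first so that the plain-text phrases can match.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, written for illustration only.
static class PhraseMatcher
{
    // Joins the phrases into "phrase1|phrase2|...". Each phrase is passed
    // through Regex.Escape so characters such as '.' or '+' match literally.
    public static Regex Build(params string[] phrases)
    {
        string pattern = string.Join("|", phrases.Select(Regex.Escape));
        return new Regex(pattern);
    }

    // Assumption: the input only contains simple <b> and </b> tags.
    public static string StripBoldTags(string input)
    {
        return Regex.Replace(input, @"</?b>", "");
    }
}

static class Program
{
    static void Main()
    {
        string input = @"<b>The</b> <b>brown</b> <b>fox</b> jumped over the <b>lazy</b> <b>dog</b>.";
        string text = PhraseMatcher.StripBoldTags(input);

        Regex r = PhraseMatcher.Build("The brown fox", "lazy dog");
        foreach (Match m in r.Matches(text))
        {
            Console.WriteLine(m.Value); // prints "The brown fox", then "lazy dog"
        }
    }
}
```

Adding another phrase is then just one more argument to Build. Note that this only finds phrases you already know; it does not discover runs of bold words on its own.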


The brown fox|lazy dog

The regex above would produce two matches from your input, provided the <b> tags are not literally present in the string being searched.


Regular expressions really aren't suited to parsing HTML. A much better solution would be to use the Html Agility Pack.
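
A rough sketch of how that might look, assuming the HtmlAgilityPack NuGet package is installed. BoldRuns, Find, and Flush are hypothetical names made up for this example. It also assumes the <b> elements sit at the top level of the fragment, as in the question, so it only inspects the direct children of the document node.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // third-party package, assumed to be referenced

// Hypothetical helper, written for illustration only.
static class BoldRuns
{
    // Returns the text of each run of consecutive <b> elements,
    // where elements in a run are separated only by whitespace.
    public static List<string> Find(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var runs = new List<string>();
        var current = new List<string>();

        // Assumption: only top-level nodes are examined; nested markup is ignored.
        foreach (HtmlNode node in doc.DocumentNode.ChildNodes)
        {
            bool isBold = node.NodeType == HtmlNodeType.Element && node.Name == "b";
            bool isBlankText = node.NodeType == HtmlNodeType.Text
                               && string.IsNullOrWhiteSpace(node.InnerText);

            if (isBold)
            {
                current.Add(node.InnerText);
            }
            else if (!isBlankText)
            {
                // Any other content (for example " jumped over the ") ends the run.
                Flush(runs, current);
            }
        }

        Flush(runs, current);
        return runs;
    }

    // Moves the collected words into the result list as one space-joined run.
    private static void Flush(List<string> runs, List<string> current)
    {
        if (current.Count > 0)
        {
            runs.Add(string.Join(" ", current));
            current.Clear();
        }
    }
}
```

Calling BoldRuns.Find(input) with the string from the question is intended to give one entry per group of adjacent bold words. Unlike the regex, this approach does not depend on the exact spelling of the tags (attributes, upper case, and so on), which is the usual argument for using a parser.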
