
Using if clause inside Regular Expression

I am currently coding a .NET Windows app using VB.NET.

I am trying to pass a regular expression to Regex.Match to extract certain texts from an article. How do I write an if condition within a regular expression? I read this regular expression cheat sheet, according to which a condition can be stated using <?()>, but no example was given.

For example, I have following text:

"Mary have banana. Mary have apple. Mary have NO pear."

I can use the following expression to take out (1) banana, (2) apple, and (3) NO pear:

mary have (.+?\.)+?

But if I want to extract only the fruits that Mary has, namely (1) banana and (2) apple, I guess I would need to add a condition in the (.+?\.)+? part, right? How do I write the condition in a regular expression?

Please assist, thank you!


Try this:

Mary\shave\s(?!NO)(\S*)

You can try it online here: regexr.com?2thid

The (?!NO) part is a negative lookahead assertion: the regex will not match where "Mary have " is followed by "NO". Otherwise it puts the word after "Mary have" into the first capturing group. Note that \S* also captures the trailing period, e.g. "banana.".
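As a rough illustration of the lookahead above, here is a minimal C# sketch (the question is about VB.NET, but the Regex API is the same). The class and method names are made up for this example, and the capture is written as [^\s.]+ instead of \S* purely so that the trailing period is left out; that substitution is an assumption of this sketch, not part of the pattern above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, named for illustration only.
public static class FruitExtractor
{
    // Returns the word after "Mary have" wherever that word is not "NO".
    public static List<string> OwnedFruits(string text)
    {
        var fruits = new List<string>();
        // (?!NO) rejects positions where "NO" follows "Mary have ".
        // [^\s.]+ (an assumption of this sketch) stops before the period.
        foreach (Match m in Regex.Matches(text, @"Mary\shave\s(?!NO)([^\s.]+)"))
        {
            fruits.Add(m.Groups[1].Value);
        }
        return fruits;
    }

    public static void Main()
    {
        string text = "Mary have banana. Mary have apple. Mary have NO pear.";
        Console.WriteLine(string.Join(", ", OwnedFruits(text))); // prints "banana, apple"
    }
}
```

Using Regex.Matches rather than Regex.Match returns every occurrence in one pass, which seems closer to what the question is after.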

The conditional construct is explained in perlretut (assuming it works the same way in .NET), but I think my solution is simpler.


Others have provided solutions for your specific case, so I'll just focus on the "if clause" mentioned in the heading.

.NET supports conditionals using the following pattern.

(?(bob)[a-z]+|[0-9]+)

The regular expression engine first tries to match the test expression (the portion in the inner parentheses, here bob). If it matches, the overall expression tries to match using the subexpression before the pipe ([a-z]+); otherwise it tries the subexpression after the pipe ([0-9]+).
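To make the branching concrete, here is a small C# sketch. It writes the test as an explicit lookahead, (?(?=bob)...), which should behave the same as (?(bob)...) while avoiding any confusion with a capturing group named bob; the class name and the sample input are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class, named for illustration only.
public static class ConditionalDemo
{
    // Collects every match of the conditional pattern in the input.
    public static List<string> FindAll(string input)
    {
        var results = new List<string>();
        // If "bob" starts at the current position, match letters;
        // otherwise match digits.
        foreach (Match m in Regex.Matches(input, @"(?(?=bob)[a-z]+|[0-9]+)"))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        // "alice" is skipped: the test fails there, and the digit branch cannot match letters.
        Console.WriteLine(string.Join(", ", FindAll("bob42 123 alice"))); // prints "bob, 42, 123"
    }
}
```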

Having said all that, I think the negative look ahead as suggested by stema would be a better fit for what you are trying to do.

Note: the "test" portion can also use any of the zero-width assertions such as the negative look behind.

(?(?<!\s)[a-z]+|[0-9]+)

Of course, an explicit zero-width lookahead is redundant, as the "test" expression is always treated as zero-width.
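A quick C# sketch of the lookbehind variant, under the assumption that the branch is chosen purely by the character before the current position. The class name and the inputs are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class, named for illustration only.
public static class LookbehindConditionalDemo
{
    // Collects every match of the lookbehind-based conditional in the input.
    public static List<string> FindAll(string input)
    {
        var results = new List<string>();
        // If the previous character is not whitespace (or there is none),
        // match letters; otherwise match digits.
        foreach (Match m in Regex.Matches(input, @"(?(?<!\s)[a-z]+|[0-9]+)"))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", FindAll("ab 12"))); // prints "ab, 12"
    }
}
```

With the input "ab12" the digits are not matched, because they are not preceded by whitespace and so only the letter branch is tried there.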


Here is a solution that avoids the hassle of regular expressions, though I can only answer in C#:

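    string sentence = "Mary have banana Mary have apple Mary have NO pear";
    string result = sentence;
    if (sentence.Contains("banana"))
    {
        // Remove the first occurrence of "banana"; keep the result in scope for later use.
        result = sentence.Remove(sentence.IndexOf("banana"), "banana".Length);
    }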

Don't laugh XD, it's just a quick fix. Rinse and repeat for the rest of the items.
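The "rinse and repeat" step could be sketched as a loop over the items, roughly as below. The class and method names are invented for this example, and it only removes the first occurrence of each word, as the snippet above does.

```csharp
using System;

// Hypothetical helper class, named for illustration only.
public static class WordRemover
{
    // Removes the first occurrence of each word in `words` from `text`.
    public static string RemoveFirstOccurrences(string text, string[] words)
    {
        foreach (string word in words)
        {
            int index = text.IndexOf(word, StringComparison.Ordinal);
            if (index >= 0)
            {
                text = text.Remove(index, word.Length);
            }
        }
        return text;
    }

    public static void Main()
    {
        string sentence = "Mary have banana Mary have apple Mary have NO pear";
        Console.WriteLine(RemoveFirstOccurrences(sentence, new[] { "banana", "apple" }));
    }
}
```

Note that the spaces on either side of each removed word stay behind, so the result contains doubled spaces.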


Then try using the .Split() method. The split will probably look something like this:

    string sentence = "Mary have banana Mary have apple Mary have NO pear";
    // Split on each fruit name; the placeholders stand for your own string variables.
    string[] brokenUp = sentence.Split(
        new String[]
        {
            "first fruit as string variable",
            "second fruit as string variable",
            "third fruit as string variable"
        },
        StringSplitOptions.None
    );
    // Join the remaining pieces back into one sentence.
    string newSentence = string.Empty;
    for (int i = 0; i < brokenUp.Length; i++)
    {
        newSentence += brokenUp[i];
    }