RegEx To Match "whole word" returns exception

I am trying to validate via RegEx as follows...

If Regex.IsMatch(Output, "\b" & "Serial)" & "\b") Then
'do something
end if

but I get this ArgumentException:

parsing "\bSerial)\b" - Too many )'s.

I do understand the error, but how should I modify the RegEx expression?

UPDATE: The word "Serial)" is generated dynamically, so escaping just the ) by hand is not enough for me; a different special character could trigger another exception.


Assuming that's VB.Net, you need to escape the ):

If Regex.IsMatch(Output, "\b" & "Serial\)" & "\b") Then
    'do something
End If

In .NET regular expressions, parentheses are the grouping characters, so an unmatched ) is a syntax error unless it is escaped.


If, as you say, the word 'Serial)' is generated dynamically, you will have to escape it before passing it to the RE engine:

If Regex.IsMatch(Output, "\b" & Regex.Escape("Serial)") & "\b") Then
    'do something
End If

As another answer points out, this won't match "Serial) xyz" (for example), since there is no \b between the ) and the space: \b only matches between a \w character and a \W character (or the start/end of the string), and both ) and a space are \W.

You may have to resort to an ugly hack like:

If Regex.IsMatch(Output, "\s" & Regex.Escape("Serial)") & "\s") _
Or Regex.IsMatch(Output, "\s" & Regex.Escape("Serial)") & "$") _
Or Regex.IsMatch(Output, "^" & Regex.Escape("Serial)") & "\s") _
Or Regex.IsMatch(Output, "^" & Regex.Escape("Serial)") & "$") _
Then
    'do something
End If

I thought that maybe you could match a character class consisting of (^ or $) and \s along the lines of:

If Regex.IsMatch(Output, "[\s^]" & Regex.Escape("Serial)") & "[\s$]") Then
    'do something
End If

but this doesn't work according to a regex tester (inside a character class, ^ and $ are literal characters rather than anchors), so you may have to go for the ugly hack version, or you could combine the alternatives into a single regex, as in:

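Dim input As String = "Serial)"
Dim escaped As String = Regex.Escape(input)
' Named "pattern", not "regex": VB identifiers are case-insensitive, so a local "regex" would shadow the Regex class
Dim pattern As String = "\s" & escaped & "\s|^" & escaped & "$|\s" & escaped & "$|^" & escaped & "\s"
If Regex.IsMatch(Output, pattern) Then
    'do something
End If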
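
To see why the character-class attempt fails while the alternation works, here is a minimal sketch. It is written in Python rather than VB.NET purely so it can be run quickly; the names `char_class` and `combined` are made up for illustration, and `(?:^|\s)...(?:\s|$)` is offered as a more compact spelling of the four-way alternation above, not as the answer's original pattern.

```python
import re

escaped = re.escape("Serial)")  # same role as Regex.Escape in .NET

# Inside [...], a ^ that is not the first character and a $ are
# ordinary literal characters, not anchors.
char_class = re.compile(r"[\s^]" + escaped + r"[\s$]")

# Outside a character class, ^ and $ keep their anchor meaning,
# so alternation with \s behaves as intended.
combined = re.compile(r"(?:^|\s)" + escaped + r"(?:\s|$)")

for sample in ["Serial)", "foo Serial) bar", "^Serial)$", "foo Serial)bar"]:
    print(repr(sample),
          "char_class:", char_class.search(sample) is not None,
          "combined:", combined.search(sample) is not None)
```

The character-class version only accepts the bare word when it is wrapped in a literal ^ and $, which is the opposite of what was wanted. In VB.NET the compact form would presumably be built as "(?:^|\s)" & Regex.Escape(input) & "(?:\s|$)".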


Paxdiablo and tanascius' answers correctly explain why your regex fails to compile.

But:

You need to be careful with your regex, even after escaping the parenthesis: \b only matches at word boundaries (a word being constructed from characters of the \w shortcut - letters, digits, and underscore), not after punctuation like parentheses. In your case the regex will not match in a string like foo Serial) bar. It will match in foo Serial)bar, but only because the \b matches before bar. Likewise, it won't match the string Serial).

So, simply surrounding a string with \bs will not always do what you seem to be expecting it to do.
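
The three cases just described can be checked with a small sketch. It is in Python rather than VB.NET only so it is easy to run; for these ASCII inputs \b follows the same \w/\W rule, and `re.escape` stands in for `Regex.Escape`. The helper name `naive_whole_word` is hypothetical.

```python
import re

def naive_whole_word(text, word):
    """Wrap the escaped word in \\b on both sides, as in the snippets above."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text) is not None

for sample in ["foo Serial) bar", "foo Serial)bar", "Serial)"]:
    # Only the middle sample matches, because only there is the ")"
    # followed by a \w character, which creates a word boundary.
    print(repr(sample), naive_whole_word(sample, "Serial)"))
```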

Edit: If, according to your comment below, in the following list...

foo Serial) bar
foo (Serial) bar
foo Serial). bar
foo Serial))))))
foo Serial)

...only the first and fifth should match, I'm inferring that the rule is to match a whole word only if it is preceded/followed by whitespace or start/end of string.

In that case, use

If Regex.IsMatch(Output, "(?<=^|\s)" & Regex.Escape("Serial)") & "(?=\s|$)") Then

However, this will now no longer match foo in This is foo. or He said "foo". If you want to allow this, use

If Regex.IsMatch(Output, "(?<=^|\b|\s)" & Regex.Escape("Serial)") & "(?=\s|\b|$)") Then

...but this will now also match the second example. Choose your weapon carefully :)

(Explanation: (?<=^|\b|\s) is a positive lookbehind assertion that matches if it is possible to match either the start of the string, a word boundary, or a whitespace character right before the current position, without adding anything to the match result. (?=\s|\b|$) is its lookahead counterpart.)
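
As a rough check of the whitespace-or-string-edge rule against the five sample lines, here is a sketch in Python (not VB.NET, chosen only because it is quick to run). One assumption to flag: Python's `re` module requires fixed-width lookbehind, so `(?<=^|\s)` is swapped for the negative form `(?<!\S)` ("no non-whitespace character before"), and `(?=\s|$)` for `(?!\S)`. The helper name `whole_word` is hypothetical.

```python
import re

def whole_word(text, word):
    # (?<!\S): start of string or whitespace immediately before the word
    # (?!\S):  end of string or whitespace immediately after the word
    pattern = r"(?<!\S)" + re.escape(word) + r"(?!\S)"
    return re.search(pattern, text) is not None

samples = [
    "foo Serial) bar",
    "foo (Serial) bar",
    "foo Serial). bar",
    "foo Serial))))))",
    "foo Serial)",
]

for sample in samples:
    print(repr(sample), whole_word(sample, "Serial)"))
```

Only the first and fifth lines should report True, which is the behaviour asked for in the comment. The looser (?<=^|\b|\s) variant is not reproduced here.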


You should escape your input by using Regex.Escape():

Dim input As String = "Serial)"
If Regex.IsMatch(Output, "\b" & Regex.Escape(input) & "\b") Then
    'do something
End If


I think what you need may be

\bSerial\)\b

(that's "\b" & "Serial\)" & "\b" )


You need to escape the parentheses, that is, replace ) with \). The final string should then look like \bSerial\)\b

If the content is generated dynamically, either search for "(" and ")" and replace them with the escaped forms "\(" and "\)" (just a string replacement!), or use Regex.Escape() to escape those characters. Regex.Escape() also takes care of the other regex metacharacters.

HTH
