Problem with RegEx OR operator in C#
I want to match a pattern [0-9][0-9]KK[a-z][a-z]
which is not preceded by either of these words
http://
example
I have a RegEx which takes care of the first criterion, but not the second.
Without OR operator
var body = Regex.Replace(body, "(?<!http://([\\w+?\\.\\w+])+([a-zA-Z0-9\\~\\!\\@\\#\\$\\%\\^\\&\\*\\(\\)_\\-\\=\\+\\\\\\/\\?\\.\\:\\;\\'\\,]*)?)([0-9][0-9]KK[a-z][a-z])(?!</a>)","replaced");
With OR operator
var body = Regex.Replace(body, "(?example)|(?<!http://([\\w+?\\.\\w+])+([a-zA-Z0-9\\~\\!\\@\\#\\$\\%\\^\\&\\*\\(\\)_\\-\\=\\+\\\\\\/\\?\\.\\:\\;\\'\\,]*)?)([0-9][0-9]KK[a-z][a-z])(?!</a>)","replaced");
The second one with OR operator throws an exception. How can I fix this?
It should not match either of these:
example99KKas
http://stack.com/99KKas
Here is one way to do it. Start at the beginning of the string and check that each character is not the start of 'http://' or 'example'. Do this lazily, and one character at a time so that we can spot the magic word once we reach it. Also, capture everything up to the magic word so that we can put it back in the replacement string. Here it is in commented free-spacing mode so that it can be comprehended by mere mortals:
var body = Regex.Replace(body,
    @"# Match special word not preceded by 'http://' or 'example'
    ^                          # Anchor to beginning of string
    (?i)                       # Set case-insensitive mode.
    (                          # $1: Capture everything up to special word.
      (?:                      # Non-capture group for applying * quantifier.
        (?!http://)            # Assert this char is not start of 'http://'
        (?!example)            # Assert this char is not start of 'example'
        .                      # Safe to match this one acceptable char.
      )*?                      # Lazily match zero or more preceding chars.
    )                          # End $1: Everything up to special word.
    (?-i)                      # Set back to case-sensitive mode.
    ([0-9][0-9]KK[a-z][a-z])   # $2: Match our special word.
    (?!</a>)                   # Assert not end of Anchor tag contents.
    ",
    "$1replaced",
    RegexOptions.Singleline | RegexOptions.IgnorePatternWhitespace);
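Note that this is case sensitive for the magic word but not for http:// and example. Note also that this is untested (I don't know C#, just its regex engine). The "var" in "var body = ..." looks kinda suspicious to me: the declaration uses body as its own input, so it probably needs to be a plain assignment to an existing variable (body = ...).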
I wasn't able to get the second example working; it gave an ArgumentException of "Unrecognized grouping construct".
But I replaced the url matching and moved the first alternative group a bit and came up with this:
var body = Regex.Replace(body, "(?<!http\\://[a-zA-Z0-9\\-\\.]+\\.[a-zA-Z]{2,3}(/\\S*)?|example)([0-9][0-9]KK[a-z][a-z])(?!</a>)","replaced");
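As a rough check of both points, the sketch below first confirms that a group opening with "(?" followed by an unrecognized construct, such as (?example), is rejected when the Regex is constructed, and then runs the pattern above (rewritten as a verbatim string) against the inputs from the question. The names LookbehindDemo, IsValidPattern and ReplaceWord are hypothetical and only serve the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class LookbehindDemo
{
    // Returns false when the Regex constructor rejects the pattern.
    public static bool IsValidPattern(string pattern)
    {
        try
        {
            new Regex(pattern);
            return true;
        }
        catch (ArgumentException)
        {
            return false;
        }
    }

    // Both alternatives live inside one negative lookbehind:
    // (?<! url-like-prefix | example )
    private static readonly Regex Pattern = new Regex(
        @"(?<!http\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(/\S*)?|example)([0-9][0-9]KK[a-z][a-z])(?!</a>)");

    public static string ReplaceWord(string body)
    {
        return Pattern.Replace(body, "replaced");
    }
}

// LookbehindDemo.IsValidPattern("(?example)")            returns false
// LookbehindDemo.ReplaceWord("see 99KKas here")          returns "see replaced here"
// LookbehindDemo.ReplaceWord("example99KKas")            returns the input unchanged
// LookbehindDemo.ReplaceWord("http://stack.com/99KKas")  returns the input unchanged
```

Since this pattern is not anchored, every qualifying occurrence is replaced, for example "12KKab and 34KKcd" becomes "replaced and replaced". Note that the lookbehind here is case sensitive, so a prefix such as "Example" would not block the match.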
You could use something like this:
body = Regex.Replace(body, @"(?<!\S)(?!(?i:http://|example))\S*\d\dKK[a-z]{2}\b", "replaced");
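A small sketch of how this last pattern behaves on the same inputs follows. It works on whole whitespace-delimited tokens: the token must not start with http:// or example (in any letter case) and must end with the special word. The names TokenDemo and ReplaceToken are hypothetical and used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class TokenDemo
{
    // (?<!\S)                      : start of a whitespace-delimited token
    // (?!(?i:http://|example))     : token must not begin with either prefix
    // \S*\d\dKK[a-z]{2}\b          : rest of the token, ending in the special word
    private static readonly Regex Pattern = new Regex(
        @"(?<!\S)(?!(?i:http://|example))\S*\d\dKK[a-z]{2}\b");

    public static string ReplaceToken(string body)
    {
        return Pattern.Replace(body, "replaced");
    }
}

// TokenDemo.ReplaceToken("see 99KKas here")          returns "see replaced here"
// TokenDemo.ReplaceToken("example99KKas")            returns the input unchanged
// TokenDemo.ReplaceToken("http://stack.com/99KKas")  returns the input unchanged
// TokenDemo.ReplaceToken("abc99KKas")                returns "replaced"
```

The last call shows a difference from the other answers: the match starts at the beginning of the token, so any non-space characters in front of the special word are replaced along with it.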