.NET Regex for "not this string"
I'm a regex newbie and need a single expression that matches "an" and "AN" but not "and" or "AND", and matches "o" and "O" but not "or" or "OR", in this predicate:
1and(2or3)AND(4OR5)an(6o7)AN(8O9)
Basically I can't figure out how to convert the expression:
var myRegEx = new Regex("[0-9 ()]|AND|OR");
into an "everything but", case-insensitive expression.
I can't use regex word boundaries because the predicate isn't required to contain spaces.
(Added after two answers were already provided): I also need to know the index of the match, which is why I'm assuming I need to use the Regex.Match() method.
Thanks!
Here's what I ended up with:
private bool mValidateCharacters()
{
    const string legalsPattern = @"[\d ()]|AND|OR";
    const string splitPattern = "(" + legalsPattern + ")";
    int position = 0;
    string[] tokens = Regex.Split(txtTemplate.Text, splitPattern, RegexOptions.IgnoreCase);
    // Array contains every legal operator/symbol found in the entry field
    // and every substring preceding, surrounded by, or following those operators/symbols
    foreach (string token in tokens)
    {
        if (string.IsNullOrEmpty(token))
        {
            continue;
        }
        // Determine if the token is a legal operator/symbol or a syntax error
        Match match = Regex.Match(token, legalsPattern, RegexOptions.IgnoreCase);
        if (!match.Success)
        {
            const string reminder =
                "Please use only the following in the template:" +
                "\n\tRow numbers from the terms table" +
                "\n\tSpaces" +
                "\n\tThese characters: ( )" +
                "\n\tThese words: AND OR";
            UserMsg.Tell("Illegal template entry '" + token + "' at position: " + position + "\n\n" + reminder, UserMsg.EMsgType.Error);
            txtTemplate.Focus();
            txtTemplate.Select(position, token.Length);
            return false;
        }
        position += token.Length;
    }
    return true;
}
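For testing outside the form, the same Split-based check can be pulled into a pure function. This is only a sketch under assumptions: `TemplateCheck` and `FindFirstIllegal` are hypothetical names that do not appear in the code above, and the UI calls (`UserMsg.Tell`, `txtTemplate`) are left to the caller.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative, not from the original code.
public static class TemplateCheck
{
    private const string LegalsPattern = @"[\d ()]|AND|OR";

    // Returns (offset, token) for the first illegal token, or null if none is found.
    public static Tuple<int, string> FindFirstIllegal(string template)
    {
        int position = 0;
        // The capturing group makes Regex.Split keep the separators (the legal
        // tokens), so summing token lengths tracks the offset in the input.
        string[] tokens = Regex.Split(template, "(" + LegalsPattern + ")", RegexOptions.IgnoreCase);
        foreach (string token in tokens)
        {
            if (token.Length == 0)
            {
                continue;
            }
            if (!Regex.IsMatch(token, LegalsPattern, RegexOptions.IgnoreCase))
            {
                return Tuple.Create(position, token);
            }
            position += token.Length;
        }
        return null;
    }
}
```

The caller can then decide how to report the error, for example by passing `Item1` and `Item2.Length` to `txtTemplate.Select`.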
Randal Schwartz's rule: Use capturing in Regex.Match when you know what you want to keep, and use Regex.Split when you know what you want to throw away.
You wrote that you want “everything but,” so use Regex.Split:
var input = "1and(2or3)AND(4OR5)an(6o7)AN(8O9)";
foreach (var s in Regex.Split(input, @"[\d()]|AND|OR", RegexOptions.IgnoreCase))
    if (s.Length > 0)
        Console.WriteLine("[{0}]", s);
Output:
[an]
[o]
[AN]
[O]
To get the offsets, save the separators by enclosing the regular expression in parentheses:
var input = "1and(2or3)AND(4OR5)an(6o7)AN(8O9)";
string pattern = @"([\d()]|AND|OR)";
int offset = 0;
foreach (var s in Regex.Split(input, pattern, RegexOptions.IgnoreCase)) {
    if (s.ToLower() == "an" || s.ToLower() == "o")
        Console.WriteLine("Found [{0}] at offset {1}", s, offset);
    offset += s.Length;
}
Output:
Found [an] at offset 19
Found [o] at offset 23
Found [AN] at offset 26
Found [O] at offset 30
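If you would rather have the regex engine report the index itself, one possible alternative is a single pattern that consumes legal tokens first and captures everything else in a named group. This is a sketch, not the approach used above: `IllegalTokenFinder`, `Find`, and the group name `bad` are made-up names, and the character class includes the space from the question's pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class IllegalTokenFinder
{
    // Alternatives are tried left to right: a legal token is consumed first.
    // Otherwise "bad" collects characters until a legal token could start.
    private static readonly Regex Scanner = new Regex(
        @"[\d ()]|AND|OR|(?<bad>(?:(?!AND|OR)[^\d ()])+)",
        RegexOptions.IgnoreCase);

    // Returns the index and text of every run of illegal characters.
    public static List<KeyValuePair<int, string>> Find(string input)
    {
        var result = new List<KeyValuePair<int, string>>();
        foreach (Match m in Scanner.Matches(input))
        {
            Group bad = m.Groups["bad"];
            if (bad.Success)
            {
                result.Add(new KeyValuePair<int, string>(bad.Index, bad.Value));
            }
        }
        return result;
    }
}
```

For the sample predicate, `Find` should return the same four entries as the Split version: "an" at 19, "o" at 23, "AN" at 26, and "O" at 30. The negative lookahead `(?!AND|OR)` is what stops a run of illegal characters from swallowing a legal word that follows it directly.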