Regular Expression to match two characters unless they're within two positions of another character
I'm trying to create a regular expression that matches certain characters, unless they appear within two positions of another character.
For example, I would want to match abc or xxabcxx but not tabct or txxabcxt.
With something like tabctxxabcxxtabcxt, though, I'd want to match the middle abc and not the other two. I'm currently trying this in Java, if that changes anything.
Try this:
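import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String s = "tabctxxabcxxtabcxt";
        // First alternative swallows t...t runs; only the second one captures.
        Pattern p = Pattern.compile("t[^t]*t|(abc)");
        Matcher m = p.matcher(s);
        while (m.find()) {
            String group1 = m.group(1);
            // group(1) is null whenever the t[^t]*t alternative matched.
            if (group1 != null) {
                System.out.printf("Found '%s' at index %d%n", group1, m.start(1));
            }
        }
    }
}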
output:
Found 'abc' at index 7
The first alternative, t[^t]*t, consumes anything that's enclosed between two t's, so if the (abc) in the second alternative matches, you know it's the one you want.
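One caveat: the alternation skips every abc that lies between a pair of t's, however far away they are, and it does not look at a lone t on only one side. If "within two" is read as "no t among the two characters immediately before or after abc" (an assumption based on the examples in the question), lookarounds can state that directly. The following is only a sketch under that reading; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NearbyCharFilter {
    // Assumed reading of the requirement: "abc" with no 't' in the two
    // characters right before it or the two characters right after it.
    // (?<!t.?) rejects a 't' one or two positions to the left;
    // (?!.?t) rejects a 't' one or two positions to the right.
    private static final Pattern ABC_NOT_NEAR_T =
            Pattern.compile("(?<!t.?)abc(?!.?t)");

    // Returns the start index of every "abc" that passes both lookarounds.
    public static List<Integer> findStarts(String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = ABC_NOT_NEAR_T.matcher(input);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String[] samples = {"abc", "xxabcxx", "tabct", "txxabcxt", "tabctxxabcxxtabcxt"};
        for (String s : samples) {
            System.out.println(s + " -> " + findStarts(s));
        }
    }
}
```

For the sample strings from the question this reports abc at index 0, at index 2, nowhere, nowhere, and at index 7 respectively. Java accepts the lookbehind here because its length is bounded (at most two characters).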
EDITED! It was way wrong before.
Oooh, this one's tougher than I thought. Awesome. Using fairly standard syntax:
[^t]{2,}abc[^t]{2,}
That will catch xxabcxx but not abc, xabc, abcx, xabcx, xxabc, xxabcx, abcxx, or xabcxx. Maybe the best thing to do would be:
if 'abc' in string:
    if 't' in string:
        return regex match [^t]{2,}abc[^t]{2,}
    else:
        return true    (no t anywhere, so nothing can rule abc out)
else:
    return false
Is that sufficient for your intention?
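Since the question is about Java, the list of strings given for [^t]{2,}abc[^t]{2,} can be checked with a small sketch like the one below. The class and method names are hypothetical, and it uses find() so that the pattern may match anywhere inside a longer string.

```java
import java.util.regex.Pattern;

public class SurroundedAbcCheck {
    // The pattern from this answer: at least two non-'t' characters
    // on each side of "abc".
    private static final Pattern SURROUNDED =
            Pattern.compile("[^t]{2,}abc[^t]{2,}");

    // True if the pattern matches somewhere in the input.
    public static boolean containsSurroundedAbc(String input) {
        return SURROUNDED.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] samples = {"xxabcxx", "abc", "xabc", "abcx", "xabcx",
                            "xxabc", "xxabcx", "abcxx", "xabcxx"};
        for (String s : samples) {
            System.out.println(s + " -> " + containsSurroundedAbc(s));
        }
    }
}
```

Only xxabcxx comes out true. A bare abc is rejected by the pattern itself, which is why the no-t case has to be handled separately before the regex is applied.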