Regular Expression to match two characters unless they're within two positions of another character

I'm trying to create a regular expression to match certain characters, unless they appear within two positions of another character.

For example, I would want to match abc or xxabcxx but not tabct or txxabcxt.

Although with something like tabctxxabcxxtabcxt I'd want to match the middle abc and not the other two.

Currently I'm trying this in Java if that changes anything.


Try this:

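import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "tabctxxabcxxtabcxt";
// First alternative swallows every t...t run; only the second alternative captures.
Pattern p = Pattern.compile("t[^t]*t|(abc)");
Matcher m = p.matcher(s);
while (m.find())
{
  // group(1) is null whenever the t...t alternative matched.
  String group1 = m.group(1);
  if (group1 != null)
  {
    System.out.printf("Found '%s' at index %d%n", group1, m.start(1));
  }
}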

output:

Found 'abc' at index 7

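t[^t]*t consumes anything that's enclosed in ts, so if the (abc) in the second alternative matches, you know it's the one you want. Note that this pairs up ts rather than measuring distance: any abc inside a t...t pair is skipped however far apart the two ts are, while an abc next to a single unpaired t (as in tabc) is still reported.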
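If the distance to the t matters, lookarounds might be another way to say it. This is only a sketch, and it assumes "within two" means a t directly next to abc or one character away, which is how I read the examples (xxabcxx passes, tabct and txxabcxt don't). The class name, constant name and findStarts are made up for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NearbyCharDemo {
    // Assumed rule: reject "abc" when a 't' sits directly before/after it
    // or one character further out. (?<!t.?) checks the left, (?!.?t) the right.
    static final Pattern ABC_NOT_NEAR_T = Pattern.compile("(?<!t.?)abc(?!.?t)");

    // Returns the start index of every "abc" that passes both lookarounds.
    static List<Integer> findStarts(String s) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = ABC_NOT_NEAR_T.matcher(s);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String[] samples = {"abc", "xxabcxx", "tabct", "txxabcxt", "tabctxxabcxxtabcxt"};
        for (String s : samples) {
            System.out.println(s + " -> " + findStarts(s));
        }
    }
}
```

For the strings in the question this prints [0], [2], [], [] and [7]. Java accepts the lookbehind because t.? has a bounded length; if "within two" should reach further, the .? on each side would need widening (for example .{0,2}).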


EDITED! It was way wrong before.

Oooh, this one's tougher than I thought. Awesome. Using fairly standard syntax:

[^t]{2,}abc[^t]{2,}

That will catch xxabcxx but not abc, xabc, abcx, xabcx, xxabc, xxabcx, abcxx, or xabcxx. Maybe the best thing to do would be:

if 'abc' in string:
    if 't' in string:
        return regex match [^t]{2,}abc[^t]{2,}
    else:
        return true    (no t anywhere, so the abc can't be near one)
else:
    return false

Is that sufficient for your intention?
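Since the question is in Java, here is a quick way to check which inputs [^t]{2,}abc[^t]{2,} accepts on its own. Just a sketch; the class name, constant name and paddedFind are invented for the example:

```java
import java.util.regex.Pattern;

class PaddedPatternDemo {
    // The pattern from this answer: at least two non-t characters on each side of "abc".
    static final Pattern PADDED = Pattern.compile("[^t]{2,}abc[^t]{2,}");

    // True if the pattern matches anywhere in the input.
    static boolean paddedFind(String s) {
        return PADDED.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "xxabcxx", "abc", "xabc", "abcx", "xabcx",
            "xxabc", "xxabcx", "abcxx", "xabcxx"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + paddedFind(s));
        }
    }
}
```

Only xxabcxx comes out true, which is why the pattern needs some extra check around it to accept a bare abc.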
