Regex to match a word NOT within a specific number of words of another word

Hope I can explain this one.

I've got a regex for matching two words near each other. For example, if I want to find the word "account" and "number" within 5 words of each other:

\baccount\W+(?:\w+\W+){1,6}?number\b

This works perfectly.

Now I need to find a way to search for a word as long as it is NOT within 2 words of another word.

For example, I need a regex that matches "Butthead" but only if "Beavis" is not within 2 words, either BEFORE OR AFTER Butthead.

So "Butthead and Beavis" would not match. "Beavis and Butthead" would not match. But "Beavis Sure Is a Giant Butthead" would match, because Beavis and Butthead are NOT within 2 words of each other.


This should work if your regexp system supports variable length negative look behinds. I do not think many regex engines support this yet. I know that perl and php do not yet support this. I was not able to test since I use perl and php for my regex testing.

/(?<!beavis(?:\s+\w+)?\s+)butthead(?!(?:\s+\w+)?\s+beavis)/


Can't you just do two matches? First match to find the occurrence of the word anywhere (easy), then discard that match if the word is near the other word (you already have a solution for that).
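A minimal sketch of that two-step idea in Python, in case it helps. The function name `find_isolated` and its parameters are made up for illustration. It assumes "within 2 words" means the other word sits in one of the two word positions directly before or after the target, and that matching is case-insensitive. It splits the text into words instead of using lookbehind, so it does not depend on variable-length lookbehind support.

```python
import re

WORD = re.compile(r"\w+")

def find_isolated(text, target, other, distance=2):
    """Return start offsets of `target` where `other` is not among the
    `distance` words immediately before or after it (hypothetical helper)."""
    # Step 1: tokenize into (lowercased word, start offset) pairs.
    tokens = [(m.group().lower(), m.start()) for m in WORD.finditer(text)]
    target, other = target.lower(), other.lower()
    hits = []
    for i, (word, pos) in enumerate(tokens):
        if word != target:
            continue
        # Step 2: look at the neighbouring words on both sides and
        # discard this occurrence if `other` is among them.
        before = tokens[max(0, i - distance):i]
        after = tokens[i + 1:i + 1 + distance]
        if all(w != other for w, _ in before + after):
            hits.append(pos)
    return hits

print(find_isolated("Butthead and Beavis", "Butthead", "Beavis"))
print(find_isolated("Beavis Sure Is a Giant Butthead", "Butthead", "Beavis"))
```

The first call should report no hits and the second should report the one occurrence of "Butthead". Change `distance` if your definition of "within 2 words" counts differently.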


((?!((\bButthead\W+(?:\w+\W+){1,2}?Beavis\b)|(\bBeavis\W+(?:\w+\W+){1,2}?Butthead\b))).)*

Maybe something like this; I didn't try it, though. Basically, I've tried your way using the following logic: NOT( (contains Butthead 2 words Beavis) OR (contains Beavis 2 words Butthead) )
