Regular expression: match strings containing only non-repeating words
I have this situation (Java code): 1) a string such as "A wild adventure" should match; 2) a string with adjacent repeated words, such as "A wild wild adventure", shouldn't match.
With this regular expression: .* \b(\w+)\b\s*\1\b.* I can match strings containing adjacent repeated words.
How do I reverse the situation, i.e. how do I match strings which do not contain adjacent repeated words?
Use a negative lookahead assertion, (?!pattern).
String[] tests = {
    "A wild adventure",      // true
    "A wild wild adventure"  // false
};
for (String test : tests) {
    System.out.println(test.matches("(?!.*\\b(\\w+)\\s\\1\\b).*"));
}
Explanation courtesy of Rick Measham's explain.pl:
REGEX: (?!.*\b(\w+)\s\1\b).*

NODE                     EXPLANATION
--------------------------------------------------------------------------------
(?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w)
                           and something that is not a word char
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    \w+                      word characters (a-z, A-Z, 0-9, _) (1
                             or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
  \1                       what was matched by capture \1
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w)
                           and something that is not a word char
--------------------------------------------------------------------------------
)                        end of look-ahead
--------------------------------------------------------------------------------
.*                       any character except \n (0 or more times
                         (matching the most amount possible))
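As a minimal sketch of how the pieces explained above fit together, the pattern can be compiled once and reused. The class and method names here (NoAdjacentRepeats, hasNoAdjacentRepeat) are hypothetical, and using \s+ instead of \s is an assumption made so that repeats separated by more than one whitespace character are also caught.

```java
import java.util.regex.Pattern;

public class NoAdjacentRepeats {
    // Negative lookahead: reject the input if a word is followed by
    // whitespace and then the same word again. \s+ (an assumption, the
    // answer uses \s) also covers runs of several whitespace characters.
    private static final Pattern NO_ADJACENT_REPEAT =
            Pattern.compile("(?!.*\\b(\\w+)\\s+\\1\\b).*");

    // Hypothetical helper: true when no word is immediately repeated.
    static boolean hasNoAdjacentRepeat(String s) {
        return NO_ADJACENT_REPEAT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasNoAdjacentRepeat("A wild adventure"));       // true
        System.out.println(hasNoAdjacentRepeat("A wild wild adventure"));  // false
        System.out.println(hasNoAdjacentRepeat("A wild  wild adventure")); // false (two spaces)
    }
}
```

Because . does not match \n by default, this sketch is only meant for single-line input.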
See also
- regular-expressions.info/Lookarounds
Related questions
- using regular expression in Java
  - Uses negative lookahead to ensure a string doesn't have a character occurring more than once
- Java split is eating my characters.
  - Many examples of using assertions
- How do I convert CamelCase into human-readable names in Java?
  - Very instructive example of using lookarounds
Note
Negative assertions only make sense when there are also other patterns that you want to positively match (see examples above). Otherwise, you can just use the boolean complement operator ! to negate matches with whatever pattern you were using before.
String[] tests = {
    "A wild adventure",      // true
    "A wild wild adventure"  // false
};
for (String test : tests) {
    System.out.println(!test.matches(".*\\b(\\w+)\\s\\1\\b.*"));
}
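A related sketch, assuming only a yes/no answer is needed: Matcher.find() searches for the pattern anywhere in the input, so the surrounding .* can be dropped before negating the result. The class and method names (RepeatFinder, containsAdjacentRepeat) are hypothetical.

```java
import java.util.regex.Pattern;

public class RepeatFinder {
    // Same core pattern as above, without the leading and trailing .*
    private static final Pattern ADJACENT_REPEAT =
            Pattern.compile("\\b(\\w+)\\s\\1\\b");

    // Hypothetical helper: true when some word is immediately repeated.
    static boolean containsAdjacentRepeat(String s) {
        // find() looks for a match at any position in the input
        return ADJACENT_REPEAT.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] tests = {
            "A wild adventure",      // true
            "A wild wild adventure"  // false
        };
        for (String test : tests) {
            System.out.println(!containsAdjacentRepeat(test));
        }
    }
}
```

Unlike matches(), find() does not require the pattern to span the whole input, which is why the .* padding is unnecessary here.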