Android - Java - Regular Expression question - consecutive words not being matched
In my example I am trying to replace ALL occurrences of "the" and "a" in a string with a space, including cases where these words are next to quotes or other punctuation.
String oldString = "A test of the exp.";
Pattern p = Pattern.compile("(((\\W|\\A)the(\\W|\\Z))|((\\W|\\A)a(\\W|\\Z)))",Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(oldString);
String newString = m.replaceAll(" ");
"A test of the exp." returns "test of exp." - Yeah!
"A test of the a exp." returns "test of a exp." - Boooo!
"The a in this test is a the." returns "a in this test is the." - DoubleBoooo!
Any help would be greatly appreciated. Thanks!
String resultString = subjectString.replaceAll("(?i)\\b(?:a|the)\\b", " ");
(?i) makes the match case-insensitive, so "A" and "The" are replaced as well.
\b matches at a word boundary (i.e. at the start or end of a word, where a "word" is a sequence of letters, digits, or underscores). It is zero-width, so it does not consume the space or punctuation next to the word.
(?:...) is a non-capturing group, needed to separate the alternative words (in this case a and the) from the surrounding word boundary anchors.
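As a rough illustration of the word-boundary approach, here is a minimal sketch run against the three strings from the question. The class name ArticleStripper and the method strip are made up for this example, and Pattern.CASE_INSENSITIVE is assumed so that "A" and "The" are matched as well.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
class ArticleStripper {
    // \b is a zero-width word boundary, so the neighbouring space or
    // punctuation is not consumed and adjacent words can each match.
    static final Pattern ARTICLES =
            Pattern.compile("\\b(?:a|the)\\b", Pattern.CASE_INSENSITIVE);

    // Replaces every standalone "a" or "the" with a single space.
    static String strip(String input) {
        return ARTICLES.matcher(input).replaceAll(" ");
    }

    public static void main(String[] args) {
        String[] samples = {
            "A test of the exp.",
            "A test of the a exp.",
            "The a in this test is a the."
        };
        for (String s : samples) {
            // Brackets make leading and trailing spaces visible.
            System.out.println("[" + strip(s) + "]");
        }
    }
}
```

Each removed word leaves its replacement space next to the spaces that were already there, so the output contains runs of spaces. If that matters, something like strip(s).replaceAll("\\s+", " ").trim() should tidy it up.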
Or, simplifying @Robokop's solution:
Pattern.compile("(\\b(the|a)\\b)",Pattern.CASE_INSENSITIVE);
(In Java the pattern has to be a double-quoted string with the backslashes doubled, as above. A single-quoted '\b(the|a)\b' will not compile.)
Pattern.compile("(\\bthe\\b)|(\\ba\\b)",Pattern.CASE_INSENSITIVE);
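To see why the pattern from the question skips consecutive words, it may help to list what each pattern actually matches. The sketch below is only an illustration; MatchSpans, CONSUMING, BOUNDARY and matches are names invented here. For "A test of the a exp." the question's pattern matches "A " and " the ", so the space in front of the following "a" has already been consumed and its (\W|\A) part has nothing left to match. The \b version matches "A", "the" and "a".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper; all names here are illustrative.
class MatchSpans {
    // Pattern from the question: \W consumes the separator characters.
    static final String CONSUMING =
            "(((\\W|\\A)the(\\W|\\Z))|((\\W|\\A)a(\\W|\\Z)))";
    // Word-boundary version: \b consumes nothing.
    static final String BOUNDARY = "\\b(the|a)\\b";

    // Returns every substring matched by regex in input, in order.
    static List<String> matches(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "A test of the a exp.";
        System.out.println(matches(CONSUMING, input));
        System.out.println(matches(BOUNDARY, input));
    }
}
```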