How to modify this regular expression to be case insensitive while searching for curse words?
At the moment, this profanity filter finds darn and golly but not Darn or Golly or DARN or GOLLY.
List<String> bannedWords = Arrays.asList("darn", "golly", "gosh");
StringBuilder re = new StringBuilder();
for (String bannedWord : bannedWords)
{
if (re.length() > 0)
re.append("|");
String quotedWord = Pattern.quote(bannedWord);
re.append(quotedWord);
}
inputString = inputString.replaceAll(re.toString(), "[No cursing please!]");
How can it be modified to be case insensitive?
Start the expression with (?i). I.e., change re.toString() to "(?i)" + re.toString().
From the documentation of Pattern:
(?idmsux-idmsux)    Nothing, but turns match flags i d m s u x on - off
where i is the CASE_INSENSITIVE flag.
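As a rough illustration of the (?i) prefix, here is a minimal sketch built around the loop from the question. The class name InlineFlagFilter and the helper methods buildRegex and censor are made up for this example and are not part of the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical wrapper class, named only for this example.
class InlineFlagFilter {
    // Joins the quoted words with "|" and prepends the inline (?i) flag.
    static String buildRegex(List<String> bannedWords) {
        StringBuilder re = new StringBuilder();
        for (String bannedWord : bannedWords) {
            if (re.length() > 0) {
                re.append("|");
            }
            re.append(Pattern.quote(bannedWord));
        }
        return "(?i)" + re.toString();
    }

    // Replaces every banned word, in any letter case, with the notice.
    static String censor(String input, List<String> bannedWords) {
        return input.replaceAll(buildRegex(bannedWords), "[No cursing please!]");
    }

    public static void main(String[] args) {
        List<String> bannedWords = Arrays.asList("darn", "golly", "gosh");
        // prints: [No cursing please!] it, [No cursing please!]!
        System.out.println(censor("Darn it, GOLLY!", bannedWords));
    }
}
```

Because the flag sits at the very start of the pattern, it covers every alternative, including the literal text produced by Pattern.quote.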
You need to set the CASE_INSENSITIVE flag, or simply add (?i) to the beginning of your regex.
StringBuilder re = new StringBuilder("(?i)");
You'll also need to change your conditional to
if (re.length() > 4)
because the builder now starts out holding the four characters of (?i).
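For reference, the seeded-builder variant might look like the sketch below. The class SeededBuilderFilter and the constant FLAG are invented for the example. Comparing against FLAG.length() instead of a literal 4 is one way to keep the condition from drifting if the prefix ever changes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical class, named only for this example.
class SeededBuilderFilter {
    static final String FLAG = "(?i)"; // four characters

    static String buildRegex(List<String> bannedWords) {
        StringBuilder re = new StringBuilder(FLAG);
        for (String bannedWord : bannedWords) {
            // Only add a separator once something follows the flag prefix.
            if (re.length() > FLAG.length()) {
                re.append("|");
            }
            re.append(Pattern.quote(bannedWord));
        }
        return re.toString();
    }

    public static void main(String[] args) {
        List<String> bannedWords = Arrays.asList("darn", "golly", "gosh");
        String inputString = "Oh Gosh, DARN it";
        System.out.println(inputString.replaceAll(buildRegex(bannedWords), "[No cursing please!]"));
    }
}
```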
Setting the flag via @ratchetFreak's answer is probably best, however. It allows for your condition to stay the same (which is more intuitive) and gives you a clear idea of what's going on in the code.
For more info, see this question and in particular this answer, which gives a decent explanation of using regexes in Java.
Use a precompiled java.util.regex.Pattern:
Pattern p = Pattern.compile(re.toString(), Pattern.CASE_INSENSITIVE); // do this only once
inputString = p.matcher(inputString).replaceAll("[No cursing please!]");
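One way to get the "do this only once" behaviour is to keep the compiled pattern in a static final field, as in the sketch below. The class ProfanityFilter, the field BANNED and the methods compile and censor are hypothetical names chosen for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical class, named only for this example.
class ProfanityFilter {
    // Compiled once when the class is loaded, then reused for every call.
    private static final Pattern BANNED =
            compile(Arrays.asList("darn", "golly", "gosh"));

    static Pattern compile(List<String> bannedWords) {
        StringBuilder re = new StringBuilder();
        for (String bannedWord : bannedWords) {
            if (re.length() > 0) {
                re.append("|");
            }
            re.append(Pattern.quote(bannedWord));
        }
        // The flag is passed separately, so the regex text itself is unchanged.
        return Pattern.compile(re.toString(), Pattern.CASE_INSENSITIVE);
    }

    static String censor(String inputString) {
        return BANNED.matcher(inputString).replaceAll("[No cursing please!]");
    }

    public static void main(String[] args) {
        // prints: [No cursing please!] and [No cursing please!]
        System.out.println(censor("DARN and Golly"));
    }
}
```

Note that CASE_INSENSITIVE on its own only folds US-ASCII letters. If the word list ever contains non-ASCII characters, it would probably need to be combined with Pattern.UNICODE_CASE.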