Java regex: check if word has non alphanumeric characters

This is my code to determine if a word contains any non-alphanumeric characters:

  String term = "Hello-World";
  boolean found = false;
  Pattern p = Pattern.Compile("\\W*");
  Matcher m = p.Matcher(term);
  if(matcher.find())
    found = true;

I am wondering if the regex is wrong. I know "\W" matches any non-word character. Any idea what I am missing?


Change your regex to:

.*\\W+.*


This is the expression you are looking for:

"^[a-zA-Z0-9]+$"

When it evaluates to false, the string did not match, which means you found what you wanted: it contains at least one non-alphanumeric character (or it is empty).
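A minimal sketch of that negated whole-string check, assuming the pattern is applied with matches(). The class and method names (AlnumCheck, hasNonAlphanumeric) are made up for illustration:

import java.util.regex.Pattern;

public class AlnumCheck {
    // Matches only strings made up entirely of ASCII letters and digits.
    private static final Pattern ALNUM_ONLY = Pattern.compile("^[a-zA-Z0-9]+$");

    // Hypothetical helper: true when the whole-string match fails.
    static boolean hasNonAlphanumeric(String term) {
        return !ALNUM_ONLY.matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasNonAlphanumeric("Hello-World"));  // true
        System.out.println(hasNonAlphanumeric("HelloWorld42")); // false
        // The empty string also fails the match, because + needs one character.
        System.out.println(hasNonAlphanumeric(""));             // true
    }
}

If an empty string should not count as a hit, check term.isEmpty() separately before calling the helper.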


It's 2016 or later, and you should think about international strings from alphabets other than just Latin. The frequently cited [^a-zA-Z] does not work in that case, because it treats every non-Latin letter as non-alphabetic. There are better ways in Java now:

[^\\p{IsAlphabetic}\\p{IsDigit}]

See the reference (section "Classes for Unicode scripts, blocks, categories and binary properties"). There's also this answer that I found helpful.
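As a rough sketch of how the Unicode-aware class could be used with find(): the class is written here with a single leading ^, since a ^ anywhere else inside the brackets is a literal caret rather than a negation. The names UnicodeAlnumCheck and hasNonAlphanumeric are made up for illustration:

import java.util.regex.Pattern;

public class UnicodeAlnumCheck {
    // Any single character that is neither alphabetic nor a digit in Unicode terms.
    private static final Pattern NON_ALNUM =
            Pattern.compile("[^\\p{IsAlphabetic}\\p{IsDigit}]");

    // Hypothetical helper: true if at least one such character occurs.
    static boolean hasNonAlphanumeric(String term) {
        return NON_ALNUM.matcher(term).find();
    }

    public static void main(String[] args) {
        System.out.println(hasNonAlphanumeric("Hello-World")); // true
        // "caf\u00e9" ends in e-acute, which is alphabetic but not in a-z.
        System.out.println(hasNonAlphanumeric("caf\u00e9"));   // false
        System.out.println(hasNonAlphanumeric("a^b"));         // true
    }
}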


Methods are in the wrong case.

The matcher was declared as m but used as matcher.

The repetition should be "one or more" (+) instead of "zero or more" (*). This works correctly:

String term = "Hello-World";
boolean found = false;
Pattern p = Pattern.compile("\\W+");//<-- compile( not Compile(
Matcher m = p.matcher(term);  //<-- matcher( not Matcher
if(m.find()) {  //<-- m not matcher
    found = true;
}

Btw, it would be enough to just write:

boolean found = m.find();

:)


The problem is the '*'. '*' matches ZERO or more characters, so \W* also matches the empty string and find() succeeds on any input. You want to match at least one non-word character, so you must use '+' as the quantifier. Hence match \W+ (capital W there for NON-word).
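To see the difference between the two quantifiers, here is a small sketch that runs both patterns through find(). The class name QuantifierDemo and the helper finds are made up for illustration:

import java.util.regex.Pattern;

public class QuantifierDemo {
    // Hypothetical helper: does the regex match anywhere in the term?
    static boolean finds(String regex, String term) {
        return Pattern.compile(regex).matcher(term).find();
    }

    public static void main(String[] args) {
        // \W* can match zero characters, so it "finds" an empty match at index 0.
        System.out.println(finds("\\W*", "HelloWorld"));  // true
        // \W+ needs at least one non-word character.
        System.out.println(finds("\\W+", "HelloWorld"));  // false
        System.out.println(finds("\\W+", "Hello-World")); // true
    }
}

Note that \W treats the underscore as a word character, so "Hello_World" would not be reported by \W+.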


Your expression does not take account of possible non-English letters. It's also more complicated than it needs to be. Unless you are using regexes for some reason other than need (such as your professor having told you to), you are much better off with:

boolean found = false;
for (int i=0;i<mystring.length();++i) {
  if (!Character.isLetterOrDigit(mystring.charAt(i))) {
    found=true;
    break;
  }
}


When I had to do this same thing, the regex I used was "(\w)*". I'm not sure whether capital W is the same, but I also used parentheses.


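If you are okay with using Apache StringUtils, then it's as simple as the following. It returns true when the string contains only letters and digits, so negate it to detect a non-alphanumeric character: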

StringUtils.isAlphanumeric(inp)


if (value.matches(".*[^a-zA-Z0-9].*")) { // tested, seems to work.
    System.out.println("match");
} else {
    System.out.println("no match");
}