Java regex error
Whenever I enter the following...
Pattern pmessage = Pattern.compile("\s*\p{Alnum}[\p{Alnum}\s]*");
Matcher mmessage = pmessage.matcher(message);
Matcher msubject = pmessage.matcher(subject);
I get an Invalid Escape Sequence
error. Does anyone have any idea why this happens and how I can fix it?
For a version of \p{Alpha}
that works on the Java native character set, instead of being stuck unable to process anything other than legacy data from the 1960s, you need to use
alphabetics = "[\\pL\\pM\\p{Nl}]";
For a version of numerics in the same sense, you have to choose which of these you want:
ASCII_digits = "[0-9]";
all_numbers = "\\pN";
decimal_numbers = "\\p{Nd}";
because which one applies varies depending on circumstances. We’ll assume you copied one of those three to a numerics
variable.
Assuming you then want alphanumerics based on the definition above, you could then write:
alphanumerics = "[" + alphabetics + numerics + "]";
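As a rough sketch of how the strings above compose, the following wraps them in a small class. The class name, method name, and the choice of "\\p{Nd}" for numerics are assumptions made for illustration; only the pattern strings come from the answer.

```java
import java.util.regex.Pattern;

// Illustrative names only; the pattern strings are the ones given above.
final class UnicodeAlnumDemo {
    static final String ALPHABETICS = "[\\pL\\pM\\p{Nl}]";
    // Assumption: decimal digits were chosen from the three numeric options.
    static final String NUMERICS = "\\p{Nd}";
    // Java allows a bracketed class nested inside another one (a union).
    static final String ALPHANUMERICS = "[" + ALPHABETICS + NUMERICS + "]";

    private static final Pattern ONE_OR_MORE = Pattern.compile(ALPHANUMERICS + "+");

    // True if the whole string consists of one or more alphanumerics.
    static boolean isAlphanumeric(String s) {
        return ONE_OR_MORE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAlphanumeric("h\u00e9llo42")); // accented letter plus ASCII digits
        System.out.println(isAlphanumeric("\u0663"));       // ARABIC-INDIC DIGIT THREE, category Nd
        System.out.println(isAlphanumeric("abc!"));         // '!' is punctuation, so no match
    }
}
```

By contrast, \p{Alnum} without the UNICODE_CHARACTER_CLASS flag only covers ASCII letters and digits, which is the limitation this answer is working around.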
However, if what you mean by alphanumerics is the \w
sense of program identifiers, you have to add a few more categories:
identifier_chars = "[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}[\\p{InEnclosedAlphanumerics}&&\\p{So}]]";
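A minimal sketch of that identifier class in use follows. The class and method names are hypothetical; the pattern string is copied from the line above. It checks that connector punctuation such as the underscore is accepted while a hyphen is not.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper around the identifier_chars pattern from the answer.
final class IdentifierCharsDemo {
    static final String IDENTIFIER_CHARS =
        "[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}[\\p{InEnclosedAlphanumerics}&&\\p{So}]]";

    private static final Pattern IDENTIFIER = Pattern.compile(IDENTIFIER_CHARS + "+");

    // True if every character of s belongs to the identifier class.
    static boolean isIdentifierLike(String s) {
        return IDENTIFIER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isIdentifierLike("foo_bar1")); // '_' is connector punctuation (Pc)
        System.out.println(isIdentifierLike("foo-bar"));  // '-' is dash punctuation (Pd), not in the class
    }
}
```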
This issue is discussed at length in this answer, where you’ll also find a link to some alpha code of mine that does these transforms for you automatically. I hope to get a chance to rewrite it to take up less space this weekend.
Double each backslash: Pattern.compile("\\s*\\p{Alnum}[\\p{Alnum}\\s]*")
Backslashes inside string literals have a special meaning, and have to be doubled in order for an actual backslash character to become part of the string (which is what your regex example requires).
Keep in mind that backslashes are special characters in Java strings and need to be escaped with an additional backslash:
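To make the doubling concrete, here is a small sketch using the pattern from the question. The class name, method name, and sample inputs are made up for illustration. It shows that the literal "\\s" is a two-character string (a backslash followed by s), which is what the regex engine needs to see.

```java
import java.util.regex.Pattern;

// Illustrative class; only the pattern itself comes from the question.
final class EscapeDemo {
    // Each regex backslash is written twice in the Java source.
    static final Pattern MESSAGE =
        Pattern.compile("\\s*\\p{Alnum}[\\p{Alnum}\\s]*");

    // True if s is optional whitespace, one ASCII alphanumeric,
    // then any mix of ASCII alphanumerics and whitespace.
    static boolean isValid(String s) {
        return MESSAGE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println("\\s".length());            // 2: a backslash and the letter s
        System.out.println(isValid("  Hello world 42")); // true
        System.out.println(isValid("   "));              // false: no alphanumeric character
        System.out.println(isValid("abc!"));             // false: '!' is not allowed
    }
}
```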
Pattern.compile("\\s*\\p{Alnum}[\\p{Alnum}\\s]*");
You didn't correctly escape your "\" characters: in Java, "\\s" will give you \s, so you should write:
Pattern.compile("\\s*\\p{Alnum}[\\p{Alnum}\\s]*");