Regex to match all of a set except certain ones
I'm sure this has been asked before, but I can't seem to find it (or know the proper wording to search for)
Basically I want a regex that matches all non-alphanumeric characters except hyphens. So basically match \W+ but exclude '-'. I'm not sure how to exclude specific characters from a predefined set.
\W is shorthand for [^\w]. So:
[^\w-]+
A bit of background:
- […] defines a set
- [^…] negates a set
- Generally, every \v (lowercase) set is negated by \V (uppercase), where v is any letter that defines a set.
- For international characters, you may want to look into [[:alpha:]] and [[:alnum:]] (in regex flavors that support POSIX bracket expressions)
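To make the difference concrete, here is a minimal Java sketch that compares \W+ with [^\w-]+ on a made-up sample string. The class name, the helper method and the input are illustrative assumptions, not something from the original answer.

```java
// Hypothetical demo class; the name and the sample input are illustrative only.
final class NegatedSetDemo {
    // Replace every run of characters matched by the regex with a single space.
    static String collapse(String regex, String input) {
        return input.replaceAll(regex, " ");
    }

    public static void main(String[] args) {
        String sample = "foo-bar, baz!";
        // \W+ also matches the hyphen, so "foo-bar" gets split apart.
        System.out.println(collapse("\\W+", sample));
        // [^\w-]+ adds the hyphen to the negated set, so it is left alone.
        System.out.println(collapse("[^\\w-]+", sample));
    }
}
```

The hyphen sits at the end of the bracket expression so that it is read as a literal character rather than as a range operator.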
[^\w-]+
will do just that: it matches one or more consecutive characters that are neither in the \w set nor a hyphen.
You can use:
[^a-zA-Z0-9_-]
or
[^\w-]
to match a single character that is neither a hyphen nor a word character (alphanumeric or underscore). To match one or more of them, append a + to the set.
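As a rough illustration of the single-character form versus the + form, the following Java sketch counts matches for each pattern. The class name, the helper and the sample string are assumptions made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name and the sample input are illustrative only.
final class QuantifierDemo {
    // Count how many separate matches the regex finds in the input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String sample = "a-b!!c_d??";
        // Without a quantifier, each offending character is its own match.
        System.out.println(countMatches("[^\\w-]", sample));
        // The explicit ASCII set behaves the same way on this input.
        System.out.println(countMatches("[^a-zA-Z0-9_-]", sample));
        // With +, each run of offending characters is one match.
        System.out.println(countMatches("[^\\w-]+", sample));
    }
}
```

On this input the hyphen and the underscore are never matched, because both are excluded by the set.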
In Java 7 or above, you need to prepend the (?U) flag (UNICODE_CHARACTER_CLASS) so that \w also covers non-ASCII letters and digits, e.g.
(?U)[^\w-]
In a Java string literal (you need to escape each \ character with another one):
(?U)[^\\w-]
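A small sketch of what the flag changes, assuming a made-up input that contains an accented letter (written as the escape \u00e9 to avoid source-encoding issues). The class and method names are hypothetical.

```java
// Hypothetical demo class; the name and the sample input are illustrative only.
final class UnicodeFlagDemo {
    // Delete every run of characters matched by the regex.
    static String strip(String regex, String input) {
        return input.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String sample = "caf\u00e9-au-lait!";
        // Default \w is ASCII-only, so the accented letter is treated as a
        // non-word character and removed along with the '!'.
        System.out.println(strip("[^\\w-]+", sample));
        // With (?U), \w follows Unicode rules, so the accented letter is kept
        // and only the '!' is removed.
        System.out.println(strip("(?U)[^\\w-]+", sample));
    }
}
```

The same effect can be had by passing Pattern.UNICODE_CHARACTER_CLASS to Pattern.compile instead of embedding (?U) in the pattern.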