
I need to upgrade one of my regular expressions

Currently I use the following regular expression to validate a textArea in JSF:

"^([a-zA-Z0-9]+[a-zA-Z0-9 ]+$)?"

It allows multiple words and both uppercase and lowercase characters, but that is still not enough, so I need to extend it. It should also allow a few special characters. Do you have any idea how I could tune it to:

-Allow the following 4 characters , . ; :

-Also allow special letters from a non-English alphabet. These are the letters that are needed: Đ đ Ž ž Ć ć Č č Š š

I configured my web app to use UTF-8. If the regular expression could just allow those special letters, that would be great, because there would be less coding to validate each field every time.


Just add them to the character classes, the parts enclosed in []:

"^([a-zA-Z0-9,.;:ĐđŽžĆćČčŠš]+[a-zA-Z0-9 ,.;:ĐđŽžĆćČčŠš]+$)?"

Apart from your question, a suggestion for a performance improvement: the first part is presumably there so that the text may start with any allowed character except a space. Since that is a special case for the first character only, remove the + sign after the first character class. That way it matches exactly one character, and the characters that follow are matched by the second part anyway.

"^([a-zA-Z0-9,.;:ĐđŽžĆćČčŠš][a-zA-Z0-9 ,.;:ĐđŽžĆćČčŠš]+$)?"
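For reference, here is a minimal Java sketch of how a pattern like this could be checked on the server side. The class name TextAreaValidator and the method isValid are made up for illustration. The ten extra letters are written as Unicode escapes, a choice made here so that the example does not depend on the encoding of the source file. It also assumes the whole value must match, which is why it calls matches().

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of JSF: checks a value against the
// character-class pattern suggested above.
final class TextAreaValidator {

    // The ten extra letters (D with stroke, Z caron, C acute, C caron,
    // S caron, each in upper and lower case) as Unicode escapes.
    private static final String EXTRA_LETTERS =
            "\u0110\u0111\u017D\u017E\u0106\u0107\u010C\u010D\u0160\u0161";

    // First class: one allowed character that is not a space.
    // Second class: one or more allowed characters, space included.
    private static final Pattern ALLOWED = Pattern.compile(
            "^([a-zA-Z0-9,.;:" + EXTRA_LETTERS + "]"
            + "[a-zA-Z0-9 ,.;:" + EXTRA_LETTERS + "]+$)?");

    private TextAreaValidator() {
    }

    // Returns true only if the entire input matches the pattern.
    static boolean isValid(String input) {
        return ALLOWED.matcher(input).matches();
    }
}
```

Like the original expression, this pattern accepts an empty value but rejects a single character, because the second class still requires at least one character. Changing its + to * would allow one-character input.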


If the special characters are all from the same Unicode block, you can match them with an expression such as \p{InGreek}, replacing Greek with the block the characters come from. You can also use a negative lookahead to prevent matching a leading space. This would make the regex:

^(?! )[\p{Alnum}\p{InLatinExtendedA},.;: ]+$
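A rough Java sketch of the lookahead variant, in case it helps to see it in context. The class name LeadingSpaceCheck and the method isValid are hypothetical. The block is spelled InLatin_Extended_A here, mirroring the Character.UnicodeBlock.LATIN_EXTENDED_A constant; that spelling is a choice made for this example, and the block-name spellings that are accepted differ between regex engines.

```java
import java.util.regex.Pattern;

// Hypothetical helper: rejects a leading space with a negative lookahead
// and allows letters from the Latin Extended-A block.
final class LeadingSpaceCheck {

    // (?! ) fails the match when the first character is a space.
    // \p{Alnum} is ASCII letters and digits in Java's default mode.
    private static final Pattern NO_LEADING_SPACE = Pattern.compile(
            "^(?! )[\\p{Alnum}\\p{InLatin_Extended_A},.;: ]+$");

    private LeadingSpaceCheck() {
    }

    // Returns true only if the entire input matches the pattern.
    static boolean isValid(String input) {
        return NO_LEADING_SPACE.matcher(input).matches();
    }
}
```

Keep in mind that the block covers every character from U+0100 to U+017F, so this version accepts more letters than the ten listed in the question. It also rejects an empty value, unlike the original expression.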

If you'd rather not fail fast on a leading space, as your comments suggest, you can use this regex to trim leading and trailing whitespace as well:

^\s*([\p{Alnum}\p{InLatinExtendedA},.;: ]+?)\s*$

The first capturing group will be the valid string without leading or trailing whitespace.
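As an illustration, reading that capturing group in Java could look roughly like the sketch below. TrimmedValue and extract are made-up names, and the block name is written as InLatin_Extended_A (after the Character.UnicodeBlock.LATIN_EXTENDED_A constant), which is an assumption of this example rather than the spelling used above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: validates a value and returns it without
// leading or trailing whitespace.
final class TrimmedValue {

    // The lazy quantifier +? lets the trailing \s* take the final whitespace,
    // so group 1 ends at the last non-whitespace character.
    private static final Pattern TRIMMING = Pattern.compile(
            "^\\s*([\\p{Alnum}\\p{InLatin_Extended_A},.;: ]+?)\\s*$");

    private TrimmedValue() {
    }

    // Returns group 1 when the whole input matches, otherwise an empty Optional.
    static Optional<String> extract(String input) {
        Matcher m = TRIMMING.matcher(input);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

One caveat: because the space is part of the character class, an input made up only of spaces still matches, and the group then holds a single space. A separate emptiness check may be needed if that case should be rejected.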
