How to match characters from all languages, except special characters, in Ruby

I have a display name field that I need to validate with a Ruby regex. It has to accept letters from all languages, such as French, Arabic, Chinese, German and Spanish, in addition to English, but reject special characters like *()!@#$%^&.... I am stuck on how to match the non-Latin characters.


There are two possibilities:

  • Create a regex with a negated character class containing every symbol you don't want to match:

    if name =~ /\A[^*!@%\^]+\z/ # list every symbol to reject; a match means none of them occur
    

    This solution may not be feasible, since there is a huge number of symbols you would have to list, even if you only included the most common ones.


  • Use Oniguruma (see also: Oniguruma for Ruby main), which supports Unicode character properties. With it, a name made up only of letters and combining marks can be matched using:

    if name =~ /\A[\p{L}\p{M}]+\z/
    

    You can see what these are all about here: Unicode Regular Expressions
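Both possibilities only validate the whole name when the pattern is anchored with \A and \z; without anchors, the regex succeeds as soon as any single character matches. A minimal sketch of the two approaches side by side, assuming Ruby 2.4 or later (for `match?`) and UTF-8 strings. The constant and method names are hypothetical, and the symbol list is only the one from the question:

```ruby
# Hypothetical helpers; names and the symbol list are illustrative only.

# Approach 1: denylist. Single quotes keep "#$" from being interpolated,
# and Regexp.escape makes every symbol safe inside the character class.
FORBIDDEN = '*()!@#$%^&'
DENYLIST_RE = /\A[^#{Regexp.escape(FORBIDDEN)}]+\z/

# Approach 2: allowlist of Unicode letters (L) and combining marks (M).
LETTERS_RE = /\A[\p{L}\p{M}]+\z/

def passes_denylist?(name)
  name.match?(DENYLIST_RE)
end

def letters_only?(name)
  name.match?(LETTERS_RE)
end

["José", "可口可樂", "bad*name", "name?"].each do |name|
  puts "#{name.inspect}: denylist=#{passes_denylist?(name)} letters=#{letters_only?(name)}"
end
```

Note that "name?" passes the denylist because "?" was never listed, while the property-based pattern rejects it. That is the practical argument for the second approach.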


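Starting from Ruby 1.9, the String and Regexp classes are encoding-aware. Be careful with the word character selector \w, though: in current Ruby versions it matches only ASCII word characters ([a-zA-Z0-9_]). For Unicode-aware matching, use the POSIX bracket expression [[:word:]] (or \p{Word}) instead:

"可口可樂!?!".gsub(/[[:word:]]/, 'Ha')
#=> "HaHaHaHa!?!"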


In Ruby versions later than 1.9.1 (maybe earlier), you can use \p{L} to match letters in all languages. The oniguruma gem described in a previous answer is not needed, because the engine is built into Ruby 1.9 and later.
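Real display names usually contain more than letters, so \p{L} alone tends to be too strict. A rough sketch of a validator built on it, assuming Ruby 2.4 or later and UTF-8 input. The method name `valid_display_name?` and the choice of allowed separators (space, apostrophe, hyphen) are assumptions, not something stated in the question:

```ruby
# Hypothetical validator; the allowed separators are an assumption.
# One or more letter runs, joined by a single space, apostrophe or hyphen.
DISPLAY_NAME_RE = /\A[\p{L}\p{M}]+(?:[ '\-][\p{L}\p{M}]+)*\z/

def valid_display_name?(name)
  name.match?(DISPLAY_NAME_RE)
end

["Renée Dupont", "محمد", "王小明", "Jürgen", "O'Neil", "user@example", "a*b"].each do |name|
  puts "#{name.inspect}: #{valid_display_name?(name)}"
end
```

Including \p{M} alongside \p{L} keeps names valid when accents arrive as separate combining marks rather than precomposed characters.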
