
What is a regex for Twitter-like names?

I have been coding for a while but never needed regular expressions until recently. I need to write a regular expression that accepts usernames the way Twitter does. Basically, I want to allow one underscore at a time: a name can contain more than one underscore, but underscores must not be consecutive. Alphanumeric characters are also allowed, but a name cannot start with a digit.

Names such as

are valid but

  • 94myname
  • __myname
  • my__name
  • my name

are not valid.

I have played with Rubular and come up with a couple of regexes:

  • /^[^0-9\s+](_?[a-z0-9]+_?)+$/i
  • /^([a-z_?])+$/i

The problem I keep running into is that these also match names with consecutive underscores.


Edited

a = %w[
    _myname67
    myname67
    my_name
    _my_67_name_
    94myname
    __myname
    my__name
    my\ name
    m_yname
]

p a.select{|name| name =~ /\A_?[a-z]_?(?:[a-z0-9]_?)*\z/i}
# => ["_myname67", "myname67", "my_name", "_my_67_name_", "m_yname"]

You should use ( ) only for substrings that you want to capture. (?: ) is for groupings that you do not want to capture. It is good practice to use it whenever you do not need to refer back to that substring. It can also make the regex slightly faster, because the engine does not have to store the captured text.
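To make the difference concrete, here is a minimal sketch comparing a capturing and a non-capturing group. The variable names and sample strings are only illustrative and do not come from the answer above.

```ruby
# Capturing groups: each ( ) stores its matched substring for later use.
capturing = /\A(_?)([a-z]+)\z/i
m = capturing.match("_myname")
leading = m[1]   # the optional leading underscore
letters = m[2]   # the run of letters

# Non-capturing group: (?: ) only groups for repetition; nothing is stored.
non_capturing = /\A_?(?:[a-z0-9]_?)*\z/i
n = non_capturing.match("my_name")
group_count = n.captures.length   # no captures are recorded
```

With the non-capturing form the match still succeeds, but `n.captures` is empty, which is all you need when you only want a yes/no answer.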


Try the following ^([a-zA-Z](_?[a-zA-Z0-9]+)*_?|_([a-zA-Z0-9]+_?)*)$

I've separated two cases: the name starts with a letter, or it starts with an underscore. If you don't want to allow names consisting of a single character, replace the * with +.
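As a quick check, the pattern can be run against the sample names from this thread. This is only a sketch; `pattern`, `names`, and `accepted` are illustrative variable names.

```ruby
# The two-branch pattern from the answer above, unchanged.
pattern = /^([a-zA-Z](_?[a-zA-Z0-9]+)*_?|_([a-zA-Z0-9]+_?)*)$/

names = %w[_myname67 myname67 my_name _my_67_name_ m_yname 94myname __myname my__name]
names << "my name"   # added separately because it contains a space

# Keep only the names the pattern accepts.
accepted = names.select { |name| name.match?(pattern) }
p accepted
```

Note that the underscore branch places no restriction on the character after the leading underscore, so a name such as `_9name` is also accepted. Whether that is acceptable depends on how strictly "a name cannot start with a digit" is meant.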

maerics's solution has one problem: it doesn't match names that have _ in the second position, such as m_yname.


Some things are really hard to express using only regular expressions, and the result is generally write-only (that is, there's no way to read and understand it later). You can use a simpler regexp (like the two you managed to write) and check for double underscores in your Ruby code. It doesn't hurt:

if username =~ /^[^0-9](_?[a-z0-9]+_?)+$/i && !username.include?('__') then ...


This seems to work:

/^(_|([a-z]_)|[a-z])([a-z0-9]+_?)*$/i

Updates: corrected for numeral constraints and case.


/^[A-Za-z_]([A-Za-z0-9]+_?)+$/


Some problems can't be solved with just one regular expression... especially when you want to check for the absence of a pattern as well as the presence of another pattern.

Sometimes it is better (and definitely more readable) to break your conditions down into multiple regular expressions and match against each of them in turn.

In addition to your regular expressions to check for valid characters, you should also use a regular expression to check for the presence of two underscores, and then INVERT that result (that is, throw out the name if it MATCHES the pattern).
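A minimal sketch of this two-check approach follows. The method name `valid_username?` and both patterns are assumptions made for illustration, not code from the answer. `\A` and `\z` anchor the whole string, as in the first answer.

```ruby
# Hypothetical helper: one positive check and one inverted check.
def valid_username?(name)
  # Check 1: only letters, digits, and underscores, with no leading digit.
  allowed = /\A[a-z_][a-z0-9_]*\z/i
  # Check 2: the forbidden pattern, two underscores in a row.
  double_underscore = /__/

  # Accept only if check 1 matches and check 2 does NOT match.
  name.match?(allowed) && !name.match?(double_underscore)
end

p valid_username?("my_name")
p valid_username?("my__name")
```

Each pattern stays short enough to read at a glance, and the rule about consecutive underscores is stated once, on its own line.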
