What should be the valid characters in usernames? [closed]

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center. Closed 10 years ago.

Many web-based user authentication systems don't allow usernames that contain characters other than letters, numbers and underscores.

Could there be a technical reason for that?


A well-designed system doesn't necessarily need to prevent any special characters in usernames.

That said, the reason underscores have traditionally been accepted is that the underscore is typically treated as a "word" character, along with letters and digits. It is usually the only other character given this distinction. This is true in regular expressions, where \w matches letters, digits and the underscore, and even at a base level in most operating systems (type an underscore in a word and double-click the letters: the selection will extend past the underscore. Now try the same with a dash; it most likely will not.)
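To see the "word character" distinction in a regex engine, here is a small sketch in Python. The helper name split_words is made up for illustration, and re.ASCII is passed so that \w means only [a-zA-Z0-9_]:

```python
import re

# \w matches letters, digits and the underscore; a dash is not a word character.
# re.ASCII restricts \w to [a-zA-Z0-9_] (hypothetical choice for this sketch).
WORD = re.compile(r"\w+", re.ASCII)

def split_words(text: str) -> list:
    """Return the runs of word characters found in text."""
    return WORD.findall(text)

print(split_words("user_name"))  # ['user_name']
print(split_words("user-name"))  # ['user', 'name']
```

The underscore keeps the name in one piece, while the dash splits it in two, which mirrors the double-click behaviour described above.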


Yes: to avoid having to escape special characters. Lazy programmers will just drop whatever the user types straight into the code somewhere, and that is what leads to injection attacks.
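For comparison, here is roughly what the non-lazy version looks like for SQL, as a sketch using Python's built-in sqlite3 module. The table name and the add_user / user_exists helpers are invented for this example. The point is only that a placeholder passes the username as data, so awkward characters never get parsed as SQL:

```python
import sqlite3

# In-memory database with a hypothetical users table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY)")

def add_user(name: str) -> None:
    # The ? placeholder binds the value as data, so quotes and semicolons
    # in the username are stored literally instead of being executed.
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

def user_exists(name: str) -> bool:
    row = conn.execute("SELECT 1 FROM users WHERE name = ?", (name,)).fetchone()
    return row is not None

nasty = "o'brien; DROP TABLE users;--"
add_user(nasty)
print(user_exists(nasty))  # True
```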

Even if it's not used maliciously, allowing the user to type characters that will conflict somewhere else can be more hassle than necessary. For example, if you decide to create a filesystem directory per user, to store their uploads in, then the username must conform to directory naming rules on that OS (e.g. no \/:*?"<>| on Windows).
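A quick sketch of that directory check in Python, using only the character list quoted above. The function names are hypothetical, and this does not cover every naming rule an OS may have, just the forbidden characters:

```python
# The characters cited above as not allowed in Windows names.
WINDOWS_FORBIDDEN = set('\\/:*?"<>|')

def forbidden_chars(username: str) -> set:
    """Return the characters in username that appear in the forbidden set."""
    return set(username) & WINDOWS_FORBIDDEN

def is_safe_dir_name(username: str) -> bool:
    # An empty name is rejected too, since it cannot name a directory.
    return bool(username) and not forbidden_chars(username)

print(is_safe_dir_name("alice_01"))  # True
print(is_safe_dir_name("al/ice"))    # False
```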

Once you've avoided clashes like the directory naming one, and stripped out "';% and // to avoid injection attacks, you have removed most punctuation, at which point the question becomes: why does someone even need punctuation in their username?

It was far easier to write a quick regex to validate usernames against [a-zA-Z0-9_] and be done with it than to faff about with figuring out all the possible punctuation that will not clash, or mapping it to other characters in some way.
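The "quick regex" really is quick. A minimal sketch in Python (the function name is made up; the character class is the one above, with + added so that at least one character is required):

```python
import re

# Letters, digits and underscore only, one or more characters.
USERNAME = re.compile(r"[a-zA-Z0-9_]+")

def is_valid_username(name: str) -> bool:
    # fullmatch requires the entire string to match, not just a prefix.
    return USERNAME.fullmatch(name) is not None

print(is_valid_username("john_doe42"))  # True
print(is_valid_username("john-doe"))    # False
```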

Then, like many things in computing, once enough systems allowed only letters, numbers and underscores, and people started making usernames to that spec, it became the de facto standard and now self-perpetuates!


When the rules are not specified, I use this:

(updated regex to fix the backtracking @abney317 mentioned)

^\w(?:\w|[.-](?=\w)){3,31}$

(original regex)

^\w(?:\w*(?:[.-]\w+)?)*(?<=^.{4,32})$

This requires a length of at least 4 and at most 32 characters. The username must start and end with a word character, and it can contain dots and dashes as long as no two of them are adjacent. The only reason I use this is because it's strict enough to integrate with almost anything :)

Valid:

test.tost

Invalid:

test..tost
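If you want to try the updated regex, here is a sketch in Python. The is_valid wrapper is hypothetical. Two caveats: Python's re module only accepts fixed-width lookbehind, so the original regex will not compile there, and in Python 3 \w also matches non-ASCII word characters unless you pass re.ASCII.

```python
import re

# The updated pattern from above, unchanged.
USERNAME = re.compile(r"^\w(?:\w|[.-](?=\w)){3,31}$")

def is_valid(name: str) -> bool:
    # fullmatch is used because $ alone would also accept a trailing newline.
    return USERNAME.fullmatch(name) is not None

print(is_valid("test.tost"))   # True
print(is_valid("test..tost"))  # False
print(is_valid("abc"))         # False (too short)
print(is_valid("test."))       # False (must end with a word character)
```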


Limiting it to these characters (or even the ASCII subset of them) prevents usernames like
