Validate a name in Python
For an internationalised project, I have to validate the overall syntax of a name (first, last) with Python. But the lack of support for Unicode character classes is really making things harder.
Is there any regex or library to do that?
Examples:
Björn, Anne-Charlotte, توماس, 毛, or מיק must be accepted. -Björn, Anne--Charlotte, Tom_ or entries like that should be rejected.
Is there any simple way to do that?
Thanks.
Python does support Unicode in regular expressions if you specify the re.UNICODE flag (in Python 3 this is already the default for str patterns). You can probably use something like this:
r'^[^\W_]+(-[^\W_]+)?$'
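To make the pattern easier to read, here is the same expression spelled out with re.VERBOSE. This is only a sketch, assuming Python 3; the name NAME_RE is made up for illustration. The key point is that \w covers letters, digits and the underscore, so [^\W_] means "any word character except the underscore".

```python
import re

# NAME_RE is an illustrative name, not something from the original answer.
NAME_RE = re.compile(r"""
    ^
    [^\W_]+            # first part: one or more word characters, but no "_"
    ( - [^\W_]+ )?     # optional second part, introduced by exactly one hyphen
    $
""", re.UNICODE | re.VERBOSE)

for candidate in ["Björn", "Anne-Charlotte", "-Björn", "Anne--Charlotte", "Tom_"]:
    print(candidate, NAME_RE.match(candidate) is not None)
```

Note that the optional group occurs at most once, so a name with two hyphens (three parts) would be rejected by this pattern as written.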
Test code:
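# -*- coding: utf-8 -*-
import re

names = [
    u'Björn',
    u'Anne-Charlotte',
    u'توماس',
    u'毛',
    u'מיק',
    u'-Björn',
    u'Anne--Charlotte',
    u'Tom_',
]

# Compile once, outside the loop; re.U makes \W Unicode-aware on Python 2.
regex = re.compile(r'^[^\W_]+(-[^\W_]+)?$', re.U)

for name in names:
    # Parenthesised single-argument print works on both Python 2 and 3.
    print(u'{0:20} {1}'.format(name, regex.match(name) is not None))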
Result:
Björn                True
Anne-Charlotte       True
توماس                True
毛                    True
מיק                  True
-Björn               False
Anne--Charlotte      False
Tom_                 False
If you also want to disallow digits in names, then change [^\W_] to [^\W\d_] in both places.
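As a rough sketch of that digit-free variant, the check could be wrapped in a small helper. This assumes Python 3.4 or later; the function name is_valid_name and the allow_digits parameter are hypothetical, not part of any library. It uses re.fullmatch instead of ^...$ because $ would also accept a single trailing newline.

```python
import re

# Hypothetical names for illustration; the two character classes are the
# ones discussed above.
_PART_WITH_DIGITS = r"[^\W_]+"
_PART_NO_DIGITS = r"[^\W\d_]+"


def is_valid_name(name, allow_digits=False):
    """Return True if name is one part, or two parts joined by one hyphen."""
    part = _PART_WITH_DIGITS if allow_digits else _PART_NO_DIGITS
    pattern = r"{0}(-{0})?".format(part)
    # fullmatch requires the whole string to match, trailing newline included.
    return re.fullmatch(pattern, name) is not None


if __name__ == "__main__":
    for candidate in ["Björn", "Anne-Charlotte", "毛", "Tom2", "Tom_"]:
        print(candidate, is_valid_name(candidate))
```

With the default allow_digits=False, an entry such as Tom2 is rejected; passing allow_digits=True gives the behaviour of the original pattern.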