
Is there a Python/Django function to check whether a string contains only characters supported by my database collation?

As expected, I get an error when entering some characters not included in my database collation:

(1267, "Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='")

Is there any function I could use to make sure a string only contains characters existing in my database collation?

Thanks.


You can use a regular expression to allow only certain characters. The following allows only ASCII letters, digits, and the underscore (_), but you can change the character class to include whatever you want:

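import re

# ASCII letters, digits and underscore only; extend the character class as needed.
exp = r'^[A-Za-z0-9_]+$'
match = re.match(exp, my_string)  # my_string is the value to validate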

re.match() returns a match object if the string is valid and None if it contains any other character.
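A minimal sketch of the same check wrapped in a helper, written for Python 3. The name is_allowed is hypothetical. It uses re.fullmatch() instead of re.match() with ^...$, because $ also matches just before a trailing newline, so a value such as 'abc\n' would otherwise pass:

```python
import re

# Compile once; the character class is the same whitelist as above.
ALLOWED = re.compile(r'[A-Za-z0-9_]+')

def is_allowed(text):
    """Return True only if every character of text is in the whitelist."""
    return ALLOWED.fullmatch(text) is not None
```

For example, is_allowed('user_42') is True, while is_allowed('caf\u00e9') and is_allowed('') are both False.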


I'd look at Python's unicode.translate() and codecs.encode() functions. Both allow more elegant handling of illegal input characters than a whitelist regexp, and IIRC translate() has been shown to be faster than a regexp for similar use cases (the findings should be easy to google).

From Python's docs:

"For Unicode objects, the translate() method does not accept the optional deletechars argument. Instead, it returns a copy of the s where all characters have been mapped through the given translation table which must be a mapping of Unicode ordinals to Unicode ordinals, Unicode strings or None. Unmapped characters are left untouched. Characters mapped to None are deleted. Note, a more flexible approach is to create a custom character mapping codec using the codecs module (see encodings.cp1251 for an example)."

http://docs.python.org/library/stdtypes.html

http://docs.python.org/library/codecs.html
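A rough sketch of both ideas in Python 3, where unicode is simply str. It assumes the column's character set is MySQL's latin1 and that Python's cp1252 codec is a close enough approximation of it; that mapping is an assumption to verify against your own database. The helper names and the substitution table are hypothetical:

```python
def fits_charset(text, encoding='cp1252'):
    """Return True if every character of text can be encoded in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

def replace_unsupported(text, encoding='cp1252'):
    """Replace characters the encoding cannot represent with '?'."""
    return text.encode(encoding, errors='replace').decode(encoding)

# translate() maps Unicode ordinals to strings, other ordinals, or None.
# Illustrative table only: rewrite one character, delete another.
SUBSTITUTIONS = {
    ord('\u2192'): '->',  # rightwards arrow becomes ASCII
    ord('\u2603'): None,  # snowman is deleted
}

def normalize(text):
    """Apply the substitution table; unmapped characters are left untouched."""
    return text.translate(SUBSTITUTIONS)
```

fits_charset() can be used to reject input before it reaches the query, while replace_unsupported() and normalize() clean the input instead of rejecting it. Which one is appropriate depends on whether silently altering user input is acceptable.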
