How to escape Unicode characters to symbol entity names in Python?
What I want to achieve is:
Í -> &Iacute;
ø -> &oslash;
ñ -> &ntilde;
...
Is there a standard way to do this in Python, or do I have to create my own dictionary and use it to escape the characters manually?
I found a lot of hints for the other direction here on SO, but none that answers my question.
You're looking for htmlentitydefs:
In [217]: import htmlentitydefs
In [224]: ['&'+htmlentitydefs.codepoint2name[ord(x)]+';' for x in u'Íøñ']
Out[224]: ['&Iacute;', '&oslash;', '&ntilde;']
Try this:
import htmlentitydefs

def EscapeUnicode(character):
    # Raises KeyError if the character has no named entity.
    return "&%s;" % htmlentitydefs.codepoint2name[ord(character)]
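Both answers above use the Python 2 module name. In Python 3 the same table lives in html.entities, so something along these lines should work there. This is only a sketch: the function name escape_to_entities is made up for illustration, and falling back to a numeric reference for characters without a named entity is my own assumption about what you would want, not something the question asked for.

```python
from html.entities import codepoint2name


def escape_to_entities(text):
    """Replace non-ASCII characters with named HTML entities where one exists."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            # Plain ASCII is left untouched (assumption: markup characters
            # such as & and < are handled elsewhere, e.g. by html.escape).
            parts.append(ch)
        elif code in codepoint2name:
            parts.append("&%s;" % codepoint2name[code])
        else:
            # No named entity available: fall back to a numeric reference.
            parts.append("&#%d;" % code)
    return "".join(parts)


print(escape_to_entities("Íøñ"))  # &Iacute;&oslash;&ntilde;
```

If numeric references alone are good enough, "Íøñ".encode("ascii", "xmlcharrefreplace") does that without any lookup table, but it will not give you the entity names.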