Python: getting \u00bd to display correctly in an editor
I would like to do the following: 1) Serialize my class 2) Also manually edit the serialization dump file to remove certain objects of my class which I find unnecessary.
I am currently using Python with simplejson. As you know, simplejson converts all characters to Unicode. As a result, when I dump a particular object with simplejson, the Unicode characters become something like "\u00bd" for 好.
I would like to edit the simplejson file manually for convenience. Does anyone here know a workaround for me to do this?
My requirements for this serialization format: 1) Easy to use (just dump and load - done) 2) Allows me to edit the files manually without much hassle 3) Able to display Chinese characters
I use vim. Does anyone know a way to convert "\u00bd" to 好 in vim?
I don't know anything about simplejson or the Serialisation part of the question, but you asked about converting "\u00bd" to 好 in Vim. Here are some vim tips for working with unicode:
You'll need the correct encoding set up in vim, see:
:help 'encoding'
:help 'fileencoding'
Entering unicode characters by number is simply a case of going into insert mode, pressing Ctrl-V and then typing u followed by the four-digit number (or U followed by an 8-digit number). See:
:help i_CTRL-V_digit
Also bear in mind that in order for the character to display correctly in Vim, you'll need a fixed-width font containing that character. It appears as a wide space in Envy Code R and as various boxes in Lucida Console, Consolas and Courier New.
To replace \uXXXX with unicode character XXXX (where X is any hexadecimal digit), type this when in normal mode (where <ENTER> means press the ENTER key, don't type it literally):
:%s/\\u\x\{4\}/\=eval('"' . submatch(0) . '"')/g<ENTER>
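If you would rather do the same replacement outside vim, here is a minimal Python sketch of the idea. It assumes Python 3, and the helper name unescape_unicode is made up for illustration. Like the vim command above, it only looks for a backslash, a u and four hex digits, so it does not join surrogate pairs and it would also rewrite a sequence that was itself escaped (such as \\u00bd).

```python
import re

# A literal backslash, the letter u, then exactly four hexadecimal digits.
ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")

def unescape_unicode(text):
    """Replace each \\uXXXX escape in text with the character it names.

    Hypothetical helper for illustration; surrogate pairs are not combined.
    """
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

# A raw string, so the backslashes reach the function unchanged.
print(unescape_unicode(r'{"half": "\u00bd", "good": "\u597d"}'))
```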
Note however that u00bd appears to be unicode character ½ (1/2 in case that character doesn't display correctly on your screen), not the 好 character you mentioned (which is u597D I think). See this unicode table. Start vim and type these characters (where <Ctrl-V> is produced by holding CTRL, pressing V, releasing V and then releasing CTRL):
i<Ctrl-V>u00bd
You should see a small character looking like 1/2, assuming your font supports that character.
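A quick way to double-check which code point belongs to which character is to ask Python's unicodedata module. This is only a sketch, assuming Python 3; the describe helper is a made-up name for illustration.

```python
import unicodedata

def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch))

# The character from the question's dump, then the character the asker expected.
print(describe("\u00bd"))
print(describe("好"))
```

The first line reports the one-half fraction and the second reports code point U+597D, which matches the guess above.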
If you want json/simplejson to produce unicode output instead of str output with Unicode escapes then you need to pass ensure_ascii=False to dump()/dumps(), then either encode before saving or use a file-like object from codecs.
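Roughly, that looks like the sketch below. It assumes Python 3 and the standard-library json module, which takes the same ensure_ascii argument as simplejson. The sample data and the dump.json file name are made up, and io.open with an encoding stands in here for the codecs file-like object.

```python
import io
import json
import os
import tempfile

data = {"name": "好", "half": "½"}  # made-up sample data

# Default behaviour: non-ASCII characters come out as \uXXXX escapes.
escaped = json.dumps(data)

# ensure_ascii=False keeps the characters themselves; indent makes
# the dump easier to edit by hand.
readable = json.dumps(data, ensure_ascii=False, indent=2)

# Write the text as UTF-8 so an editor such as vim shows the real characters.
path = os.path.join(tempfile.mkdtemp(), "dump.json")  # hypothetical file name
with io.open(path, "w", encoding="utf-8") as f:
    f.write(readable)

# Loading the (possibly hand-edited) file back gives the same objects.
with io.open(path, "r", encoding="utf-8") as f:
    loaded = json.load(f)

print(escaped)
print(readable)
```

When you open the file in vim, make sure 'fileencoding' is utf-8 so the characters survive the round trip.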