How does CouchDB handle UTF-8?
I'm quite puzzled by CouchDB: if I send a PUT request with some JSON string fields encoded as UTF-8, the non-7-bit-ASCII characters get converted to "\uXXXX" escape sequences. Is there any way to tell it not to escape Unicode?
Those \uXXXX sequences are a correct way of representing non-ASCII Unicode characters in JSON (and in JavaScript string literals).
Since CouchDB is accessed using JSON (i.e. JavaScript data), those sequences are turned back into the original characters by whatever JSON parser consumes the data, so this should not be a problem.
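To illustrate why the escapes are harmless, here is a minimal sketch using Python's standard json module (not CouchDB itself; the sample document and variable names are made up for illustration). It shows that the escaped and the raw UTF-8 forms decode to the same string, and that an encoder may legitimately emit either form.

```python
import json

# The same (hypothetical) document written two ways:
# with a \uXXXX escape and with the raw character.
escaped = '{"name": "caf\\u00e9"}'
raw = '{"name": "café"}'

decoded_escaped = json.loads(escaped)
decoded_raw = json.loads(raw)

# Both forms decode to the identical value.
same = decoded_escaped == decoded_raw

# Python's encoder can produce either output style, much like the
# escaping discussed above: ASCII-only escapes by default, or the
# characters themselves when ensure_ascii is turned off.
ascii_out = json.dumps(decoded_raw)
utf8_out = json.dumps(decoded_raw, ensure_ascii=False)

print(same)       # True
print(ascii_out)  # {"name": "caf\u00e9"}
print(utf8_out)   # {"name": "café"}
```

Any client that parses the response with a conforming JSON parser gets the original characters back, so the escaping only matters if you read the raw response body by eye.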
CouchDB uses mochiweb to handle JSON encoding and decoding.
The encoding routine has an option (utf8) that tells it to output the characters themselves instead of those \uXXXX escapes.
A simple way to apply the patch:
- Get the CouchDB source.
- Edit src/mochiweb/mochijson2.erl.
- Find this line, around line 45:
    -record(encoder, {handler=null, utf8=false}).
- Change utf8=false to utf8=true.
- Run make clean; make; make install.
I found a discussion with Chris Anderson (http://erlangine.feautec.pp.ru/?p=232), and it suggests there is a chance of getting this behavior out of the box if someone cares to write a proper patch for CouchDB.