JSON encoding wrongly escaped (Rails 3, Ruby 1.9.2)
In my controller, the following works (prints "oké"):
puts obj.inspect
But this doesn't (renders "ok\u00e9"):
render :json => obj
Apparently the to_json method escapes Unicode characters. Is there an option to prevent this?
To convert the \uXXXX codes back to UTF-8 (this handles one four-digit escape at a time, so it does not recombine surrogate pairs):
json_string.gsub!(/\\u([0-9a-fA-F]{4})/) { |s| [$1.to_i(16)].pack("U") }
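If characters outside the Basic Multilingual Plane matter (emoji, for example), the one-liner above needs to recombine surrogate pairs. Here is a minimal sketch of that; unescape_unicode is a made-up helper name, not part of Rails or the json library, and it assumes well-formed input. It will also wrongly touch a literal backslash sequence such as \\u00e9, so treat it as an illustration rather than a full JSON unescaper.

```ruby
# Hypothetical helper: turn JSON \uXXXX escapes back into raw UTF-8.
# A high surrogate followed by a low surrogate is combined into one code point.
def unescape_unicode(json_string)
  json_string.gsub(/\\u([dD][89abAB]\h{2})\\u([dD][c-fC-F]\h{2})|\\u(\h{4})/) do
    if $3
      # Plain four-digit escape
      [$3.hex].pack("U")
    else
      # Surrogate pair: rebuild the original code point
      high = $1.hex - 0xD800
      low  = $2.hex - 0xDC00
      [0x10000 + (high << 10) + low].pack("U")
    end
  end
end

# Single quotes keep the backslash literal, like JSON received over the wire.
puts unescape_unicode('"ok\u00e9"')   # prints "oké" (including the quotes)
```

Unlike gsub!, this returns a new string, so it never returns nil when there is nothing to replace.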
You can prevent it by monkey patching the method mentioned by muu is too short. Put the following into config/initializers/patches.rb (or a similar file used for patches) and restart your Rails process for the change to take effect.
module ActiveSupport::JSON::Encoding
  class << self
    def escape(string)
      if string.respond_to?(:force_encoding)
        string = string.encode(::Encoding::UTF_8, :undef => :replace).force_encoding(::Encoding::BINARY)
      end
      json = string.gsub(escape_regex) { |s| ESCAPED_CHARS[s] }
      json = %("#{json}")
      json.force_encoding(::Encoding::UTF_8) if json.respond_to?(:force_encoding)
      json
    end
  end
end
Be advised that there's no guarantee the patch will work with future versions of ActiveSupport. The version used when writing this post was 3.1.3.
If you dig through the source you'll eventually come to ActiveSupport::JSON::Encoding and the escape method:
def escape(string)
  if string.respond_to?(:force_encoding)
    string = string.encode(::Encoding::UTF_8, :undef => :replace).force_encoding(::Encoding::BINARY)
  end
  json = string.
    gsub(escape_regex) { |s| ESCAPED_CHARS[s] }.
    gsub(/([\xC0-\xDF][\x80-\xBF]|
           [\xE0-\xEF][\x80-\xBF]{2}|
           [\xF0-\xF7][\x80-\xBF]{3})+/nx) { |s|
      s.unpack("U*").pack("n*").unpack("H*")[0].gsub(/.{4}/n, '\\\\u\&')
    }
  json = %("#{json}")
  json.force_encoding(::Encoding::UTF_8) if json.respond_to?(:force_encoding)
  json
end
The gsub calls do the escaping: the first handles the characters in ESCAPED_CHARS, and the second forces every non-ASCII UTF-8 sequence into the \uXXXX notation that you're seeing. The \uXXXX escapes should be acceptable to anything that processes JSON, but you could always post-process the JSON (or monkey patch in a modified JSON escaper) to convert the \uXXXX notation to raw UTF-8 if necessary.
I'd agree that forcing JSON to be 7bit-clean is a bit bogus but there you go.
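To see what the inner block of that second gsub does, here is the same unpack/pack chain run step by step on a single character in plain Ruby, outside of Rails. The variable names are mine, and I've dropped the /n flag since the intermediate string is plain hex digits. The sketch only exercises a character in the Basic Multilingual Plane.

```ruby
# Step through the chain used inside ActiveSupport's escape method.
code_points = "é".unpack("U*")         # UTF-8 string -> array of code points, here [233]
utf16_bytes = code_points.pack("n*")   # each code point as a 16-bit big-endian integer
hex         = utf16_bytes.unpack("H*")[0]   # hex digits of those bytes, here "00e9"

# Prefix every group of four hex digits with a literal \u
escaped = hex.gsub(/.{4}/, '\\\\u\&')

puts escaped   # prints \u00e9
```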
Short answer: no.
Characters were not escaped to Unicode with the other methods in Rails 2.3.11 / Ruby 1.8, so I used the following:
render :json => JSON::dump(obj)
That is the correct encoding. JSON doesn't require Unicode characters to be escaped, but it is common for JSON libraries to produce output which contains only 7-bit ASCII characters, to avoid any potential encoding problems in transit.
Any JSON interpreter will be able to consume that string and reproduce the original. To see this in action, just type javascript:alert("ok\u00e9") into your browser's location bar.
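The same check can be done from Ruby with the standard json library. This is just a quick sketch; the strings are wrapped in an array so that it also works with older versions of the library that reject a bare top-level string.

```ruby
require 'json'

# Single quotes keep the backslash literal, so this is the escaped JSON text.
escaped = '["ok\u00e9"]'
# The same document with the character written as raw UTF-8.
raw     = '["oké"]'

parsed_escaped = JSON.parse(escaped).first
parsed_raw     = JSON.parse(raw).first

puts parsed_escaped                 # prints oké
puts parsed_escaped == parsed_raw   # prints true
```

Both forms decode to the same string, so a client that uses a real JSON parser should not care which one the server sends.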
render :json will call .to_json on the object if it's not a string. You can avoid this problem by doing:
render :json => JSON.generate(obj)
This passes a string directly to render and therefore bypasses the call to ActiveSupport's to_json.
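For reference, here is what the json library produces on its own, outside of Rails. By default JSON.generate leaves non-ASCII characters as raw UTF-8. As an aside, the generator also accepts an :ascii_only option that should give you the \uXXXX form back if you ever want it; obj here is just a stand-in hash.

```ruby
require 'json'

obj = { "status" => "oké" }   # stand-in for the object rendered in the controller

# Default behaviour: non-ASCII characters are emitted as raw UTF-8.
puts JSON.generate(obj)   # prints {"status":"oké"}

# With :ascii_only the generator escapes non-ASCII characters instead.
puts JSON.generate(obj, :ascii_only => true)
```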
Another approach would be to override to_json on the object you are serializing, so in that case, you could do something like:
class Foo < ActiveRecord::Base
  def to_json(options = {})
    JSON.generate(as_json)
  end
end
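The same idea can be tried without a database. The sketch below uses a plain Ruby class instead of an ActiveRecord model, with a hand-written as_json standing in for the one Rails would provide; both the class and its attribute are invented for the example.

```ruby
require 'json'

# Plain Ruby stand-in for a model; in Rails, as_json comes from ActiveModel.
class Foo
  def initialize(name)
    @name = name
  end

  # Hash representation of the object (hypothetical attribute).
  def as_json(options = {})
    { "name" => @name }
  end

  # Hand the hash to the json library directly, so nothing gets escaped.
  def to_json(*args)
    JSON.generate(as_json)
  end
end

puts Foo.new("oké").to_json   # prints {"name":"oké"}
```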
And if you use ActiveModelSerializers, you can solve this problem by overriding to_json in your serializer:
# controller
respond_with foo, :serializer => MySerializer
# serializer
class MySerializer < ActiveModel::Serializer
  attributes :bar, :baz
  def to_json(options = {})
    JSON.generate(serializable_hash)
  end
end
I have a rather hacky way to solve this problem. If to_json does not give you the output you want, you can render the string directly:
render text: tags
render json: tags or render json: tags.to_json will always convert the encoding, but with render text: tags the string stays as it is. I think jQuery can still recognize the data.