开发者

JSON encoding wrongly escaped (Rails 3, Ruby 1.9.2)

In my controller, the following works (prints "oké")

puts obj.inspect

But this doesn't (renders "ok\u00e9")

render :json => obj

Apparently the to_json method escapes unicode characters. Is there an option to pre开发者_开发百科vent this?


To set the \uXXXX codes back to utf-8:

json_string.gsub!(/\\u([0-9a-z]{4})/) {|s| [$1.to_i(16)].pack("U")}


You can prevent it by monkey patching the method mentioned by muu is too short. Put the following into config/initializers/patches.rb (or similar file used for patching stuff) and restart your rails process for the change to take affect.

module ActiveSupport::JSON::Encoding
  class << self
    def escape(string)
      if string.respond_to?(:force_encoding)
        string = string.encode(::Encoding::UTF_8, :undef => :replace).force_encoding(::Encoding::BINARY)
      end
      json = string.gsub(escape_regex) { |s| ESCAPED_CHARS[s] }
      json = %("#{json}")
      json.force_encoding(::Encoding::UTF_8) if json.respond_to?(:force_encoding)
      json
    end
  end
end

Be adviced that there's no guarantee that the patch will work with future versions of ActiveSupport. The version used when writing this post is 3.1.3.


If you dig through the source you'll eventually come to ActiveSupport::JSON::Encoding and the escape method:

def escape(string)
  if string.respond_to?(:force_encoding)
    string = string.encode(::Encoding::UTF_8, :undef => :replace).force_encoding(::Encoding::BINARY)
  end
  json = string.
    gsub(escape_regex) { |s| ESCAPED_CHARS[s] }.
    gsub(/([\xC0-\xDF][\x80-\xBF]|
           [\xE0-\xEF][\x80-\xBF]{2}|
           [\xF0-\xF7][\x80-\xBF]{3})+/nx) { |s|
    s.unpack("U*").pack("n*").unpack("H*")[0].gsub(/.{4}/n, '\\\\u\&')
  }
  json = %("#{json}")
  json.force_encoding(::Encoding::UTF_8) if json.respond_to?(:force_encoding)
  json
end

The various gsub calls are forcing non-ASCII UTF-8 to the \uXXXX notation that you're seeing. Hex encoded UTF-8 should be acceptable to anything that processes JSON but you could always post-process the JSON (or monkey patch in a modified JSON escaper) to convert the \uXXXX notation to raw UTF-8 if necessary.

I'd agree that forcing JSON to be 7bit-clean is a bit bogus but there you go.

Short answer: no.


Characters were not escaped to unicode with the other methods in Rails2.3.11/Ruby1.8 so I used the following:

render :json => JSON::dump(obj)


That is the correct encoding. JSON doesn't requre Unicode characters to be escaped, but it is common for JSON libraries to produce output which contains only 7-bit ASCII characters, to avoid any potential encoding problems in transit.

Any JSON interpreter will be able to consume that string and reproduce the original. To see this in action, just type javascript:alert("ok\u00e9") into your browser's location bar.


render :json will call .to_json on the object if it's not a string. You can avoid this problem by doing:

render :json => JSON.generate(obj)

This will by pass a string directly and therefore avoid the call to ActiveSupport's to_json.

Another approach would be to override to_json on the object you are serializing, so in that case, you could do something like:

class Foo < ActiveRecord::Base
  def to_json(options = {})
    JSON.generate(as_json)
  end
end

And if you use ActiveModelSerializers, you can solve this problem by overriding to_json in your serializer:

# controller
respond_with foo, :serializer => MySerializer

# serializer
attributes :bar, :baz

def to_json(options = {})
  JSON.generate(serializable_hash)
end


I have got a very tricky way to solve this problem. Well, if to_json did not allow you to have the correct code, then you could directly try to write :

render text: tags

render json: tags or render json: tags.to_json will always auto transfer the encoding style, but if you use render text:tags, then the string will stay as it is. And I think jQuery could still recognize the data.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜