Handling string encoding with the same code in Ruby 1.8 and 1.9
I've got a gem that's used by a bunch of people running a bunch of different Ruby interpreters, and it includes what boils down to this code:
res = RestClient.post(...)
doc = REXML::Document.new(res).root
The content of res is always UTF-8, and this works fine in Ruby 1.8, but it blows up under Ruby 1.9 if the response is not pure ASCII and the user's default encoding is not UTF-8.
Now, if I wanted to make this work on Ruby 1.9 alone, I'd just stick res.force_encoding('utf-8') in there and be done with it, but that method is 1.9-only and breaks under Ruby 1.8:
NoMethodError: undefined method `force_encoding' for #<String:0x101318178>
The best solution I can come up with is this, which forces the systemwide default encoding to UTF-8:
Encoding.default_external = 'UTF-8' if defined? Encoding
Better ideas, or is this as good as it gets? Will there be any negative impact on library users who are trying to use different encodings?
# force_encoding only exists on Ruby 1.9+, so check for it first
if res.respond_to?(:force_encoding)
  new_contents = res.force_encoding("UTF-8")
else
  new_contents = res
end
I'd do something like that for backwards compatibility.
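To avoid repeating that check at every call site, the same idea can be folded into a small helper. This is only a sketch: the name ensure_utf8 is made up for illustration and is not part of the gem or of RestClient, and the sample string merely simulates a response body whose bytes are UTF-8 but which arrived untagged.

```ruby
# Hypothetical helper (name is an assumption): tag a string as UTF-8 where the
# interpreter supports it. Note that force_encoding modifies the string it is
# called on and returns that same string.
def ensure_utf8(str)
  # String#force_encoding exists on Ruby 1.9+. On 1.8 strings are plain byte
  # sequences, so there is nothing to tag and the string is returned as-is.
  str.respond_to?(:force_encoding) ? str.force_encoding("UTF-8") : str
end

# Simulated response body: valid UTF-8 bytes for "café", tagged as binary.
raw = "caf\xC3\xA9".dup
raw = raw.force_encoding("BINARY") if raw.respond_to?(:force_encoding)

body = ensure_utf8(raw)
```

On 1.9 and later, body is then tagged as UTF-8 without any of its bytes being changed; on 1.8 the helper is a no-op.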
I'm with Mike Lewis in using respond_to?, but don't do it on the variable res everywhere throughout your code.
I took a look at your code in gateway.rb, and it looks like everywhere you use res, it gets set by a call to make_api_request, so you could add this before the return statement in that method:
doc = doc.force_encoding("UTF-8") if doc.respond_to?(:force_encoding)
Even if it happens in other places too, as long as it's not literally every string you encounter, I'm sure you can find a way to refactor the code so that the problem is solved in one place instead of everywhere it shows up.
Are you having the problem in other places?
As far as I can see from the snippet, the cause of the problem is RestClient, which doesn't return the string in the proper encoding (the one specified in the HTTP response), so I'd first try to get that problem fixed. If that can't be done, then you could wrap the RestClient calls with your own code that forces the encoding (the way Mike Lewis suggested). Or are you experiencing the problem in places other than RestClient calls as well?
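A rough sketch of what wrapping the calls could look like, so the encoding fix lives in one place. The wrapper name utf8_body is hypothetical, and the lambda below is only a stand-in for a real HTTP call so the snippet runs without the rest-client gem. It also assumes that calling to_s on the response yields the body text.

```ruby
# Hypothetical wrapper (name is an assumption): run any request block and
# return its body as a string tagged UTF-8 where the interpreter supports it.
def utf8_body
  # Copy the body so the caller's response object is not modified in place.
  body = yield.to_s.dup
  body.respond_to?(:force_encoding) ? body.force_encoding("UTF-8") : body
end

# Stand-in for an HTTP call: returns UTF-8 bytes that are tagged as binary.
fake_request = lambda do
  s = "<name>Z\xC3\xBCrich</name>".dup
  s.respond_to?(:force_encoding) ? s.force_encoding("BINARY") : s
end

xml = utf8_body { fake_request.call }
# In the gem this would wrap the real call, e.g. utf8_body { RestClient.post(...) }
```

The block form keeps the wrapper independent of how the request is made, so the same method would cover get, post, or any other call that returns the response text.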
Does it work if you include an # encoding: utf-8 header in the particular file that uses this method?
Ruby 1.9 supports different encodings throughout the application, and it should work fine if this content is UTF-8 encoded.
Ruby 1.8 would simply ignore the # encoding header and keep on working nicely.
It's a very simple approach, but I believe it deserves a try!
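One thing worth checking when trying this: the magic comment sets the encoding of string literals written in that source file, while a string that comes from elsewhere keeps whatever tag it was given. The sketch below illustrates the difference; the from_wire string is only a stand-in for bytes handed back by a library, not real RestClient output.

```ruby
# encoding: utf-8
# The magic comment above is read by Ruby 1.9+ and is an ordinary comment on 1.8.

literal = "é" # a literal typed in this file
# Stand-in for data received from a library: the same two UTF-8 bytes.
from_wire = "\xC3\xA9".dup

if defined?(Encoding)
  # Simulate a library handing back an untagged (binary) string.
  from_wire.force_encoding("BINARY")

  literal_encoding = literal.encoding   # follows the source file's encoding
  wire_encoding    = from_wire.encoding # unaffected by the magic comment
  same_bytes       = literal.bytes.to_a == from_wire.bytes.to_a
end
```

So if the header alone does not help, the response string probably still needs to be tagged explicitly.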