Ruby: ARGV breaks accented characters
# encoding: utf-8
foo = "Résumé"
p foo
> "Résumé"
# encoding: utf-8
ARGV.each do |argument|
  p argument
end
test.rb Résumé > "R\xE9sum\xE9"
Why does this occur, and how can I get ARGV to return "Résumé"?
I have chcp 65001 set already and am using ruby 1.9.2p290 (2011-07-09) [i386-mingw32]
EDIT: After asking around on IRC, I was told to run chcp 1252>NUL, which fixed the problem.
For some reason, Windows isn't using UTF-8 in your console. So, although Ruby expects a UTF-8 encoded string, it receives a Windows-1252 encoded one (the byte \xE9 is "é" in Windows-1252).
So you have several possibilities (which I can't test as I, fortunately, don't use Windows):
- Persuade Windows to use UTF-8 in your console. I don't know if chcp should work and, if so, why it doesn't.
- Tell Ruby to use Windows-1252 instead of UTF-8 as default
- Convert ARGV from Windows-1252 to UTF-8 manually:
Example:
>> argument = "R\xE9sum\xE9"
=> "R\xE9sum\xE9"
>> argument.force_encoding('windows-1252').encode('utf-8')
=> "Résumé"
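The same conversion can be applied to every argument at once. The following is only a sketch: decode_args is a hypothetical helper name (not part of Ruby), and it assumes the console really delivers Windows-1252 bytes, as in the question.

```ruby
# encoding: utf-8

# Hypothetical helper (not part of Ruby): relabel each argument's raw bytes
# as Windows-1252, then transcode them to UTF-8.
# Assumption: the console hands Ruby Windows-1252 bytes.
def decode_args(args, source_encoding = 'Windows-1252')
  args.map do |arg|
    # dup so the original (possibly frozen) string is left untouched
    arg.dup.force_encoding(source_encoding).encode('UTF-8')
  end
end

# Simulated ARGV entry carrying the bytes shown in the question.
raw = ["R\xE9sum\xE9"]
decoded = decode_args(raw)

p decoded.first.encoding
p decoded.first == "Résumé"
```

In a real script you would pass ARGV instead of the simulated array. Bytes that have no Windows-1252 mapping may make encode raise an error; passing :undef => :replace to encode is one way to tolerate them.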