Ruby doesn't treat Hebrew letters well
I am trying to read an XML file that also contains Hebrew letters. Its content is:
<?xml version="1.0" encoding="UTF-8"?>
<keywords type="array">
<keyword>seo software</keyword>
<keyword>ipad</keyword>
<keyword>muffuletta manhattanization</keyword>
<keyword>cheap motels</keyword>
<keyword>שפות תכנות</keyword>
</keywords>
And my code to do it is:
# encoding: UTF-8
def use
  #require "rexml/document"
  file = File.new( "sources/rankabove-test.xml" )
  puts file.read
end
However, this doesn't work for me: the output of the 'puts' command is gibberish where the Hebrew letters should be:
╫⌐╫ñ╫ץ╫¬ ╫¬╫¢╫á╫ץ╫¬
I am using Windows XP 32-bit. Is anyone familiar with this problem? Is there anything I can do?
I don't think the problem is Ruby:
# encoding: UTF-8
puts RUBY_VERSION
# >> 1.9.2
xml = '
<?xml version="1.0" encoding="UTF-8"?>
<keywords type="array">
<keyword>seo software</keyword>
<keyword>ipad</keyword>
<keyword>muffuletta manhattanization</keyword>
<keyword>cheap motels</keyword>
<keyword>שפות תכנות</keyword>
</keywords>
'
require 'nokogiri'
doc = Nokogiri::XML(xml)
puts doc.search('//keyword').last.text
# >> שפות תכנות
require "rexml/document"
require 'rexml/node'
require 'rexml/xpath'
doc = REXML::Document.new(xml)
puts REXML::XPath.match(doc, '//keyword').last.text
# >> שפות תכנות
Using both Nokogiri and REXML I get the same output on Mac OS.
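One possible explanation, offered as a guess rather than a diagnosis: the gibberish looks like what you get when correct UTF-8 bytes are displayed by a console using the Hebrew OEM code page 862. The sketch below tries to reproduce that by reinterpreting the UTF-8 bytes as IBM862. The code page is an assumption (check yours with chcp), and for_console is a hypothetical helper name, not part of Ruby.

```ruby
# encoding: UTF-8

original = "שפות תכנות"

# Reinterpret the raw UTF-8 bytes as code page 862 (assumed console code
# page), then convert back to UTF-8 so the result can be printed anywhere.
garbled = original.dup.force_encoding("IBM862").encode("UTF-8")
puts garbled

# Hypothetical helper: transcode text into the console's code page before
# printing. Characters the code page cannot represent become "?".
def for_console(text, codepage = "IBM862")
  text.encode(codepage, :undef => :replace, :replace => "?")
end

console_text = for_console(original)
puts console_text.encoding   # the encoding the console would receive
puts console_text.bytesize   # single-byte code page: one byte per character
```

If the printed garbled string matches what you see on Windows, Ruby is reading the file correctly and only the console display is off. In that case, printing for_console(file.read) with the file opened as UTF-8, or switching the console code page, may be enough.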