Ruby doesn't handle Hebrew letters well

I am trying to read an XML file that also contains Hebrew letters. Its content is:

<?xml version="1.0" encoding="UTF-8"?>
<keywords type="array">
  <keyword>seo software</keyword>
  <keyword>ipad</keyword>
  <keyword>muffuletta manhattanization</keyword>
  <keyword>cheap motels</keyword>
  <keyword>שפות תכנות</keyword>
</keywords>

And my code to do it is:

# encoding: UTF-8
def use
  #require "rexml/document"
  file = File.new("sources/rankabove-test.xml")
  puts file.read
end

However, this doesn't help: the output of the 'puts' command is gibberish in place of the Hebrew letters:

╫⌐╫ñ╫ץ╫¬ ╫¬╫¢╫á╫ץ╫¬

I am using Windows XP (32-bit). Is anyone familiar with this problem? Is there anything I can do?


I don't think the problem is Ruby:

# encoding: UTF-8

puts RUBY_VERSION
# >> 1.9.2

xml = '
<?xml version="1.0" encoding="UTF-8"?>
<keywords type="array">
  <keyword>seo software</keyword>
  <keyword>ipad</keyword>
  <keyword>muffuletta manhattanization</keyword>
  <keyword>cheap motels</keyword>
  <keyword>שפות תכנות</keyword>
</keywords>
'

require 'nokogiri'

doc = Nokogiri::XML(xml)
puts doc.search('//keyword').last.text
# >> שפות תכנות

require "rexml/document"
require 'rexml/node'
require 'rexml/xpath'

doc = REXML::Document.new(xml)
puts REXML::XPath.match(doc, '//keyword').last.text
# >> שפות תכנות

Using both Nokogiri and REXML I get the same output on Mac OS.
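
One way to check whether the terminal, rather than Ruby, is responsible is sketched below. It assumes the Windows console is decoding output as code page 862 (which Ruby names IBM862); that is a guess based on the shape of the gibberish, not something stated in the question. The variable names are only illustrative. The sketch confirms that the string is valid UTF-8, then relabels the same bytes as IBM862 to see what such a console would display.

```ruby
# encoding: UTF-8

# The Hebrew keyword from the question, held as a UTF-8 string.
text = "שפות תכנות"

# Ruby's view of the data: the encoding label and whether the bytes are valid.
puts text.encoding
puts text.valid_encoding?

# Assumption: the console interprets output bytes as code page 862 (IBM862).
# Relabel a copy of the same bytes as IBM862 without changing them, then
# transcode to UTF-8 to see the characters such a console would show.
mojibake = text.dup.force_encoding("IBM862").encode("UTF-8")
puts mojibake
```

If the last line matches the gibberish in the question, the bytes Ruby writes are correct UTF-8 and only the console's interpretation is wrong. In that case, a possible workaround is to transcode before printing, for example with text.encode("IBM862"), or to write the output to a file and open it in a UTF-8-aware editor.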
