Problem reading XML with Nokogiri

My Ruby script is supposed to read an XML document from a URL and check it for well-formedness, returning any errors. I have a sample bad XML document hosted with the following text (from the Nokogiri tutorial):

<?xml version="1.0"?>
  <root>
    <open>foo
      <closed>bar</closed>
  </root>

My test script is as follows (url refers to the above XML file hosted on my personal server):

require 'nokogiri'

document = Nokogiri::XML(url) 

puts document
puts document.errors

The output is:

<?xml version="1.0"?>
Start tag expected, '<' not found

Why is it only capturing the first line of the XML file? It does this even with known-good XML files.


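It is trying to parse the URL string itself, not the content at that URL. The first parameter to Nokogiri::XML must be a string containing the document, or an IO object, because Nokogiri::XML is just a shortcut to Nokogiri::XML::Document.parse, as the Nokogiri documentation states. The line you see in the output is not the first line of your file: the parse failed, so the document is empty, and printing an empty document produces only the XML declaration.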

EDIT: To read from a URI:

require 'open-uri'
# On Ruby 3.0 and later, open-uri no longer patches Kernel#open, so call URI.open.
URI.open(uri).read


I'm not too sure what code you are using to actually output the contents of the XML. I only see error printing code. However, I have posted some sample code to effectively move through XML with Nokogiri below:

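<items>
  <item>
    Something
  </item>
  <item>
    Else
  </item>
</items>

require 'nokogiri'
require 'open-uri'

# URI.open returns an IO-like object that Nokogiri reads directly.
doc = Nokogiri::XML(URI.open(url))
set = doc.xpath('//item')
# Print each item's text with the surrounding whitespace stripped.
set.each { |item| puts item.text.strip }
  #=> Something
  #=> Else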

In general, the Nokogiri tutorial should help you.


If you already have the XML as a Nokogiri document, make sure you call .to_s on it before passing it to Nokogiri::XML.

For example: xml = Nokogiri::XML(existing_nokogiri_xml_doc.to_s)
