Problem reading XML with Nokogiri
My Ruby script is supposed to read an XML document from a URL and check it for well-formedness, returning any errors. I have a sample bad XML document hosted with the following text (from the Nokogiri tutorial):
<?xml version="1.0"?>
<root>
<open>foo
<closed>bar</closed>
</root>
My test script is as follows (url refers to the above XML file hosted on my personal server):
require 'nokogiri'
document = Nokogiri::XML(url)
puts document
puts document.errors
The output is:
<?xml version="1.0"?>
Start tag expected, '<' not found
Why is it only capturing the first line of the XML file? It does this even with known good XML files.
It is trying to parse the URL string itself, not the content at that URL. Keep in mind that the first parameter to Nokogiri::XML must be a string containing the document or an IO object, since it is just a shortcut to Nokogiri::XML::Document.parse, as the Nokogiri documentation states.
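EDIT: To read from a URI:
require 'open-uri'
# URI.open (Ruby 2.5+) fetches the content; plain open(uri) stopped handling URLs in Ruby 3.0.
document = Nokogiri::XML(URI.open(uri).read)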
I'm not too sure what code you are using to actually output the contents of the XML. I only see error printing code. However, I have posted some sample code to effectively move through XML with Nokogiri below:
<items>
<item>
Something
</item>
<item>
Else
</item>
</items>
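require 'open-uri'
# URI.open returns an IO-like object that Nokogiri can read directly
doc = Nokogiri::XML(URI.open(url))
set = doc.xpath('//item')
# item.text gives the element's text content; item.to_s would print the markup too
set.each { |item| puts item.text.strip }
#=> Something
#=> Else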
In general, the tutorial here should help you.
If you already have a Nokogiri XML document, make sure you call '.to_s' on it before passing it to the XML function.
For example: xml = Nokogiri::XML(existing_nokogiri_xml_doc.to_s)