Parsing XML with Nokogiri
I cannot figure out how to parse the "author" and "fact" tags out of the following XML. If the formatting looks strange, here is a link to the XML doc.
<response stat="ok">
  <ltml version="1.1">
    <item id="5403381" type="work">
      <author id="21" authorcode="rowlingjk">J. K. Rowling</author>
      <url>http://www.librarything.com/work/5403381</url>
      <commonknowledge>
        <fieldList>
          <field type="42" name="alternativetitles" displayName="Alternate titles">
            <versionList>
              <version id="3413291" archived="0" lang="eng">
                <date timestamp="1298398701">Tue, 22 Feb 2011 13:18:21 -0500</date>
                <person id="18138">
                  <name>ablachly</name>
                  <url>http://www.librarything.com/profile/ablachly</url>
                </person>
                <factList>
                  <fact>Harry Potter and the Sorcerer's Stone </fact>
                </factList>
              </version>
            </versionList>
          </field>
So far I have tried this code to get the author but it does not work:
@xml_doc = Nokogiri::XML(open("http://www.librarything.com/services/rest/1.1/?method=librarything.ck.getwork&isbn=0590353403&apikey=d231aa37c9b4f5d304a60a3d0ad1dad4"))
@xml_doc.xpath('//response').each do |n|
  @author = n
end
I couldn't get at any nodes deeper than //response using the link you provided. I ended up using Nokogiri::XML::Reader and pushing elements into a hash, since there may be multiple authors, and there are definitely multiple facts. You can use whatever data structure you like, but this gets the content of the fact and author tags:
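require 'nokogiri'
require 'open-uri'

url = "http://www.librarything.com/services/rest/1.1/?method=librarything.ck.getwork&isbn=0590353403&apikey=d231aa37c9b4f5d304a60a3d0ad1dad4"
# URI.open comes from open-uri; Kernel#open no longer accepts URLs as of Ruby 3.0
reader = Nokogiri::XML::Reader(URI.open(url))

# One array per tag name, since a tag may occur more than once
book = {
  author: [],
  fact: []
}

reader.each do |node|
  book.each_key do |k|
    # Only record nodes whose name matches a key and that have inner content
    if node.name == k.to_s && !node.inner_xml.empty?
      book[k] << node.inner_xml
    end
  end
end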
You could try:
nodes = @xml_doc.xpath("//xmlns:author", "xmlns" => "http://www.librarything.com/")
puts nodes[0].inner_text
nodes = @xml_doc.xpath("//xmlns:fact", "xmlns" => "http://www.librarything.com/")
nodes.each do |n|
  puts n.inner_text
end
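The trick is in the namespace: the elements below <response> appear to live in the http://www.librarything.com/ default namespace, so an unprefixed XPath such as //author will not match them.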