What does it take to get the "LyricArtist" from this XML feed using Nokogiri?
First, the XML: http://api.chartlyrics.com/apiv1.asmx//GetLyric?lyricId=90&lyricCheckSum=9600c891e35f602eb6e1605fb7b5229e
doc = Nokogiri::XML(open("http://api.chartlyrics.com/apiv1.asmx//GetLyric?lyricId=90&lyricCheckSum=9600c891e35f602eb6e1605fb7b5229e"))
This successfully grabs the document content.
After this point I am unable to get inside and grab data, and I am not sure why.
For example, I would expect:
doc.xpath("//LyricArtist")
To kick back the artist but it does not.
I have tried the same thing with other feeds, such as the default RSS feed that any WordPress installation provides, and if I do something like:
doc.xpath("//link")
I get a list of all the "links".
I am definitely missing something and would love your input. Thank you!
The XML elements are namespace-qualified and bound to http://api.chartlyrics.com/.
If you view the XML you will notice the document element has a default namespace declared:
<GetLyricResult xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://api.chartlyrics.com/">
To match an element that is bound to a namespace, you either need to declare a namespace prefix bound to that URI and use that prefix in your XPath expression, or use an XPath expression that ignores namespaces or matches on something other than the qualified name.
You can match on elements and then use local-name() to match the element name, regardless of the declared namespace.
//*[local-name()='LyricArtist']
If you want to be more exact, you can use local-name() to match the element name and namespace-uri() to match the declared namespace.
//*[local-name()='LyricArtist' and namespace-uri()='http://api.chartlyrics.com/']
The second example would prevent matching on elements with the same local-name() that are bound to different namespaces. That might not be a problem in this specific instance, but it is something you should be aware of. Namespaces are used to uniquely qualify nodes and allow different vocabularies to use the same "name" for something without worrying about a conflict.
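The unprefixed XPath fails because of the default namespace declared on the root element. One workaround is to strip the XML declaration and replace the root's opening tag with one that has no namespace declarations before parsing:
require 'nokogiri'
require 'open-uri'
uri = "http://api.chartlyrics.com/apiv1.asmx//GetLyric?LyricId=90&lyricCheckSum=9600c891e35f602eb6e1605fb7b5229e"
x = URI.open(uri).read
# The first sub drops the XML declaration; the second replaces the root's opening tag.
x = x.sub(/<.*?>/,'').sub(/<.*?>/,'<GetLyricResult>')
doc = Nokogiri::XML(x)
puts doc.xpath('//LyricArtist').text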