
Can Nokogiri search for "?xml-stylesheet" tags?

I need to find the XML stylesheet declaration in a document and read its href:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/templates/xslt/inspections/disclaimer_en.xsl"?>

Using Nokogiri I tried:

doc.search("?xml-stylesheet").first['href']

but I get the error:

`on_error': unexpected '?' after '' (Nokogiri::CSS::SyntaxError)


Nokogiri's CSS selectors cannot match XML processing instructions. If you know where the instruction sits in the document, you can reach it by position:

doc.children[0]


This is not an XML element; it is an XML "processing instruction", which is why your query could not find it. To find it, use XPath:

# Find the first xml-stylesheet PI
xss = doc.at_xpath('//processing-instruction("xml-stylesheet")')

# Find every xml-stylesheet PI
xsss = doc.xpath('//processing-instruction("xml-stylesheet")')

Seen in action:

require 'nokogiri'
xml = <<ENDXML
  <?xml version="1.0" encoding="UTF-8"?>
  <?xml-stylesheet type="text/xsl" href="/templates/disclaimer_en.xsl"?>
  <root>Hi Mom!</root>
ENDXML
doc = Nokogiri.XML(xml)
xss = doc.at_xpath('//processing-instruction("xml-stylesheet")')
puts xss.name     #=> xml-stylesheet
puts xss.content  #=> type="text/xsl" href="/templates/disclaimer_en.xsl"

Since a processing instruction is not an element, it has no attributes: you cannot, for example, ask for xss['type'] or xss['href']. To read those values you need to parse the content as if it were an element. One way to do this is:

class Nokogiri::XML::ProcessingInstruction
  def to_element
    document.parse("<#{name} #{content}/>")
  end
end

p xss.to_element['href'] #=> "/templates/disclaimer_en.xsl"
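
If you would rather not reopen a Nokogiri class, a rough alternative is to split the content string yourself. The sketch below assumes the content follows the usual name="value" pseudo-attribute layout; pseudo_attributes is a hypothetical helper, not part of Nokogiri, and it does not decode entity references such as &amp;amp;.

```ruby
# Hypothetical helper (not a Nokogiri API): turn processing-instruction
# content such as
#   type="text/xsl" href="/templates/disclaimer_en.xsl"
# into a Hash mapping pseudo-attribute names to their values.
def pseudo_attributes(content)
  # Each match yields [name, double-quoted value, single-quoted value];
  # exactly one of the two value captures is non-nil.
  pairs = content.scan(/([\w:.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/)
  pairs.map { |name, dq, sq| [name, dq || sq] }.to_h
end

attrs = pseudo_attributes('type="text/xsl" href="/templates/disclaimer_en.xsl"')
puts attrs['href']  #=> /templates/disclaimer_en.xsl
puts attrs['type']  #=> text/xsl
```

With the Nokogiri example above, you would pass xss.content as the argument.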

Note that a bug in Nokogiri or libxml2 causes the XML declaration to appear in the document as a processing instruction if at least one character (even a space) precedes <?xml. This is why the queries above search specifically for processing instructions named xml-stylesheet.
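
One way to sidestep the issue is to make sure nothing precedes the declaration before the string reaches the parser. A minimal sketch, assuming Ruby 2.3 or later for the squiggly heredoc (<<~), which strips the common leading indentation; for strings that come from elsewhere, String#lstrip does the same job. The variable names here are illustrative only.

```ruby
# The squiggly heredoc removes the shared two-space indentation,
# so the XML declaration starts at the very first character.
xml = <<~ENDXML
  <?xml version="1.0" encoding="UTF-8"?>
  <?xml-stylesheet type="text/xsl" href="/templates/disclaimer_en.xsl"?>
  <root>Hi Mom!</root>
ENDXML

puts xml.start_with?('<?xml')      #=> true

# For input you do not control, trim leading whitespace explicitly.
raw     = "  \n<?xml version=\"1.0\"?><root/>"
cleaned = raw.lstrip
puts cleaned.start_with?('<?xml')  #=> true
```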

Edit: The XPath expression processing-instruction()[name()="foo"] is equivalent to the expression processing-instruction("foo"). As described in the XPath 1.0 spec:

The processing-instruction() test may have an argument that is Literal; in this case, it is true for any processing instruction that has a name equal to the value of the Literal.

I've edited the answer above to use the shorter format.
