
Parsing out the HTML doctype tag in Nokogiri

How can I parse out the doctype tag to get the HTML version from an HTML file?

Trying to use doctype (or DOCTYPE or !DOCTYPE) as an argument in xpath raises an invalid expression error.


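The doctype is not part of the document, but part of its DTD. XPath has no node type for the doctype declaration, so an XPath expression cannot select it. Nokogiri exposes it as the document's internal subset: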

require 'rubygems'
require 'nokogiri'

html = <<EOF
<!DOCTYPE foo PUBLIC "bar" "qux">
<html>
</html>
EOF

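doc = Nokogiri::HTML(html)

# internal_subset returns the DTD object built from the DOCTYPE declaration.
puts doc.internal_subset.name        # name given in the DOCTYPE ("foo")
puts doc.internal_subset.external_id # public identifier ("bar")
puts doc.internal_subset.system_id   # system identifier ("qux")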
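
To get from those values to an HTML version, one option is to read the version out of the public identifier. The following is only a sketch: `html_version` is a hypothetical helper, not part of Nokogiri, and it assumes the public identifier follows the usual W3C pattern such as "-//W3C//DTD XHTML 1.0 Strict//EN". It also assumes that a doctype named html with no public identifier (as in <!DOCTYPE html>) should be reported as HTML5.

```ruby
# Hypothetical helper (not part of Nokogiri): derive a readable HTML version
# from the name and public identifier of a DOCTYPE declaration.
def html_version(name, public_id)
  if public_id.nil? || public_id.empty?
    # Assumption: <!DOCTYPE html> with no public identifier is the HTML5 form.
    return name.to_s.casecmp('html').zero? ? 'HTML5' : nil
  end

  # W3C public identifiers look like "-//W3C//DTD XHTML 1.0 Strict//EN";
  # capture the text between "//DTD " and the next "//".
  match = public_id.match(%r{//DTD\s+(.+?)//})
  match && match[1]
end

puts html_version('html', '-//W3C//DTD HTML 4.01//EN')
puts html_version('html', '-//W3C//DTD XHTML 1.0 Strict//EN')
puts html_version('html', nil)
```

With a parsed document, the arguments would come from doc.internal_subset.name and doc.internal_subset.external_id. The helper returns nil for identifiers it does not recognise, such as the "bar" used in the example above.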