How to use Nokogiri's xpath and at_xpath methods

2022-12-17 16:41

I'm learning how to use Nokogiri, and a few questions came up based on this code:

require 'rubygems'
require 'mechanize'

post_agent = Mechanize.new  # was WWW::Mechanize.new in old Mechanize releases
post_page = post_agent.get('http://www.vbulletin.org/forum/showthread.php?t=230708')

puts "\nabsolute path with tbody gives nil"
puts post_page.parser.xpath('/html/body/div/div/div/div/div/table/tbody/tr/td/div[2]').xpath('text()').to_s.strip.inspect

puts "\n.at_xpath gives an empty string"
puts post_page.parser.at_xpath("//div[@id='posts']/div/table/tr/td/div[2]").at_xpath('text()').to_s.strip.inspect

puts "\ntwo lines solution with .at_xpath gives an empty string"
rows = post_page.parser.xpath("//div[@id='posts']/div/table/tr/td/div[2]")
puts rows[0].at_xpath('text()').to_s.strip.inspect


puts
puts "two lines working code"
rows = post_page.parser.xpath("//div[@id='posts']/div/table/tr/td/div[2]")
puts rows[0].xpath('text()').to_s.strip

puts "\none line working code"
puts post_page.parser.xpath("//div[@id='posts']/div/table/tr/td/div[2]")[0].xpath('text()').to_s.strip

puts "\nanother one line code"
puts post_page.parser.at_xpath("//div[@id='posts']/div/table/tr/td/div[2]").xpath('text()').to_s.strip

puts "\none line code with full path"
puts post_page.parser.xpath("/html/body/div/div/div/div/div/table/tr/td/div[2]")[0].xpath('text()').to_s.strip

1. Is it better to use // or / in XPath? @AnthonyWJones says that "the use of an unprefixed //" is not such a good idea.
2. I had to remove tbody from every working XPath, otherwise I got a nil result. How is it possible to remove an element from the XPath and still get things to work?
3. Do I have to use xpath twice to extract data if I'm not using a full XPath?
4. Why can't I make at_xpath work to extract data? It works nicely in "How do I parse an HTML table with Nokogiri?". What is the difference?

1. // is shorthand for /descendant-or-self::node()/, so it looks at every node at every level below the context node. That makes it much more expensive than /, which only steps to direct children.
2. You can use * as a placeholder that matches any element at that step.
3. No. You can make one XPath query, get the element, then call Nokogiri's text method on the node.
4. Sure you can. Have a look at "What is the absolutely cheapest way to select a child node in Nokogiri?" and my benchmark file. You will see an example of at_xpath.

I noticed you often use the text() expression. This is not required with Nokogiri: you can retrieve the node and then call the text method on it. It's much less expensive.

Also keep in mind Nokogiri supports CSS selectors. They can be easier if you are working with HTML pages.
