Find the largest table in a page
Could someone tell me how I can find the largest table in a web page (i.e. the one with the most rows) using Nokogiri? Can this be done using lambda functions?
biggest_table = doc.xpath('//table').max_by do |table|
  table.xpath('.//tr').length
end
Or, in case there's a tie, perhaps you want a list of all tables with the most rows:
# Hash mapping number of rows to array of table nodes
tables = doc.xpath('//table').group_by{ |t| t.xpath('.//tr').length }
# Array of all (possibly only 1) tables with the most rows
biggest_n = tables[tables.keys.max]
This might not be what you are looking for, but you can do this easily in the browser using jQuery or Prototype.
tables = @doc.xpath('//table')
tr_count = tables.map{|n| n.xpath('tr|*/tr').length}
biggest_table = tables[tr_count.index(tr_count.max)]