Strip all tbody tags without destroying their children

This Ruby code, using Nokogiri:

doc.xpath("//tbody").remove

removes the children of the <tbody> (as well as the <tbody> themselves). I only want to remove all <tbody> tags from the document, leaving their children in place. How can I achieve this?


require 'nokogiri'

html = Nokogiri::HTML(DATA)
html.xpath('//table/tbody').each do |tbody|
  # Re-parent each child onto the table, then drop the now-empty tbody
  tbody.children.each do |child|
    child.parent = tbody.parent
  end
  tbody.remove
end

puts html.xpath('//table').to_s

__END__
<table border="0" cellspacing="5" cellpadding="5"><tbody>
<tr><td>Data</td></tr>
<tr><td>Data2</td></tr>
<tr><td>Data3</td></tr>
</tbody></table>

prints

<table border="0" cellspacing="5" cellpadding="5">
<tr><td>Data</td></tr>
<tr><td>Data2</td></tr>
<tr><td>Data3</td></tr>
</table>


You want to replace each tbody with its children? Then that's all you need to say:

require 'nokogiri'
html = Nokogiri::HTML.fragment(DATA.read)
html.css('tbody').each{ |tbody| tbody.replace tbody.children }
puts html

__END__
<table><tbody>
  <tr><td>Data</td></tr>
  <tr><td>Data2</td></tr>
</tbody><tbody>
  <tr><td>Data3</td></tr>
</tbody></table>

Producing:

<table>
<tr><td>Data</td></tr>
<tr><td>Data2</td></tr>
<tr><td>Data3</td></tr>
</table>