How does one remove <![CDATA[ ]]> tags from around text in XML using Hpricot?

I just want the text out of there without those tags. Does Hpricot.XML have any methods for this?


Use element.inner_text instead of #inner_html, and it removes them for you.


# Replace each CDATA node with its plain text content
doc.search("*") do |element|
    element.swap element.content if element.kind_of? Hpricot::CData
end
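
If the XML is only at hand as a raw string, roughly the same unwrapping can be sketched in plain Ruby without Hpricot. This is a regex-based substitute, not an Hpricot feature: `strip_cdata` and `CDATA_PATTERN` are hypothetical names made up for this example, and it assumes the unwrapped text does not need to be re-escaped afterwards.

```ruby
# Hypothetical helper (not part of Hpricot): unwraps CDATA sections in a raw XML string.
# Assumption: the caller only wants the text. The unwrapped content is NOT re-escaped,
# so the result may stop being well-formed XML if the text contains "<" or "&".
CDATA_PATTERN = /<!\[CDATA\[(.*?)\]\]>/m

def strip_cdata(xml)
  # Non-greedy match, so each CDATA section is unwrapped on its own.
  xml.gsub(CDATA_PATTERN) { Regexp.last_match(1) }
end

puts strip_cdata("<tease_txt><![CDATA[Teen catches 800-pound gator]]></tease_txt>")
# prints: <tease_txt>Teen catches 800-pound gator</tease_txt>
```

For anything beyond a quick cleanup, going through the parser (inner_text, or the swap above) is probably the safer route, since a regex knows nothing about the rest of the document.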


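require 'hpricot'
require 'open-uri'  # lets open() fetch a URL

doc = Hpricot::XML(open('http://www.cnn.com/.element/ssi/www/auto/2.0/video/xml/most_popular.xml'))
(doc/:cnn_video/:video).each do |status|
  ['tease_txt'].each do |el|
    # inner_text returns the text without the CDATA wrapper
    puts status.at(el).inner_text
  end
end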

Example output (looks spammy but this is not spam!):

New Reno air crash video shows impact
Teen catches 800-pound gator
Resuming careers post 'don't ask' repeal
Creepy skirt peepers
Bus-sized satellite to hit Earth thi ...
'DWTS' cast hits ballroom for first time
What caused trainer's death at SeaWorld?
What led to Troy Davis clemency denial?
