How do I get embedded elements in text with nokogiri?

Here is my HTML:

<div class="cardtextbox"><i>(<img src="/Handlers/Image.ashx?size=small&amp;name=BR&amp;type=symbol" alt="Black or Red" align="absbottom" /> can be paid with either <img src="/Handlers/Image.ashx?size=small&amp;name=B&amp;type=symbol" alt="Black" align="absbottom" /> or <img src="/Handlers/Image.ashx?size=small&amp;name=R&amp;type=symbol" alt="Red" align="absbottom" />.)</i></div><div class="cardtextbox"><img src="/Handlers/Image.ashx?size=small&amp;name=3&amp;type=symbol" alt="3" align="absbottom" /><img src="/Handlers/Image.ashx?size=small&amp;name=B&amp;type=symbol" alt="Black" align="absbottom" />, Discard a card: Target creature gets -2/-2 until end of turn.</div><div class="cardtextbox"><img src="/Handlers/Image.ashx?size=small&amp;name=3&amp;type=symbol" alt="3" align="absbottom" /><img src="/Handlers/Image.ashx?size=small&amp;name=R&amp;type=symbol" alt="Red" align="absbottom" />: Put a 2/1 red Goblin creature token with haste onto the battlefield. Exile it at the beginning of the next end step.</div></div>

And what I'd like to get:

[
 ["(B/R can be paid with either B or R.)"],
 ["3 B, Discard a card", "Target creature gets -2/-2 until end of turn"],
 ["3 R",                 "Put a 2/1 red Goblin creature token with haste onto the battlefield. Exile it at the beginning of the next end step."]
]

The mapping from Red => R is done via colorhash. The "Red" comes from the alt attribute of the img tag.


I'm not sure what colorhash is, or how it's relevant to this question, but here's something that should get you close. If you really want a nested array of answers, you'll have to define a recursive function to process a node and determine for yourself if a node is a leaf or not.

require 'nokogiri'
colorhash = {
  'Red'          => 'R',
  'Black'        => 'B',
  'Black or Red' => 'B/R'
}

# html_from_question is a String holding the HTML shown in the question
h = Nokogiri::HTML(html_from_question)

# Replace all images with alt text, possibly transformed
h.xpath('//img[@alt]').each{ |i| i.swap( colorhash[i['alt']] || i['alt']) }

require 'pp'
pp h.css('.cardtextbox').map(&:text)

#=> ["(B/R can be paid with either B or R.)",
#=>  "3B, Discard a card: Target creature gets -2/-2 until end of turn.",
#=>  "3R: Put a 2/1 red Goblin creature token with haste onto the battlefield. Exile it at the beginning of the next end step."]
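
If a flat list of strings is not enough, one possible way to get closer to the nested array in the question is to split each line at its first ": " into a cost part and an effect part. This is only a sketch: split_ability is a hypothetical helper name, the input lines are copied from the output above, and it assumes the first ": " always separates cost from effect.

```ruby
# Lines as produced by the snippet above (copied from its output).
lines = [
  "(B/R can be paid with either B or R.)",
  "3B, Discard a card: Target creature gets -2/-2 until end of turn.",
  "3R: Put a 2/1 red Goblin creature token with haste onto the battlefield. Exile it at the beginning of the next end step."
]

# Hypothetical helper: split at the first ": " only, so a line
# without a colon stays a one-element array.
def split_ability(line)
  line.split(': ', 2)
end

nested = lines.map { |line| split_ability(line) }
```

The limit of 2 keeps any later colons inside the effect text. Note that this keeps "3B" without a space and keeps the trailing period, so it does not match the desired output character for character.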