
Ruby Regex Help

I know a little bit of regex, but not much. What is the best way to get just the number out of the following HTML? (I want 32 returned.) The values of width, rowspan, and size all vary across this horrible HTML page. Any help?

<td width=14 rowspan=2 align=right><font size=2 face="helvetica">32</font></td>


How about

>(\d+)<

Or, if you desperately want to avoid using capturing groups at all:

(?<=>)\d+(?=<)
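
A minimal sketch of how either pattern could be applied in Ruby, assuming the markup is already held in a string. The variable `html` and the helper names `extract_with_group` and `extract_with_lookaround` are made up for illustration.

```ruby
# Sample markup from the question; `html` is an illustrative variable name.
html = '<td width=14 rowspan=2 align=right><font size=2 face="helvetica">32</font></td>'

# Capturing-group version: String#[] with a group index returns
# the text of that group, or nil when nothing matches.
def extract_with_group(html)
  captured = html[/>(\d+)</, 1]
  captured && captured.to_i
end

# Lookaround version: the whole match is just the digits,
# so no group index is needed.
def extract_with_lookaround(html)
  matched = html[/(?<=>)\d+(?=<)/]
  matched && matched.to_i
end

puts extract_with_group(html)
puts extract_with_lookaround(html)
```

Both helpers return `nil` rather than raising when the string contains no digits between `>` and `<`, which may be convenient when looping over many rows of the page.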


Please, do yourself a favor and use an HTML parser instead:

#!/usr/bin/env ruby
require 'nokogiri'

require 'test/unit'
class TestExtraction < Test::Unit::TestCase
  def test_that_it_extracts_the_number_correctly
    doc = Nokogiri::HTML('<td width=14 rowspan=2 align=right><font size=2 face="helvetica">32</font></td>')
    assert_equal [32], (doc / '//td/font').map {|el| el.text.to_i }
  end
end


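Maybe something like this, with a capturing group around the digits so that only the number is returned:

<td[^>]*><font[^>]*>(\d+)</font></td>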
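
A sketch of how a tag-anchored pattern like this might be used in Ruby, with a capturing group around `\d+` so that only the number comes back. The constant `CELL_NUMBER` and the method `cell_numbers` are hypothetical names, and the sketch assumes each cell has the same `td` > `font` nesting as the sample, with no whitespace between the tags.

```ruby
# Tag-anchored pattern; %r{} avoids escaping the slashes in the closing tags.
CELL_NUMBER = %r{<td[^>]*><font[^>]*>(\d+)</font></td>}

# Returns every number found in cells of that shape, as integers.
# String#scan yields one array per match when the pattern has a group,
# so the result is flattened before conversion.
def cell_numbers(html)
  html.scan(CELL_NUMBER).flatten.map(&:to_i)
end

row = '<td width=14 rowspan=2 align=right><font size=2 face="helvetica">32</font></td>' \
      '<td width=20 align=right><font size=1 face="helvetica">7</font></td>'
p cell_numbers(row)
```

Compared with the shorter `>(\d+)<`, this pattern skips digits that sit between other tags, at the cost of breaking if the page's markup changes shape.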