
In Ruby on Rails, why does a script match /something([^<]+)/ correctly when run directly, but also capture the </td> when run through script/runner?

I tried a simple script with

arr = data.scan /<td>([^<]+)/

and arr is filled with the data between <td> and </td> when the script is run using

ruby try.rb

but when it is run using

ruby script/runner app/try.rb

so that it runs just as it would inside script/console, there is an extra </td> attached to the matched data. Why would that be? This is Ruby 1.8.7 with Rails 2.3.8. Could it be due to Unicode handling in the app environment, or something else?
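For reference, here is a minimal sketch of the result the pattern is expected to give. The sample markup is invented for illustration and is not the original data.

```ruby
# Hypothetical sample input; the real contents of data are not shown in the question.
data = "<table><tr><td>alpha</td><td>beta</td></tr></table>"

# With one capture group, scan returns an array of one-element arrays.
# [^<]+ stops at the first "<", so the closing </td> should never be captured.
arr = data.scan(/<td>([^<]+)/)

# Flatten to get the cell texts as plain strings.
cells = arr.flatten

p arr    # [["alpha"], ["beta"]]
p cells  # ["alpha", "beta"]
```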


I would leave this as a comment, since it doesn't really answer the question, but I'm new here and don't have the reputation to comment yet, so please excuse me.

I mocked up the setup with Ruby 1.8.7 and a fully functional app on Rails 2.3.8, and both times I got the proper output without the trailing </td> you mention. Now I am curious about what is in data. I used a generic table in a pretty simple HTML document, and it works as it should.
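One way to see what is actually in data is to dump each match together with its raw bytes under both invocations and compare the output. The sketch below assumes nothing about the real input; the helper name describe_matches and the sample string are made up for illustration.

```ruby
# Hypothetical helper: reports each captured cell, its raw bytes, and
# whether a closing tag leaked into the capture.
def describe_matches(data)
  data.scan(/<td>([^<]+)/).flatten.map do |cell|
    {
      :text            => cell,
      :bytes           => cell.unpack("C*"),     # byte values, independent of encoding settings
      :has_closing_tag => cell.include?("</td>") # should be false if [^<] behaves as expected
    }
  end
end

# Invented sample input, standing in for the real data.
report = describe_matches("<td>ab</td><td>cd</td>")
report.each { |entry| p entry }
```

Running the same file with ruby try.rb and with ruby script/runner app/try.rb and diffing the printed bytes should show whether the input or the regex behaviour differs between the two environments.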

One last thing worth mentioning: is using a regex to parse HTML a good idea? I have never needed to, but hpricot looks pretty neat for just that sort of thing: http://github.com/hpricot/hpricot.

Hope this helps at least a little.
