How can I extract HTML escape chars/entities as text when scraping the web? (Ruby & Nokogiri)

2022-12-17 02:57

In my ruby+mechanize(nokogiri) script I use this piece of code:

row.at_xpath('td[3]/div[1]/a/text()').to_s.strip

on a forum where the post title html looks like:

<a href="showthread.php?t=233891" >&lt;/body&gt; on Footer ?</a>

and I receive this string from XPath: &lt;/body&gt; on Footer ?

I would like to get what I see in the web browser: </body> on Footer ?

How can I do that for all html escape characters/entities?

Please take a look at this post on how to unescape HTML entities.

There is a Ruby gem called htmlentities.
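
As a rough illustration of the unescaping step, here is a minimal sketch that uses only Ruby's standard library (CGI.unescapeHTML) instead of the htmlentities gem, so it runs without installing anything. The variable names and the hard-coded input are assumptions made for the example; in the real script the input would be the string returned by the XPath call.

```ruby
require 'cgi'

# Hypothetical input: the escaped title string, as returned by
# row.at_xpath('td[3]/div[1]/a/text()').to_s.strip in the question.
escaped_title = '&lt;/body&gt; on Footer ?'

# CGI.unescapeHTML decodes &lt; &gt; &amp; &quot; and numeric
# character references such as &#39;.
plain_title = CGI.unescapeHTML(escaped_title)

puts plain_title
```

CGI.unescapeHTML only covers that small set of entities, so names like &nbsp; or &eacute; pass through unchanged. For those, the htmlentities gem mentioned above should work through HTMLEntities.new.decode(escaped_title). It may also be enough to select the a element itself and call .text on it rather than .to_s on the text() node, since Nokogiri's text returns the decoded content.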
