I think I need a combo of Hpricot and regex here. I need to search for 'a' tags with an 'href' attribute that starts with 'abc/', and return the text following that until the next forward slash.
I'm using Hpricot's CSS search to identify a table within a web page. Here's a sample HTML snippet I'm parsing:
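The regex half of this can be sketched on its own, assuming the href values have already been pulled out of the 'a' tags as plain strings. The helper name `segment_after_abc` and the sample hrefs are hypothetical, chosen only for illustration:

```ruby
# Hypothetical helper: returns the path segment that follows "abc/" in an
# href, or nil when the href does not start with "abc/" (or nothing follows).
def segment_after_abc(href)
  # \A anchors at the start of the string; [^/]+ captures up to the next slash.
  match = href.match(%r{\Aabc/([^/]+)})
  match && match[1]
end

# Made-up sample data, not taken from the original question.
hrefs = ["abc/widgets/42", "abc/gadgets", "xyz/abc/nope", "abc/"]
segments = hrefs.map { |href| segment_after_abc(href) }
```

Only hrefs that begin with `abc/` and have at least one character after it yield a segment; everything else maps to nil.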
In our application we have different themes and each theme has its own default content in the following structure:
I'd like to slurp the following data about historical inventions into a convenient Ruby data structure:
I just want the text out of there without those tags. Does Hpricot.XML have any methods for this? Use element.inner_text instead of #inner_html and it removes them for you. doc.search("
I'm trying to get the largest image off a page I parse with Hpricot and am not having any luck. How do I access the width and height attributes of an img tag with it? It is possible, pr
How do I do it? E.g., <span class="selected" id="hi">HELLO</span>
I am getting the following encoding error when trying to scrape web pages with Hpricot in Ruby 1.9: Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8
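One common way around this kind of error is to relabel the downloaded bytes before mixing them with UTF-8 strings. The sketch below uses only core Ruby and assumes the page really is UTF-8 encoded; the variable names and the sample bytes are made up for illustration, and no Hpricot call is involved:

```ruby
# Simulate a response body tagged as binary (ASCII-8BIT). The bytes C3 A9
# are the UTF-8 encoding of "é"; this sample is an assumption, not real data.
body = "caf\xC3\xA9".dup.force_encoding("ASCII-8BIT")

# Relabel the same bytes as UTF-8 without transcoding them. This is only
# correct if the page is actually UTF-8; check its headers or meta tag first.
utf8_body = body.dup.force_encoding("UTF-8")

# The relabeled string now combines with other UTF-8 strings as UTF-8.
combined = "Title: " + utf8_body
```

If the page is in a different encoding, `String#encode` with the correct source encoding is the safer choice, since `force_encoding` changes only the label and never the bytes.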
I'm trying to scrape a page but the initial response has nothing in the body as the content is pumped in asynchronously, e.g. the results from a search on the apple website: http://www.ap
I'm using Hpricot to parse an HTML page, but need to get the computed styles for each element. For example, if I have an h1 Hpricot element and the external CSS for the page has a backg