Web Scraping with Nokogiri::HTML and Ruby - How do you handle when what you are looking for isn't there?
I've got a script that works for 99% of the pages I want to scrape, but a few of them don't contain what I'm looking for, and my script errors out with:
undefined method `attribute' for nil:NilClass (NoMethodError)
The code is a bit ugly from fiddling around and debugging but here is what I am doing. The error is on the third line and is simply because in the error cases there is no .entry-content img:
doc = Nokogiri::HTML(open(url))
image_link = doc.css(".entry-content img")
temp = image_link.attribute('src').to_s
How can I detect this case and handle it, given that the image_link returned by Nokogiri isn't nil (css returns an empty NodeSet when nothing matches)?
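require 'nokogiri'
require 'open-uri'

# URI.open replaces the bare open(url), which no longer fetches URLs on Ruby 3.0+
doc = Nokogiri::HTML(URI.open(url))
# at_css returns the first matching node, or nil when nothing matches
if image_link = doc.at_css(".entry-content img")
  temp = image_link['src']
else
  # Whatever else
end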
Alternatively, you could use an XPath selector to get the attribute value directly:
doc = Nokogiri::HTML('<div class="entry-content"><img src="bar"></div>')
src = doc.at_xpath('//*[@class="entry-content"]//img/@src').to_s
# src is "bar"; if the html didn't have such an item, it would be "" (nil.to_s)