Rails 3 pulling data from another site

I have a client request on one of my projects: they want to be able to enter a URL and have the app pull in some information from the site whose URL they entered and save it in the database.

So the user enters http://www.example.com/2342342, and my controller visits that site, gets the content of the first <h1>Tag</h1> on the page, and saves it in the database. Is this possible? If so, how would I go about doing it? Would I use some Rails commands to do it, or something else, like jQuery?


Nokogiri is a great parser and, combined with open-uri, can work directly from a URL.

So there are two steps:

  1. Instantiate a Nokogiri document from the page at that URL

  2. Parse the HTML page to get what you expect

Find instructions here: http://nokogiri.org/tutorials/parsing_an_html_xml_document.html

Because you'll be working with another website, keep two pieces of advice in mind:

  • wrap your requests so that you can rescue if the website is down

  • consider using an Ajax request, because fetching the remote page can take a while
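
For the first point, one way to wrap the request is shown below. This is only a sketch under some assumptions: `safe_fetch` and `FETCH_ERRORS` are names made up for illustration, the list of rescued errors is a reasonable starting set rather than an exhaustive one, and the 5-second timeout is arbitrary. It also rejects anything that is not an http or https URL, since the address comes straight from user input.

```ruby
require "open-uri"
require "timeout"
require "socket"

# Assumed set of failures worth rescuing; extend it for your own case.
FETCH_ERRORS = [
  URI::InvalidURIError,   # the string is not a valid URL
  OpenURI::HTTPError,     # 404, 500 and other HTTP error responses
  SocketError,            # DNS lookup failed
  Timeout::Error,         # connection or read took too long
  Errno::ECONNREFUSED,    # nothing is listening on that host/port
  Errno::EHOSTUNREACH     # the host cannot be reached
].freeze

# Hypothetical helper: returns the page body as a String, or nil when the
# URL is invalid, not http(s), or the remote site cannot be fetched.
def safe_fetch(url, timeout: 5)
  uri = URI.parse(url)
  return nil unless uri.is_a?(URI::HTTP)   # URI::HTTPS is a subclass of URI::HTTP
  uri.open(open_timeout: timeout, read_timeout: timeout, &:read)
rescue *FETCH_ERRORS
  nil
end
```

Returning nil keeps the controller simple: if `safe_fetch` gives nothing back, show the user an error message instead of saving a record.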


I would check out the Railscast here:

http://railscasts.com/episodes/190-screen-scraping-with-nokogiri

It explains very well how to use Nokogiri to scrape content from other sites.
