Creating an aggregate RSS feed from RSS-less search results
So, say I'm a journalist who wants some way of easily posting links to stories I've written that are published to my newspaper's website. Alas, my newspaper's website doesn't offer user-level RSS feeds (user-level anything for journalists, really).
Running a search (e.g., http://www.calgaryherald.com/search/search.html?q=Rininsland) brings up everything I've done in reverse chronological order (albeit with some duplicates; ignore those for now, I'll deal with them later). Is there any way I can parse this into an RSS feed?
It seems like Yahoo! Pipes might be an easy way to do this, but I'm open to whatever.
Thanks!
Normally this would be a great use of Yahoo Pipes, but the site serving the search page you cited appears to have a robots.txt file that disallows it, and Pipes respects robots.txt. This means that Pipes will not pull data from the page.
For more info: "How do I keep Pipes from accessing my web pages?"
http://pipes.yahoo.com/pipes/docs?doc=troubleshooting#q14
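To illustrate what "respecting robots.txt" means in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content below is hypothetical (I have not copied the newspaper's real file), and `ExampleBot` and `is_allowed` are made-up names for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the real file on the site may differ.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


search_url = "http://www.calgaryherald.com/search/search.html?q=Rininsland"
print(is_allowed(SAMPLE_ROBOTS_TXT, "ExampleBot", search_url))
```

A crawler that honours robots.txt runs a check like this before each request and skips any URL for which it returns False, which is presumably what Pipes is doing here.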
You would have to write a scraper yourself that makes an HTTP request to that URL, parses the response, and writes RSS as output. This could be done in many server-side environments such as PHP, Python, etc.
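As a rough sketch of such a scraper in Python (standard library only): it collects the links from the page, drops duplicates, and writes a bare-bones RSS 2.0 document. I don't know the actual markup of the search page, so the `path_marker` filter (keep only links whose URL contains `/story`) is purely an assumption you would need to adjust, and the names `LinkCollector`, `html_to_rss` and `build_feed` are my own. Timestamps are left out because extracting them depends on the page's markup.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen
from xml.etree import ElementTree as ET


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs for every <a> element with text."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = " ".join("".join(self._text).split())
            if title:
                self.links.append((self._href, title))
            self._href = None


def html_to_rss(html, base_url, path_marker="/story"):
    """Turn the links in an HTML page into an RSS 2.0 string.

    path_marker is an assumed filter for article URLs; adjust it to
    whatever the real search results look like.
    """
    collector = LinkCollector()
    collector.feed(html)

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Search results"
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = "Feed scraped from " + base_url

    seen = set()  # skip duplicate results, keeping the first occurrence
    for href, title in collector.links:
        url = urljoin(base_url, href)
        if path_marker not in url or url in seen:
            continue
        seen.add(url)
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = url
    return ET.tostring(rss, encoding="unicode")


def build_feed(search_url):
    """Fetch the search page and return the RSS text (makes a network request)."""
    with urlopen(search_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    return html_to_rss(html, search_url)
```

You would call `build_feed(...)` with your search URL from a cron job or a small web handler and serve the returned string as `application/rss+xml`. Because results are deduplicated by URL, the repeated entries you mentioned collapse into one item each.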
EDIT: Feedity provides a service to scrape web pages into feeds. Here is a Feedity feed of your search URL: http://feedity.com/rss.aspx/calgaryherald-com/UFJWUVZQ
However, unless you sign up for a subscription ($3.25/mo), this feed will be subject to the following constraints:
Free feeds created without an account are limited to 5 items and 10 hours update interval. Free feeds created without an account are automatically purged from our system after 30 days of inactivity.
Provided it's just links and a timestamp you want for each article, the Yahoo Pipes Search module will return the latest 10 in its search index of the Herald site.