OpenURI - 500 Internal Server Error on XML feed
I'm trying to read the Stanford ecorner XML feed:
open("http://ecorner.stanford.edu/RecentlyAdded.xml")
but am running into the following error message:
OpenURI::HTTPError: 500 Internal Server Error
from /usr/local/lib/ruby/1.8/open-uri.rb:277:in `open_http'
from /usr/local/lib/ruby/1.8/open-uri.rb:616:in `buffer_open'
from /usr/local/lib/ruby/1.8/open-uri.rb:164:in `open_loop'
from /usr/local/lib/ruby/1.8/open-uri.rb:162:in `catch'
from /usr/local/lib/ruby/1.8/open-uri.rb:162:in `open_loop'
from /usr/local/lib/ruby/1.8/open-uri.rb:132:in `open_uri'
from /usr/local/lib/ruby/1.8/open-uri.rb:518:in `open'
from /usr/local/lib/ruby/1.8/open-uri.rb:30:in `open'
from (irb):65
from :0
I believe, but I could be wrong, it's because I would need to be logged in to use the feed.
Any workaround I could use?
If you weren't logged in, you should get an HTTP response code of 401 Unauthorized, not 500. I tried opening the site in a browser, which works. It turns out their web server doesn't like requests without a User-Agent header, so if you add one, open-uri works:
>> require 'open-uri'
#=> true
>> open("http://ecorner.stanford.edu/RecentlyAdded.xml", 'User-Agent' => 'ruby')
#=> #<File:/var/folders/H9/H9qnar1yGZqBrWFGuTE0RU+++TI/-Tmp-/open-uri20110505-25566-zsc3pd-0>
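The call above returns a temp file object, so you still need to read it to get the XML text. A minimal sketch of wrapping that up, assuming a newer Ruby where URI.open is available (on Ruby 1.8, as in the question, plain open takes the same arguments). The method name fetch_feed and the default 'ruby' agent string are just illustrative choices, not anything the server requires:

```ruby
require 'open-uri'

# Hypothetical helper: fetch a URL with an explicit User-Agent header
# and return the response body as a String.
def fetch_feed(url, user_agent = 'ruby')
  # String keys in the options hash are sent as HTTP request headers.
  # URI.open exists on Ruby 2.5+; on Ruby 1.8 use open(url, ...) instead.
  URI.open(url, 'User-Agent' => user_agent) { |io| io.read }
end

# Usage (needs network access, so it is left commented out):
# xml = fetch_feed('http://ecorner.stanford.edu/RecentlyAdded.xml')
# puts xml[0, 200]
```

If the server still answers with an error status, this raises OpenURI::HTTPError just like the bare call does.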
This is working for me:
require 'open-uri'
require 'nokogiri'
doc = Nokogiri::XML(open('http://ecorner.stanford.edu/RecentlyAdded.xml'))
puts doc.search('title').map{ |n| n.text }
>> Recently Added STVP Entrepreneurship Corner Materials
>> STVP Entrepreneurship Corner
>> Podcast: Developing Products that Save Lives - Richard Scheller (Genentech)
>> Podcast: How to Build Instant Connections - Ori Brafman (Author)
>> Podcast: A New Vision for Capital Markets - Barry Silbert (SecondMarket)
>> Podcast: Effective Models for Sustainable Growth - Jennifer Morris (Conservation International)
Note that you got a 500-range error. That means their server is acting up, but is functional enough to admit the problem. If you got a 400-range error they'd be refusing you access to the content for some reason, so I doubt the problem is authentication or anything on your side.
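To see which range an error falls in from code, you can rescue OpenURI::HTTPError and look at the status it carries. A rough sketch, assuming you only care about the 4xx/5xx split described above; classify_status and describe_http_error are made-up helper names for illustration:

```ruby
require 'open-uri'

# Hypothetical helper: map an HTTP status code to the side that is
# reporting the problem.
def classify_status(code)
  case code.to_i
  when 400..499 then :client_error   # server is refusing the request
  when 500..599 then :server_error   # server failed while handling it
  else :other
  end
end

# Hypothetical helper: summarize an OpenURI::HTTPError.
# error.io.status is a pair like ["500", "Internal Server Error"].
def describe_http_error(error)
  code, message = error.io.status
  "#{code} #{message}: #{classify_status(code)}"
end

# Usage (needs network access, so it is left commented out):
# begin
#   URI.open('http://ecorner.stanford.edu/RecentlyAdded.xml').read
# rescue OpenURI::HTTPError => e
#   warn describe_http_error(e)
# end
```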