Programmatically determine whether a YouTube video has been taken down
I'm getting most of the music on Rap Exegesis from YouTube (in the form of embedded players). Unfortunately, there's always the risk that one of the videos I'm using will be taken down (due to copyright issues or whatever), thereby breaking the corresponding page on my site.
Ideally I would have a cronjob that would check (nightly say) whether any videos had been removed and notify me. What's the best way to do this?
The information you need is available via the YouTube API, specifically in the yt:state tag.
Depending on which language you are programming in, there is plenty of code around for interacting with the YouTube API.
Post here with more details if you are still having issues getting this to work.
Besides the yt:state tag, keep in mind that the uploader of a video may not allow it to be embedded. If the list of songs on the front page comes from a playlist that you maintain on YouTube, for example, you can make sure you aren't getting non-embeddable songs by including the "&format=5" parameter when retrieving your list. E.g.
http://gdata.youtube.com/feeds/api/playlists/8BCDD04DE8F771B2?v=2&format=5
Also, if you are worried about country-level restrictions, then use the "&restriction=[two-letter country code]" parameter.
See the 'Developer's Guide: Data API Protocol – API Query Parameters'.
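If you build these feed URLs in code, something like the following might do. It's only a sketch: playlist_feed_url is a made-up helper name, and it assumes the URL pattern and the format/restriction parameters behave as described above.

```ruby
require 'uri'

# Hypothetical helper: builds a playlist feed URL in the style shown above.
# format=5 asks for embeddable videos only; country is an optional
# two-letter code passed as the restriction parameter.
def playlist_feed_url(playlist_id, country: nil)
  params = { 'v' => 2, 'format' => 5 }
  params['restriction'] = country if country
  query = URI.encode_www_form(params)
  "http://gdata.youtube.com/feeds/api/playlists/#{playlist_id}?#{query}"
end

puts playlist_feed_url('8BCDD04DE8F771B2', country: 'US')
```

Leaving out country gives the same URL as the example above.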
A hacky way to do it would be to use cURL to get the HTML of the page/video you are wondering about, and then look for the error-box DIV that shows up at the top saying the video has been removed. If it exists and it's visible, the video has probably been removed.
Hacky, but I betcha it would work.
As @seengee says, the "right" way to do this is to look for the yt:state tag in the XML representation of a YouTube video via the YouTube API.
To get this XML representation, you GET http://gdata.youtube.com/feeds/api/videos/VIDEO_ID
(more details here). So implementing this check should be as easy as:
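require 'open-uri'
require 'hpricot'
# Valid if the API entry has no yt:state tag (blank? comes from ActiveSupport).
def valid_embed_link?
  doc = Hpricot(open("http://gdata.youtube.com/feeds/api/videos/#{youtube_video_id}"))
  doc.at('yt:state').blank?
end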
Unfortunately, this yields false positives. For example, http://www.youtube.com/watch?v=MX6rC1krGp0 plays fine, but http://gdata.youtube.com/feeds/api/videos/MX6rC1krGp0 contains a yt:state tag. Therefore, I've gone with this hackier method:
# Valid if the watch page shows no alert box (needs open-uri and hpricot).
def valid_embed_link?
  doc = Hpricot(open("http://www.youtube.com/watch?v=#{youtube_video_id}"))
  doc.at('.yt-alert-content').blank?
end
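For the nightly cronjob part, a rough sketch along the same lines could look like this. The names alert_present?, removed_video_ids and FETCH_WATCH_PAGE are made up for illustration, and it assumes (as above) that a removed video's watch page contains an element with the yt-alert-content class; a plain substring match stands in for real HTML parsing.

```ruby
require 'open-uri'

# Assumption: a removed video's watch page contains the yt-alert-content
# class somewhere in its HTML. This is a substring match, not a real parse.
def alert_present?(html)
  html.include?('yt-alert-content')
end

# Hypothetical helper: returns the IDs whose watch page shows the alert.
# fetch is any callable that maps a video ID to that page's HTML.
def removed_video_ids(video_ids, fetch)
  video_ids.select { |id| alert_present?(fetch.call(id)) }
end

# Fetcher for real use; URI.open is provided by open-uri.
FETCH_WATCH_PAGE = ->(id) { URI.open("http://www.youtube.com/watch?v=#{id}").read }

# When run as a script, video IDs are taken from the command line.
if __FILE__ == $PROGRAM_NAME
  removed = removed_video_ids(ARGV, FETCH_WATCH_PAGE)
  puts "Removed videos: #{removed.join(', ')}" unless removed.empty?
end
```

Run from cron with the video IDs as arguments, it prints only when something looks removed, so if your cron is set up to mail job output, that could serve as the notification.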