How to find RSS feed of a particular website?
How can I find the RSS feed of a particular website? Is there any particular way to find it?
You might be able to find it by looking at the source of the home page (or blog). Look for a line that looks like this:
<link rel="alternate" type="application/rss+xml" title="RSS Feed" href="http://example.org/rss" />
The href value will be where the RSS is located.
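The lookup described above can be sketched in a few lines of Python using only the standard library. This is a minimal sketch, not a definitive implementation: the names FeedLinkParser and find_feed_links are made up for illustration, and it assumes you already have the page's HTML as a string. It also resolves a relative href against the page URL, which is an assumption about how such links should be read.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

RSS_TYPE = "application/rss+xml"


class FeedLinkParser(HTMLParser):
    """Hypothetical helper: collects hrefs of <link> tags that advertise an RSS feed."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # Only <link> tags are of interest here
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rel = attr.get("rel", "").lower().split()
        link_type = attr.get("type", "").lower().strip()
        href = attr.get("href", "")
        if "alternate" in rel and link_type == RSS_TYPE and href:
            # Assumption: a relative href is resolved against the page URL
            self.feeds.append(urljoin(self.base_url, href))


def find_feed_links(html, base_url):
    """Return the RSS feed URLs advertised in the given HTML source."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds


page = '<head><link rel="alternate" type="application/rss+xml" title="RSS Feed" href="http://example.org/rss" /></head>'
print(find_feed_links(page, "http://example.org/"))
```

If the page has no such link tag, the function returns an empty list.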
There are multiple ways to get the RSS feed of a website.
One is to fetch the page source of the website and search for a link tag with type="application/rss+xml".
The href of that tag will contain the RSS feed of the website, if it has one.
Here is a simple program in Python that prints the RSS feed of a website, if it has one.
import requests
from bs4 import BeautifulSoup

def get_rss_feed(website_url):
    if website_url is None:
        print("URL should not be null")
    else:
        # Fetch the page and parse its HTML
        source_code = requests.get(website_url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text, "html.parser")
        # Each <link type="application/rss+xml"> tag advertises a feed
        for link in soup.find_all("link", {"type": "application/rss+xml"}):
            href = link.get('href')
            print("RSS feed for " + website_url + " is --> " + str(href))

get_rss_feed("http://www.extremetech.com/")
Save this file with the .py extension and run it. It will print the RSS feed URL of that website.
Google also provided an API for finding the RSS feeds of a website, the Google Feed API, but it has since been shut down.
You need to loop through all the URLs on the website and find one that contains "rss".
The method above may not work in some cases, for example when the URL in the href attribute looks like feed.xml. In that case you'll need to loop through all tags that contain both an href attribute and "rss", then parse the URL from the href attribute.
If you want to do this in the browser, press CTRL+U to view the source, then CTRL+F to open the find box, and type rss. The RSS feed URL should appear immediately.
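A rough sketch of that fallback in Python, using only the standard library, might look like this. The names RssHrefScanner and find_rss_candidates are hypothetical, and the matching rule is one reading of the description: a tag counts if it has an href and "rss" appears in any of its attribute values. That will also pick up unrelated links that happen to contain those letters, so treat the results as candidates to check by hand.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RssHrefScanner(HTMLParser):
    """Hypothetical helper: collects hrefs of tags whose attributes mention "rss"."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        values = {name: (value or "") for name, value in attrs}
        href = values.get("href")
        if not href:
            return
        # Keep the tag if any attribute value (href, class, title, ...) mentions rss
        if any("rss" in value.lower() for value in values.values()):
            url = urljoin(self.base_url, href)
            if url not in self.candidates:
                self.candidates.append(url)


def find_rss_candidates(html, base_url):
    """Return candidate feed URLs found in the given HTML source."""
    scanner = RssHrefScanner(base_url)
    scanner.feed(html)
    return scanner.candidates


sample = (
    '<a href="/feed.xml" class="rss-icon">Subscribe</a>'
    '<a href="/about">About</a>'
    '<a href="http://example.org/rss/news">News</a>'
)
print(find_rss_candidates(sample, "http://example.org/"))
```

Here the feed.xml link is found through its class attribute even though the URL itself does not contain "rss".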
Firefox's Tools menu now has a "Page Info" command. One of the tabs in that tool displays discovered feed info.
I needed to find sites with RSS feeds, and using Visual Studio (VB) I was able to do that. The following code is just a fragment. It finds any reference to an rss page on the site, which is all I needed, so I never quite finished it. As I first wrote it, it died after the loop finished because IndexOf was called on Nothing at the end of the stream; checking for Nothing before using the line avoids that. It worked for me.
Imports System.Net
Imports System.IO
...
Dim request As WebRequest = WebRequest.Create("http://www.[site]")
Dim response As WebResponse = request.GetResponse()
Dim responseStream As Stream = response.GetResponseStream()
Dim reader As New StreamReader(responseStream)
Dim line As String = reader.ReadLine()
Dim intPos As Integer
' ReadLine returns Nothing at the end of the stream, so stop there
Do While line IsNot Nothing
    intPos = line.IndexOf("/rss")
    ' IndexOf returns -1 when "/rss" is not found
    If intPos >= 0 Then
        MessageBox.Show(line + " " + intPos.ToString)
    End If
    line = reader.ReadLine()
Loop
....