Download Current WSJ.com Prime Rate
I need to automatically download the current Wall Street Journal Prime Rate and load the data into my database. What is the best method for downloading this data automatically?
I have come up with three possible solutions for doing this:
- Scrape an HTML web page from WSJ.
- Parse an RSS news feed from WSJ.
- Use some API that I haven't found from WSJ.
Regarding solution 1: I don't like it, since it could easily break if the page layout changes, but it's the only one that I have worked out from end to end. It appears I can fetch this page with a WebRequest / WebResponse and read the rate out of this markup:
<tr>
<td style="text-align:left" class="colhead"> </td>
<td class="colhead">Latest</td>
<td class="colhead">Wk ago</td>
<td class="colhead">High</td>
<td class="colhead">Low</td>
</tr>
<tr>
<td class="text">U.S.</td>
<td style="font-weight:bold;" class="num">3.25</td>
<td class="num">3.25</td>
<td class="num">3.25</td>
<td class="num" style="border-right:0px">3.25</td>
</tr>
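For illustration, pulling the "Latest" value for the U.S. row out of markup like this might look like the C# sketch below. The names PrimeRateParser and TryParseLatestUsRate are hypothetical, and the sketch assumes the first "U.S." row in the supplied HTML is the prime rate row, so the full page may need trimming to the prime rates section first.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any WSJ or SSIS API.
public static class PrimeRateParser
{
    // Matches the "U.S." label cell, then captures the first numeric value
    // found in a class="num" cell after it (the "Latest" column).
    private static readonly Regex UsLatest = new Regex(
        @"U\.S\.</td>.*?class=""num"">\s*([0-9]+(?:\.[0-9]+)?)\s*<",
        RegexOptions.Singleline);

    // Returns true and sets rate when a U.S. row is found; false otherwise.
    public static bool TryParseLatestUsRate(string html, out decimal rate)
    {
        rate = 0m;
        Match match = UsLatest.Match(html ?? string.Empty);
        return match.Success
            && decimal.TryParse(match.Groups[1].Value, NumberStyles.Number,
                                CultureInfo.InvariantCulture, out rate);
    }
}
```

A caller would declare `decimal rate;` and check the boolean result of `PrimeRateParser.TryParseLatestUsRate(html, out rate)` before using the value. This is just as fragile as any other scrape: a change to the cell classes or row label makes it return false.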
Regarding solution 2: I can implement an RSS reader, but I don't see a way to reliably anticipate the wording used to announce changes in the Prime Rate. Therefore, I don't think this is as safe or reliable a way to get the data as solution 1.
Regarding solution 3: I haven't found any published APIs for checking money rates like the Prime Rate. If anyone knows of a web service or other API for checking money rates, please let me know.
I implemented the following code to scrape the HTML and process the results:
SSIS Package C# Script Task
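// Requires at the top of the script file: using System; using System.IO; using System.Net;
public void Main()
{
    try
    {
        // Set variables.
        bool fireAgain = true;
        Uri WebPageURI = new Uri("http://online.wsj.com/mdc/public/page/2_3020-moneyrate.html");

        // Request the web page. A plain page fetch is a GET; the original POST sent no body.
        WebRequest request = WebRequest.Create(WebPageURI);
        request.Timeout = (1000 * 60 * 60); // One hour, in milliseconds.
        request.Method = "GET";

        // Read the whole page, disposing the response and reader afterwards.
        string Output;
        using (WebResponse response = request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            Output = reader.ReadToEnd();
        }

        // Hand the raw HTML to the package and log what was fetched.
        Dts.Variables["wall_street_journal_str"].Value = Output;
        Dts.Events.FireInformation(0, "WebRequest:", WebPageURI.ToString(), "", 0, ref fireAgain);
        Dts.Events.FireInformation(0, "WebResponse:", Output, "", 0, ref fireAgain);

        // Return success.
        Dts.TaskResult = (int)ScriptResults.Success;
    }
    catch (WebException ex)
    {
        // Report failure rather than returning success with an empty variable.
        Dts.Events.FireError(0, "Error:", ex.Message, "", 0);
        Dts.TaskResult = (int)ScriptResults.Failure;
    }
}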
Stored Procedure Called by an Execute SQL Task
-- Keep only the text from 'Prime rates' up to (not including) the next 'Canada'.
SET @WALL_STREET_JOURNAL_HTML = SUBSTRING(
        @WALL_STREET_JOURNAL_HTML,
        CHARINDEX('Prime rates', @WALL_STREET_JOURNAL_HTML, 1),
        CHARINDEX('Canada', @WALL_STREET_JOURNAL_HTML, CHARINDEX('Prime rates', @WALL_STREET_JOURNAL_HTML, 1))
            - CHARINDEX('Prime rates', @WALL_STREET_JOURNAL_HTML, 1))
-- Read the date from inside the first [...] pair, skipping the 23 characters after '[',
-- and the rate from the first class="num" cell after the 'U.S.' label (12 = LEN('class="num">')).
SELECT @RATE_CHANGE_DATE = CONVERT(DATE, RTRIM(LTRIM(SUBSTRING(
           @WALL_STREET_JOURNAL_HTML,
           CHARINDEX('[', @WALL_STREET_JOURNAL_HTML, 1) + 23,
           CHARINDEX(']', @WALL_STREET_JOURNAL_HTML, 1) - CHARINDEX('[', @WALL_STREET_JOURNAL_HTML, 1) - 23))))
     , @RATE = CONVERT(NUMERIC(8, 2), SUBSTRING(
           @WALL_STREET_JOURNAL_HTML,
           CHARINDEX('class="num">', @WALL_STREET_JOURNAL_HTML, CHARINDEX('U.S.</td>', @WALL_STREET_JOURNAL_HTML, 1)) + 12,
           CHARINDEX('<', @WALL_STREET_JOURNAL_HTML, CHARINDEX('class="num">', @WALL_STREET_JOURNAL_HTML, CHARINDEX('U.S.</td>', @WALL_STREET_JOURNAL_HTML, 1)))
               - CHARINDEX('class="num">', @WALL_STREET_JOURNAL_HTML, CHARINDEX('U.S.</td>', @WALL_STREET_JOURNAL_HTML, 1)) - 12))
Ugly code, but it serves the purpose. There are also several validation rules that run to verify the data is correct, but this is the core of what I used to scrape the value.
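The validation rules themselves are not shown here. Purely as an illustration of the kind of sanity checks that could guard a scraped value before it is stored, here is a small C# sketch. The class PrimeRateValidator, its Validate method, and the bounds are all hypothetical choices, not the rules actually used.

```csharp
using System;

// Hypothetical sanity checks for a scraped rate; not the original validation rules.
public static class PrimeRateValidator
{
    // Assumed bounds; choose limits that suit your own data.
    private const decimal MinRate = 0.01m;
    private const decimal MaxRate = 25.00m;

    // Returns null when the values look sane, otherwise a reason to reject them.
    public static string Validate(decimal rate, DateTime rateChangeDate, DateTime today)
    {
        if (rate < MinRate || rate > MaxRate)
            return "Rate " + rate + " is outside the expected range.";
        if (decimal.Round(rate, 2) != rate)
            return "Rate " + rate + " has more than two decimal places.";
        if (rateChangeDate.Date > today.Date)
            return "Rate change date " + rateChangeDate.ToString("yyyy-MM-dd") + " is in the future.";
        return null;
    }
}
```

Passing today's date in as a parameter, rather than reading DateTime.Today inside the method, keeps the check easy to test. A non-null result would be a reason to fail the package instead of loading the row.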
There is a web service for this, but it may not be worth the cost if all you want is the prime rate:
http://www.armindexes.com/
I'm about 5 years late, but thought I'd bring it up. Quandl has a nice API for getting this information.