开发者

Download Current WSJ.com Prime Rate

I need to automatically download the current Wall Street Journal Prime Rate and load the data开发者_StackOverflow社区 into my database. What is the best method for downloading this data automatically?

I have come up with three possible solutions for doing this:

  1. Scrape a HTML web page from WSJ.
  2. Parse a RSS news feed from WSJ.
  3. Use some API that I haven't found from WSJ.

Regarding solution 1, although I don't like solution 1 since it could easily break, it's the only one that I have worked out from end to end. It appears I can scrape this page with a WebRequest / WebResponse and read the text in this code:

<tr>
<td style="text-align:left" class="colhead">&nbsp;</td>
<td class="colhead">Latest</td>
<td class="colhead">Wk ago</td>
<td class="colhead">High</td>
<td class="colhead">Low</td>
</tr>
<tr>
<td class="text">U.S.</td>
<td style="font-weight:bold;" class="num">3.25</td>
<td class="num">3.25</td>
<td class="num">3.25</td>
<td class="num" style="border-right:0px">3.25</td>
</tr>

Regarding solution 2, although I can implement a RSS reader solution, I don't see a way to reliably anticipate verbiage for changes in the Prime Rate. Therefore, I don't think this is as safe or reliable a way to get the data as solution 1.

Regarding solution 3, I haven't found any published API's for checking money rates like the Prime Rate. If anyone knows of a web service or other API for checking money rates, then please let me know.


I implemented the following code to scrape the HTML and process the results:

SSIS Package C# Script Task

public void Main()
    {
        // Post the web page.
        try
        {
            // Set variables.
            bool fireAgain = true;
            Uri WebPageURI = new Uri("http://online.wsj.com/mdc/public/page/2_3020-moneyrate.html");

            // Post the web page.
            WebRequest request = WebRequest.Create(WebPageURI);
            request.Timeout = (1000 * 60 * 60);
            request.Method = "POST";
            WebResponse response = request.GetResponse();
            StreamReader reader = new StreamReader(response.GetResponseStream());
            string Output = reader.ReadToEnd();

            Dts.Variables["wall_street_journal_str"].Value = Output;
            Dts.Events.FireInformation(0, "WebRequest:", WebPageURI.ToString(), "", 0, ref fireAgain);
            Dts.Events.FireInformation(0, "WebResponse:", Output, "", 0, ref fireAgain);
        }
        catch (WebException ex)
        {
            Dts.Events.FireError(0, "Error:", ex.Message, "", 0);
        }


        // Return success.
        Dts.TaskResult = (int)ScriptResults.Success;
    }

Stored Procedure Called by an Execute SQL Task

SET @WALL_STREET_JOURNAL_HTML = SUBSTRING(@WALL_STREET_JOURNAL_HTML, CHARINDEX('Prime rates', @WALL_STREET_JOURNAL_HTML, 1), CHARINDEX('Canada', @WALL_STREET_JOURNAL_HTML, CHARINDEX('Prime rates', @WALL_STREET_JOURNAL_HTML, 1)) - CHARINDEX('Prime rates', @WALL_STREET_JOURNAL_HTML, 1))
    SELECT @RATE_CHANGE_DATE = CONVERT(DATE, RTRIM(LTRIM(SUBSTRING(@WALL_STREET_JOURNAL_HTML, CHARINDEX('[', @WALL_STREET_JOURNAL_HTML, 1) + 23, CHARINDEX(']', @WALL_STREET_JOURNAL_HTML, 1) - CHARINDEX('[', @WALL_STREET_JOURNAL_HTML, 1) - 23))))
        , @RATE = CONVERT(NUMERIC(8, 2), SUBSTRING(@WALL_STREET_JOURNAL_HTML, CHARINDEX('class="num">', @WALL_STREET_JOURNAL_HTML, CHARINDEX('U.S.</td>', @WALL_STREET_JOURNAL_HTML, 1)) + 12, CHARINDEX('<', @WALL_STREET_JOURNAL_HTML, CHARINDEX('class="num">', @WALL_STREET_JOURNAL_HTML, CHARINDEX('U.S.</td>', @WALL_STREET_JOURNAL_HTML, 1))) - CHARINDEX('class="num">', @WALL_STREET_JOURNAL_HTML, CHARINDEX('U.S.</td>', @WALL_STREET_JOURNAL_HTML, 1)) -12))

Ugly code, but it serves the purpose. There are also several validation rules that run to verify the data is correct, but this is the core of what I used to scrape the value.


There is a webservice for this, but it may not be worth the cost if all you want is the prime rate:

http://www.armindexes.com/


I'm about 5 years late, but thought I'd bring it up. Quandl has a nice API for getting this information.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜