
I was downloading a file from the nseindia site using my program, but now I get error 403. Why?

My program downloads a file from the nseindia site, but the request now fails with error 403 (Forbidden). I also read a value from the same site using this code:

WebClient client = new WebClient();
client.Headers.Add("user-agent", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)");

Stream data;
try
{
    data = client.OpenRead("http://www.nseindia.com/");
}
catch (Exception e)
{
    MessageBox.Show("Error: " + e.Message + e.Data + e.HelpLink);
    return "";
}
StreamReader reader = new StreamReader(data);
string s = null;
int count = 0;
// ReadLine() returns null at end of stream; calling Read() first would swallow a character of each line.
while ((s = reader.ReadLine()) != null)
{
    if (s.Contains("<td class=\"t1\">"))
    {
        MessageBox.Show("Line: " + s);
        s = s.Remove(0, 18);
        s = s.Remove(s.Length - 5);
        count++;
        if (count == 5)
            break;
    }

}

data.Close();
reader.Close();
return s;


It seems that this site requires an Accept HTTP request header:

client.Headers[HttpRequestHeader.Accept] = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";

One problem with your current approach is that you depend entirely on how the site you are scraping works, and your HTML parsing code is fragile on top of that. Worse, the site can change at any time, and you have no control over it unless you own it. Tomorrow it might start requiring some other HTTP header, and your code will stop working once again. I mention this so that you are prepared.

Maybe you could contact the site owners and see if they are offering an official API to consume their content.
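
To illustrate the fragility point: the question's code cuts the value out of each line with fixed offsets (`Remove(0, 18)` and `Remove(s.Length - 5)`), which breaks as soon as the indentation or markup shifts. Below is a minimal sketch of a slightly more tolerant extraction, assuming the value still sits on a single line inside a `<td class="t1">` cell. `CellParser` and `ExtractCell` are hypothetical names, and for anything beyond a one-off a real HTML parser would likely be a better fit than a regular expression.

```csharp
using System;
using System.Text.RegularExpressions;

static class CellParser
{
    // Matches one <td class="t1">...</td> cell and captures its inner text.
    private static readonly Regex CellPattern =
        new Regex("<td class=\"t1\">(.*?)</td>", RegexOptions.IgnoreCase);

    // Returns the trimmed cell text, or null when the line contains no such cell.
    public static string ExtractCell(string line)
    {
        if (line == null) return null;
        Match m = CellPattern.Match(line);
        return m.Success ? m.Groups[1].Value.Trim() : null;
    }
}
```

For example, `CellParser.ExtractCell("   <td class=\"t1\">5,412.35</td>")` returns `5,412.35` (the sample value is made up), and a line without the cell returns `null` instead of throwing.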


Technically, what you are getting is HTTP 403 (Forbidden).

In plain language, it says that you are not authorized to access this resource.

Check whether your domain is blocking requests to this site. Open the URL in your browser and see whether it loads there.
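
To confirm from code that the failure really is a 403 rather than, say, a DNS or timeout problem, the status can be read off the exception. This is a rough sketch, assuming the .NET Framework `WebClient` used in the question; `StatusProbe`, `GetStatus` and `TryFetch` are hypothetical names.

```csharp
using System;
using System.Net;

static class StatusProbe
{
    // Returns the HTTP status carried by a WebException, or null when the
    // request failed before any HTTP response arrived (DNS error, timeout, ...).
    public static HttpStatusCode? GetStatus(WebException ex)
    {
        var response = ex.Response as HttpWebResponse;
        return response != null ? response.StatusCode : (HttpStatusCode?)null;
    }

    // Downloads the page, or returns null after reporting why the request failed.
    public static string TryFetch(string url)
    {
        using (var client = new WebClient())
        {
            try
            {
                return client.DownloadString(url);
            }
            catch (WebException ex)
            {
                HttpStatusCode? status = GetStatus(ex);
                if (status == HttpStatusCode.Forbidden)
                    Console.WriteLine("403: a server (the site or a proxy in between) refused the request.");
                else if (status.HasValue)
                    Console.WriteLine("HTTP error: " + (int)status.Value);
                else
                    Console.WriteLine("No HTTP response: " + ex.Status);
                return null;
            }
        }
    }
}
```

If `TryFetch` reports a 403 while the same URL opens in a browser on the same machine, the difference is most likely in the request headers rather than in the network.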


It looks like NSE made some changes; you now need to send these two headers:

client.Headers[HttpRequestHeader.Accept] = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";

client.Headers.Add("user-agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31");
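
Putting the two headers together, a minimal sketch might look like the following. `NseClient` and `Create` are hypothetical names. The helper builds a fresh `WebClient` per request on purpose: as far as I know, the .NET Framework `WebClient` can drop headers such as `Accept` and `User-Agent` from its collection once a request has been sent, so setting them once on a long-lived client may not be enough.

```csharp
using System;
using System.Net;

static class NseClient
{
    // Header values taken from the answers in this thread; the site may require others later.
    const string AcceptValue =
        "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
    const string UserAgentValue =
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31";

    // Returns a new WebClient with both browser-like headers already set.
    public static WebClient Create()
    {
        var client = new WebClient();
        client.Headers[HttpRequestHeader.Accept] = AcceptValue;
        client.Headers[HttpRequestHeader.UserAgent] = UserAgentValue;
        return client;
    }
}
```

Usage would be along the lines of `using (var client = NseClient.Create()) { string html = client.DownloadString("http://www.nseindia.com/"); }`.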
