HttpWebRequest versus browser request

2023-02-02 10:26 Q&A author:

I used to retrieve data from a site (nseindia.com) using a C# program. However, NSE recently made some changes, so that any request from a program now gets a "403 Forbidden" response. Can anyone tell me how to make the request from the program identical to one from the browser? I tried setting the UserAgent property, but that's not working. The code is pasted below.

string DownloadData(string CompanyName)
{
    string address = string.Format(@"http://www.nseindia.com");
    //http://www.nseindia.com/marketinfo/sym_map/symbolMapping.jsp?dataType=priceVolumeDeliverable&symbol=abb&
    //http://www.nseindia.com/content/equities/scripvol/datafiles/01-12-2008-TO-29-12-2010ABBALLN.csv
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(address);
    request.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3...";

    string strData = "";
    try
    {
        request.Proxy = WebProxy.GetDefaultProxy();
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        System.IO.Stream stream = response.GetResponseStream();
        System.Text.Encoding ec = System.Text.Encoding.GetEncoding("utf-8");
        System.IO.StreamReader reader = new System.IO.StreamReader(stream, ec);
        strData = reader.ReadToEnd();
        if (strData.Contains("Error"))
        {
            Exception e = new Exception(strData);
            throw e;
        }
    }
    catch(Exception e)
    {
        Console.WriteLine(e.ToString());
    }

    return strData;
}

Your best bet is to spy on your browser to see exactly which requests it sends and which responses it receives.

There are numerous add-ins for that, depending on your browser.

Try setting the Accept HTTP header; e.g.:

request.Accept = "text/html,application/xhtml+xml,application/xml";

I arrived at this suggestion by running Fiddler2 (as suggested in a comment to another answer) in order to see how my browser (Firefox 4 Beta) makes the HTTP request to the website you mentioned.

I then set all of those headers in the code and eliminated them one by one. As soon as I removed the Accept header, the 403 status code was returned.
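The elimination step can be sketched roughly as follows. This is only an illustration under a few assumptions: the header values are the ones from the browser request quoted in this answer, the class and helper names (HeaderElimination, Apply, StatusOf) are made up for the example, and it drops one header at a time from the full set rather than removing them cumulatively. What the server returns today may well differ from what is described here.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

class HeaderElimination
{
    // Applies one header. User-Agent and Accept are restricted headers in
    // HttpWebRequest and have to be set through their dedicated properties.
    static void Apply(HttpWebRequest request, string name, string value)
    {
        if (name == "User-Agent") request.UserAgent = value;
        else if (name == "Accept") request.Accept = value;
        else request.Headers.Add(name, value);
    }

    // Sends the request and returns the numeric status code.
    // GetResponse throws WebException for 4xx/5xx, so the code is read from there.
    static int StatusOf(HttpWebRequest request)
    {
        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
                return (int)response.StatusCode;
        }
        catch (WebException ex)
        {
            HttpWebResponse error = ex.Response as HttpWebResponse;
            if (error == null) throw; // no HTTP response at all (DNS failure, timeout, ...)
            using (error) return (int)error.StatusCode;
        }
    }

    static void Main()
    {
        // Header values copied from the browser request quoted in this answer.
        var headers = new Dictionary<string, string>
        {
            { "User-Agent", "Mozilla/5.0 (Windows NT 5.1; rv:2.0b8) Gecko/20100101 Firefox/4.0b8" },
            { "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" },
            { "Accept-Language", "de,en;q=0.5" },
            { "Accept-Encoding", "gzip, deflate" },
            { "Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.7" }
        };

        // Leave out one header per attempt and report the status code.
        foreach (string omitted in headers.Keys)
        {
            var request = (HttpWebRequest)WebRequest.Create("http://www.nseindia.com");
            foreach (KeyValuePair<string, string> header in headers)
            {
                if (header.Key != omitted) Apply(request, header.Key, header.Value);
            }
            Console.WriteLine("without {0}: HTTP {1}", omitted, StatusOf(request));
        }
    }
}
```

If the behaviour matches what is described above, the line for the omitted Accept header would be the one reporting 403.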

Exact request made by my browser:

GET / HTTP/1.0
Host: www.nseindia.com
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:2.0b8) Gecko/20100101 Firefox/4.0b8
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: de,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
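For reference, a request carrying the same headers as the capture above could be built along these lines. This is a minimal sketch, not a guaranteed fix: the class name BrowserLikeRequest and the Create helper are hypothetical, it uses the same HttpWebRequest API as the question, and it assumes the server only inspects these headers.

```csharp
using System;
using System.IO;
using System.Net;

class BrowserLikeRequest
{
    // Builds a GET request with the headers from the browser capture.
    static HttpWebRequest Create(string address)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(address);

        // Restricted headers are set through properties, not Headers.Add.
        request.UserAgent = "Mozilla/5.0 (Windows NT 5.1; rv:2.0b8) Gecko/20100101 Firefox/4.0b8";
        request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";

        request.Headers.Add("Accept-Language", "de,en;q=0.5");
        request.Headers.Add("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.7");

        // Sends "Accept-Encoding: gzip, deflate" and transparently
        // decompresses the response body.
        request.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }

    static void Main()
    {
        HttpWebRequest request = Create("http://www.nseindia.com");
        try
        {
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            using (StreamReader reader = new StreamReader(response.GetResponseStream()))
            {
                string body = reader.ReadToEnd();
                Console.WriteLine("HTTP {0}, {1} characters", (int)response.StatusCode, body.Length);
            }
        }
        catch (WebException ex)
        {
            // GetResponse throws for 4xx/5xx; the error response is still attached.
            HttpWebResponse error = ex.Response as HttpWebResponse;
            Console.WriteLine(error != null ? "HTTP " + (int)error.StatusCode : ex.Message);
        }
    }
}
```

Setting AutomaticDecompression instead of adding Accept-Encoding by hand is a design choice here: the header is still sent, but the body arrives already decompressed, so the StreamReader sees plain text.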

PS: The other URIs you mention in the comments seem to be invalid. One is incomplete and yields a 500 Internal Server Error, the other yields a 404 Not Found response.

Try setting the credentials to the defaults, like this:

request.Credentials = System.Net.CredentialCache.DefaultCredentials;

Or, if you need a username and password to access that web page, pass them explicitly:

NetworkCredential nc = new NetworkCredential("user", "password");
request.Credentials = nc;

Another option is to use the WebBrowser control ;)
