C# HttpWebRequest shows 404, but site is reachable in browser
I am trying to download an XML file from a website with C#, but I get a 404 on some URLs. This is weird, because they still work in the browser. Other URLs work without a problem.
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
request.Method = "GET";
request.Timeout = 3000;
request.UserAgent = "Test Client";
HttpWebResponse response = null;
try
{
response = (HttpWebResponse)request.GetResponse();
}
catch (WebException e)
{
response = (HttpWebResponse)e.Response;
}
Console.WriteLine("- "+response.StatusCode);
XmlTextReader reader = new XmlTextReader(response.GetResponseStream());
This URL is one of the said problem URLs:
http://numerique.bibliotheque.toulouse.fr/cgi-bin/oaiserver?verb=ListMetadataFormats
SOLVED: I forgot to trim the URL ;)
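For anyone hitting the same thing, here is a minimal sketch of the fix, assuming the URLs come from a file or other input that can carry stray whitespace such as a trailing line break. UrlCleaner and Clean are hypothetical names used only for this illustration, and the trailing "\r\n" on the sample value is an assumed example of the kind of whitespace that can sneak in.

```csharp
using System;

static class UrlCleaner
{
    // Hypothetical helper (not from the original post): removes leading and
    // trailing whitespace, including spaces, tabs and CR/LF characters.
    public static string Clean(string rawUrl)
    {
        if (rawUrl == null) throw new ArgumentNullException(nameof(rawUrl));
        return rawUrl.Trim();
    }
}

static class Program
{
    static void Main()
    {
        // Assumed input: the problem URL with a trailing line break attached.
        string raw = "http://numerique.bibliotheque.toulouse.fr/cgi-bin/oaiserver?verb=ListMetadataFormats\r\n";
        string url = UrlCleaner.Clean(raw);

        // Brackets make any leftover whitespace visible in the console.
        Console.WriteLine("[" + url + "]");
    }
}
```

The cleaned string can then be passed to WebRequest.Create(url) as in the code above. Printing the value between brackets is a cheap way to check whether a URL that "looks fine" actually carries invisible characters.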
I can only speculate that the host site might not like your UserAgent and is returning a 404 message.
I solved this problem by using this:
var client = (HttpWebRequest)WebRequest.Create(uri);
client.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
client.CookieContainer = new CookieContainer();
client.UserAgent = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36";
var htmlCodae = client.GetResponse() as HttpWebResponse;
To download an XML document, you can use the WebClient.DownloadString method:
System.Net.WebClient client = new System.Net.WebClient();
String url = "http://stackoverflow.com/feeds/question/4188449";
String xmlSource = client.DownloadString(url);
Console.WriteLine(xmlSource);
Two possibilities:
1) The url variable somehow contains an incorrect URL. For testing purposes, try
WebRequest.Create(@"http://numerique.bibliotheque.toulouse.fr/cgi-bin/oaiserver?verb=ListMetadataFormats");
instead of
WebRequest.Create(url);
2) You have some HTTP filtering mechanism that distinguishes between VS and browser requests.