C# HtmlAgilityPack: the operation has timed out

How do you increase the timeout value for HtmlAgilityPack? I'm getting this error a lot and want to raise the timeout limit. Alternatively, how do you kill the request and try again?

resultingHTML = null;
        try
        {
            string htmlstring = string.Empty;
            HttpWebRequest newwebRequest = (HttpWebRequest)WebRequest.Create(htmlURL);
            HttpWebResponse mywebResponce = (HttpWebResponse)newwebRequest.GetResponse();
            if (mywebResponce.StatusCode == HttpStatusCode.OK)
            {
                Stream ReceiveStream = mywebResponce.GetResponseStream();
                using (StreamReader reader = new StreamReader(ReceiveStream))
                {
                    htmlstring = reader.ReadToEnd();
                }
                HtmlDocument doc = new HtmlDocument();
                doc.LoadHtml(htmlstring); // LoadHtml parses an HTML string; Load expects a file path or stream
                HtmlWeb hwObject = new HtmlWeb();
                HtmlNode body = doc.DocumentNode.SelectSingleNode("//body");
                resultingHTML = body.InnerHtml.ToString();
            }

        }


I assume you're using HtmlAgilityPack to read HTML via a web request here?

I would advise using the framework's WebRequest object instead, where you can specify a timeout:

http://msdn.microsoft.com/en-us/library/system.net.webrequest.getresponse.aspx#Y700

You can catch timeouts (and other connection errors) by wrapping the call in a try/catch block.

Then parse the resulting HTML from the WebResponse object with HtmlAgilityPack directly.

Here is an example of how to get the HTML from the WebResponse:

http://msdn.microsoft.com/en-us/library/system.net.webresponse.getresponsestream.aspx

Once you have the HTML as a string from the WebResponse, you would:

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
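
To cover the "kill the request and try again" part of the question, here is a minimal sketch of the try/catch approach. The class name HtmlFetcher, the method names Download and WithRetry, and the maxAttempts parameter are illustrative assumptions, not part of any library. It relies on GetResponse throwing a WebException whose Status is WebExceptionStatus.Timeout when the Timeout elapses.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper, named for illustration only.
public static class HtmlFetcher
{
    // Downloads the response body as a string.
    // GetResponse throws WebException (Status == Timeout) if no response
    // arrives within timeoutMs milliseconds.
    public static string Download(string url, int timeoutMs)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = timeoutMs;

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }

    // Calls fetch up to maxAttempts times, retrying only on timeouts.
    // Any other exception, or a timeout on the last attempt, propagates.
    public static string WithRetry(Func<string> fetch, int maxAttempts)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return fetch();
            }
            catch (WebException ex)
                when (ex.Status == WebExceptionStatus.Timeout && attempt < maxAttempts)
            {
                // Timed out with attempts remaining: loop and try again.
            }
        }
    }
}
```

A call might look like `string html = HtmlFetcher.WithRetry(() => HtmlFetcher.Download(htmlURL, 10000), 3);`, after which the string can be passed to doc.LoadHtml(html). Taking the fetch as a delegate keeps the retry loop independent of how the request is made.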


HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create("http://www.someurl.com"); // the URL needs a scheme
httpWebRequest.Timeout = 10000; // 10 second timeout
using (HttpWebResponse httpWebResponse = (HttpWebResponse)httpWebRequest.GetResponse())
{
    if (httpWebResponse.StatusCode == HttpStatusCode.OK)
    {
        using (Stream responseStream = httpWebResponse.GetResponseStream())
        using (StreamReader reader = new StreamReader(responseStream))
        {
            var htmlstring = reader.ReadToEnd();
            HtmlDocument doc = new HtmlDocument();
            doc.LoadHtml(htmlstring); // LoadHtml parses a string; Load expects a path or stream
        }
    }
}

I would also look at "Adjusting HttpWebRequest Connection Timeout in C#" to understand the difference between Timeout and ReadWriteTimeout on the HttpWebRequest class.
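
As a rough illustration of that difference: Timeout limits how long GetResponse (and GetRequestStream) may block, while ReadWriteTimeout limits each read or write on the resulting stream, so a slow body download is governed by the second value. The sketch below sets both. RequestFactory and its parameter names are hypothetical, chosen only for this example.

```csharp
using System.Net;

// Hypothetical helper, named for illustration only.
public static class RequestFactory
{
    // Builds a request with both timeouts set explicitly (in milliseconds).
    // No network traffic happens until GetResponse is called.
    public static HttpWebRequest Create(string url, int responseTimeoutMs, int readWriteTimeoutMs)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

        // Applies to GetResponse and GetRequestStream.
        request.Timeout = responseTimeoutMs;

        // Applies to reads and writes on the request/response streams,
        // for example reader.ReadToEnd() on a slow response body.
        request.ReadWriteTimeout = readWriteTimeoutMs;

        return request;
    }
}
```

If only Timeout is raised, a page that starts responding quickly but streams its body slowly can still time out during the read, which is why it may be worth setting both.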
