Why are HttpWebRequest and WebBrowser getting different HTML source code?

I am trying to get the source code from a webpage. The WebBrowser control gives me the information I am looking for. However, I want to use HttpWebRequest, but it's giving me different source code than the WebBrowser DocumentText.

Can anyone please tell me how I can get the same source code as WebBrowser using HttpWebRequest?

WebBrowser Code:

WebBrowser1.Navigate("http://www.networksolutions.com/whois/results.jsp?domain=" & txtUrl.Text)
textbox1.Text = WebBrowser1.DocumentText

WebBrowser Result:

http://textbin.com/f4368

HttpWebRequest Code:

Dim request As System.Net.HttpWebRequest = System.Net.HttpWebRequest.Create(url)
request.KeepAlive = False
request.Timeout = 10000

Dim response As System.Net.HttpWebResponse = request.GetResponse()

Dim sr As System.IO.StreamReader = New System.IO.StreamReader(response.GetResponseStream())
Dim sourcecode As String = sr.ReadToEnd()

HttpWebRequest Result:

http://textbin.com/2h445


Some sites look at the User-Agent string or other factors of the request and return content that varies accordingly. I've written a number of projects that downloaded web pages and have run into this a few times.
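
If the User-Agent is what the site keys on, one thing to try is sending a browser-like User-Agent from HttpWebRequest. The following is only a minimal sketch: the DownloadHtml helper, the Accept value, and whatever User-Agent string you pass in are assumptions for illustration, not something taken from the question, and a site may also vary its output on cookies or other headers.

```vbnet
Imports System.IO
Imports System.Net

Module UserAgentExample
    ' Hypothetical helper (not from the question): downloads a page while
    ' presenting the given User-Agent string to the server.
    Function DownloadHtml(url As String, userAgent As String) As String
        Dim request As HttpWebRequest = CType(WebRequest.Create(url), HttpWebRequest)
        request.UserAgent = userAgent                          ' empty by default
        request.Accept = "text/html,application/xhtml+xml,*/*" ' assumed value
        request.KeepAlive = False
        request.Timeout = 10000

        ' Using blocks dispose the response and the reader even if reading fails.
        Using response As HttpWebResponse = CType(request.GetResponse(), HttpWebResponse)
            Using reader As New StreamReader(response.GetResponseStream())
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function
End Module
```

Passing the same User-Agent string that the WebBrowser control sends (it can be read from the request in a tool such as Fiddler) makes the two requests easier to compare.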


This is an old-ish question, but the reason this happens is that MSHTML, the Windows HTML rendering engine, modifies the incoming HTML before it renders it. You can change the rendering mode of the .NET WebBrowser control to use the IE7, IE8, or IE9 rendering engine, and you'll see HUGE differences in the HTML each one returns from the browser. IE9's output is going to be the most similar to what you see come down in HttpWebRequest.
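
The answer does not say how to switch the rendering mode. One common way, assumed here, is the per-application FEATURE_BROWSER_EMULATION registry value that Internet Explorer reads for hosted WebBrowser controls. The sketch below uses a hypothetical SetBrowserEmulation helper; the value must be written before the WebBrowser control is created, and the executable name has to match the running process (for example the .vshost.exe name when debugging in older Visual Studio versions).

```vbnet
Imports Microsoft.Win32

Module BrowserEmulationExample
    ' Hypothetical helper: sets the IE emulation mode for one executable.
    ' Commonly used modes: 7000 = IE7, 8000 = IE8, 9000 = IE9.
    Sub SetBrowserEmulation(exeName As String, mode As Integer)
        Const keyPath As String =
            "Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION"

        ' HKEY_CURRENT_USER does not require administrator rights.
        Using key As RegistryKey = Registry.CurrentUser.CreateSubKey(keyPath)
            key.SetValue(exeName, mode, RegistryValueKind.DWord)
        End Using
    End Sub
End Module
```

For example, calling SetBrowserEmulation("MyApp.exe", 9000) at startup, where MyApp.exe is a placeholder for your own executable name, asks the control to render in IE9 mode.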
