C# WebBrowser control alters the HTML source
I have a WebBrowser control which I navigate to a URL that contains this HTML:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="content-type" content="text/html;charset=utf-8" />
<title></title>
</head>
<body marginheight="60" topmargin="60">
<p align="center"><img src="nocontent.jpg" alt="" height="434" width="525" border="0" /></p>
</body>
</html>
But when I use this code to fetch the source:
HTMLDocument objHtmlDoc = (HTMLDocument)browser.Document.DomDocument;
string pageSource = objHtmlDoc.documentElement.innerHTML;
Console.WriteLine(pageSource);
This is the result:
<HEAD><TITLE></TITLE>
<META content=text/html;charset=utf-8 http-equiv=content-type></HEAD>
<BODY topMargin=60 marginheight="60">
<P align=center><IMG border=0 alt="" src="nocontent.jpg" width=525 height=434></P></BODY>
This is no good for further processing. How can I make sure I get the same source as when I right-click the page and select "View source"?
Use browser.DocumentText to obtain the source HTML.
Going through the HTMLDocument class makes the browser regenerate HTML from its parsed model of the document (the DOM) rather than return the original source, which is why the tags come back uppercased and the attributes reordered and unquoted.
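A minimal sketch of reading the source through DocumentText, assuming a Windows Forms WebBrowser control. The URL is a hypothetical placeholder, and the sketch reads the text inside a DocumentCompleted handler on the assumption that the page should finish loading before its source is read:

```csharp
using System;
using System.Windows.Forms;

static class Program
{
    [STAThread] // the WebBrowser control requires a single-threaded apartment
    static void Main()
    {
        var form = new Form();
        var browser = new WebBrowser { Dock = DockStyle.Fill };
        form.Controls.Add(browser);

        // Read the source only once the document has finished loading.
        browser.DocumentCompleted += (sender, e) =>
        {
            // DocumentText holds the page's HTML text,
            // not markup re-serialized from the DOM via innerHTML.
            string pageSource = browser.DocumentText;
            Console.WriteLine(pageSource);
        };

        // Hypothetical URL, for illustration only.
        browser.Navigate("http://example.com/nocontent.html");

        Application.Run(form);
    }
}
```

Note that DocumentCompleted can fire more than once for pages that contain frames, so the handler may print the source several times in that case.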