Get HTML source code from browser control embedded in C#

I have a browser control embedded in a C# Windows app. I want to grab the rendered HTML (which could have been modified by JavaScript), not the original one.

Any suggestions?


You can get the HTML, and indeed set it, with WebBrowser.DocumentText.

Sheng is correct: DocumentText returns the streamed document as it was before scripts ran. His code doesn't compile, but it's essentially correct. I found that you need:

mshtml.HTMLDocument doc = webBrowser1.Document.DomDocument as mshtml.HTMLDocument;
string html = doc.documentElement.outerHTML;


DocumentText internally uses the document's IPersistStream interface, which returns the original HTML. Use webBrowser1.Document.DocumentElement.OuterHTML instead.
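
For reference, here is a minimal sketch of reading the live DOM using only the managed WinForms wrappers, so no reference to the mshtml interop assembly is needed. It assumes a form that hosts the control in a field named webBrowser1; the class name RenderedHtmlForm and the URL are placeholders for illustration, not something from the question. It fetches the root element with GetElementsByTagName("html") and reads its OuterHtml from a DocumentCompleted handler.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical host form, for illustration only.
public class RenderedHtmlForm : Form
{
    private readonly WebBrowser webBrowser1 = new WebBrowser { Dock = DockStyle.Fill };

    public RenderedHtmlForm()
    {
        Controls.Add(webBrowser1);
        webBrowser1.DocumentCompleted += webBrowser1_DocumentCompleted;
        webBrowser1.Navigate("https://example.com/"); // placeholder URL
    }

    private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // Note: this event can fire more than once on pages that contain frames.
        HtmlElementCollection roots = webBrowser1.Document.GetElementsByTagName("html");
        if (roots.Count == 0)
        {
            return; // no <html> element to serialize
        }

        // OuterHtml serializes the current DOM, including changes made by scripts so far.
        string renderedHtml = roots[0].OuterHtml;
        Console.WriteLine(renderedHtml);
    }

    [STAThread]
    public static void Main()
    {
        Application.EnableVisualStyles();
        Application.Run(new RenderedHtmlForm());
    }
}
```

Scripts that keep running after the page loads (timers, AJAX callbacks) can still change the DOM later, so you may need to read OuterHtml again at the point where you actually need the markup.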


Add a handler for the Navigated event of your WebBrowser. Only then will your document be filled.

    private void webBrowser1_Navigated(object sender, WebBrowserNavigatedEventArgs e)
    {
        Console.WriteLine(webBrowser1.DocumentText);
    }