Help extracting a tag from the DOM of a web page
I am creating a C# application to get the DOM information of a web page. I cannot extract a TBODY tag using my application. I am using:
the WebBrowser control shipped with Visual Studio
a COM reference to Microsoft.mshtml 7.0.3300.0
If I use the Internet Explorer Developer Toolbar, I can see all the information I need.
The tag has id="tbody_id" and contains a list of tags full of data, which are shown in the innerHTML and innerText attributes.
Using the code below, innerHtml and innerText are both null. What am I doing wrong? Are there other controls that I can use?
mshtml.IHTMLDocument3 domDoc = this.webBrowser.Document.DomDocument as mshtml.IHTMLDocument3;
mshtml.IHTMLElement element = domDoc.getElementById("tbody_id");
String innerHtml = element.innerHTML;
String innerText = element.innerText;
I have been working on something similar. The only thing you might try is an explicit cast.
I am doing something similar and have no trouble with getElementById.
IHTMLDocument3 currDocument3 = (IHTMLDocument3)webBrowser.Document.DomDocument; // Cast browser document
IHTMLElement element = currDocument3.getElementById("f15188");
Hope this helps
Roger
For all interested, I finally solved this issue.
I simply switched from Microsoft's WebBrowser control to csEXWB.
A nice article showing how it works can be found here; it is where I learned the code to correctly extract the DOM information.
The control must be registered, since it seems to be a COM component (please read the notes on the web site and in the article).
Place a cEXWB on your Form and you will have a web browser control in your app.
// your object somewhere
public csExWB.cEXWB cEXWB1;
Go to the web site you want:
cEXWB1.Navigate("http://stackoverflow.com");
Once it has loaded, get the DOM and each element you want:
IHTMLDocument3 domDoc = cEXWB1.WebbrowserObject.Document as mshtml.IHTMLDocument3;
IHTMLElement element = domDoc.getElementById("my_id");
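For readers who want to stay with the standard WinForms WebBrowser control, the "once loaded" step can be sketched as below. This is only a minimal sketch, not the csEXWB approach described above, and the thread does not confirm that reading the DOM too early was the cause of the null values. The class name DomReaderForm and the constant TargetId are hypothetical; the id "tbody_id" and the URL are taken from the posts above. It assumes a project reference to Microsoft.mshtml.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical form that reads an element only after the page has loaded.
// Requires a reference to the Microsoft.mshtml interop assembly.
public class DomReaderForm : Form
{
    // Id from the question; replace it with the id you are looking for.
    private const string TargetId = "tbody_id";

    private readonly WebBrowser webBrowser = new WebBrowser { Dock = DockStyle.Fill };

    public DomReaderForm()
    {
        Controls.Add(webBrowser);
        // DocumentCompleted is raised when a document (or a frame) finishes loading.
        webBrowser.DocumentCompleted += OnDocumentCompleted;
        webBrowser.Navigate("http://stackoverflow.com");
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // Frames raise this event too, so skip calls made before the whole page is ready.
        if (webBrowser.ReadyState != WebBrowserReadyState.Complete || webBrowser.Document == null)
            return;

        // "as" yields null instead of throwing if the interface is not available.
        mshtml.IHTMLDocument3 domDoc = webBrowser.Document.DomDocument as mshtml.IHTMLDocument3;
        if (domDoc == null)
            return;

        mshtml.IHTMLElement element = domDoc.getElementById(TargetId);
        if (element == null)
        {
            Console.WriteLine("No element with id '" + TargetId + "' was found.");
            return;
        }

        Console.WriteLine(element.innerHTML);
        Console.WriteLine(element.innerText);
    }

    [STAThread]
    public static void Main()
    {
        Application.Run(new DomReaderForm());
    }
}
```

The null checks make it easier to tell apart "the element is not in the document yet" from "the element exists but has no content", which the original snippet cannot distinguish.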