C# parsing HTML with JavaScript
I need to parse the HTML of a document after the JavaScript inside it has executed. I use the WebBrowser control to download and manipulate the HTML.
For example, my HTML contains a script like this:
<script type="text/javascript" src="http://site.com/script.js"></script>
Thanks for your answers.
P.S. To be precise: I need to parse the whole document, including any text that the JavaScript produces. So I can only parse the document after the JavaScript has run, because I need some of the dynamic content that the script adds.
Added
I did get the content generated by JavaScript. I had missed it at first, because the content I was looking for was in an iframe that was itself generated by JavaScript.
Now I have another question. My document has several iframes, and I am trying to get the content of some of them in the following way:
var htmlcol = webBrowser1.Document.Window.Frames;
foreach (HtmlWindow item in htmlcol)
{
    try
    {
        Console.Write(item.Name);
    }
    catch (System.Exception ex)
    {
        MessageBox.Show("Something wrong");
    }
}
But this throws a 'System.UnauthorizedAccessException'. How can I get access to the HTML of the frames?
P.P.S. Sorry for my bad English :)
I think that you will have a better experience using the DOM as represented by the Document property of the WebBrowser. You can either traverse the nested elements of Body, or find what you want using GetElementById or GetElementsByTagName.
The DOM should be automatically updated by the changes the JavaScript makes in the page.
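As a rough sketch of that approach (not a definitive implementation): wait for DocumentCompleted, give the scripts a moment to finish, then read the live DOM. The URL is the placeholder from the question, and the element id "content" and the 2-second delay are assumptions made only for illustration.

```csharp
using System;
using System.Windows.Forms;

static class Program
{
    [STAThread]
    static void Main()
    {
        var browser = new WebBrowser { ScriptErrorsSuppressed = true, Dock = DockStyle.Fill };
        var form = new Form();
        form.Controls.Add(browser);

        browser.DocumentCompleted += (sender, e) =>
        {
            // DocumentCompleted also fires for each frame; react only to the top-level page.
            if (!e.Url.Equals(browser.Url)) return;

            // Scripts may keep changing the DOM after load, so wait briefly (delay is a guess).
            var timer = new System.Windows.Forms.Timer { Interval = 2000 };
            timer.Tick += (s, args) =>
            {
                timer.Stop();
                timer.Dispose();

                // "content" is a hypothetical id; use the id of the element you need.
                HtmlElement target = browser.Document.GetElementById("content");
                if (target != null)
                    Console.WriteLine(target.InnerHtml);
                else if (browser.Document.Body != null)
                    Console.WriteLine(browser.Document.Body.OuterHtml);
            };
            timer.Start();
        };

        browser.Navigate("http://site.com/"); // placeholder address
        Application.Run(form);
    }
}
```

InnerHtml and OuterHtml read the current state of the DOM, so they should include whatever the script has inserted by the time the timer fires.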
Try the following: add a reference to Microsoft.mshtml to your application, then try:
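using mshtml;

public class ScriptHost   // class name is illustrative; the original snippet omitted it
{
    private IHTMLWindow2 window;

    public void setPage(IHTMLWindow2 JSFile)
    {
        // Keep the window that was passed in, e.g.
        // (IHTMLWindow2)webBrowser1.Document.Window.DomWindow
        window = JSFile;
    }

    public void scriptPrint()
    {
        // report_back must be a function defined by the page's own script.
        window.execScript("report_back('Printing complete!')", "JScript");
    }
}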
Here's also an article that might help you: http://www.dotnetcurry.com/ShowArticle.aspx?ID=194
Have a look at PhantomJS for this, and use setTimeout after opening the page so that the scripts have time to run.
It can look like this:
var page = require('webpage').create();
page.open("https://sample.com", function(){
    page.evaluate(function(){
        // Run something before the page is loaded again, for example:
        localStorage.setItem("something", "whatever"); // set localStorage before reopening the page
    });
    page.open("https://sample.com", function(){
        setTimeout(function(){
            console.log(page.content); // page source
            // Save a screenshot wherever you want
            page.render("screenshoot.png");
            // You can access the content using jQuery (if the page loads it).
            // Return a string, since evaluate can only pass back serializable values.
            var fbcomments = page.evaluate(function(){
                return $("body").contents().find(".content").html();
            });
            phantom.exit();
        }, 10000);
    });
});