HtmlUnit to view source
HtmlUnit for Java is great, but I haven't been able to figure out how to view the full source or return the source of a web site as a string. Can anyone help me with this?
I know the following will load the site, but now I just want to get the source as a string.
HtmlPage mySite = webClient.getPage("http://mysite.com");
Thanks!
From looking through the API, my thought would be:
mySite.getWebResponse().getContentAsString();
String pageSource = mySite.asXml();
That will get you the page's markup as XML, serialized from the DOM that HtmlUnit built after parsing, so it may differ slightly from the raw HTML the server sent.
String pageText = mySite.asText();
That will get you all of the visible text on the page, including line breaks and white space. It is the same as opening the page in your browser, pressing Ctrl+A, then Ctrl+C, and pasting the result into a variable.
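For what it's worth, here is a minimal sketch that puts the raw-response call and the asXml() call side by side. It assumes HtmlUnit 2.x on the classpath (the com.gargoylesoftware.htmlunit packages; 3.x moved them to org.htmlunit). The class and method names (PageSourceDemo, rawSource, domSource) are made up for illustration.

```java
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class PageSourceDemo {

    // Response body as the server sent it, decoded to a String.
    static String rawSource(HtmlPage page) {
        return page.getWebResponse().getContentAsString();
    }

    // XML serialization of the DOM HtmlUnit built from the response
    // (reflects parser fix-ups and any changes made by JavaScript).
    static String domSource(HtmlPage page) {
        return page.asXml();
    }

    public static void main(String[] args) throws Exception {
        // WebClient is AutoCloseable, so try-with-resources cleans it up.
        try (WebClient webClient = new WebClient()) {
            HtmlPage mySite = webClient.getPage("http://mysite.com");
            System.out.println(rawSource(mySite));
            System.out.println(domSource(mySite));
        }
    }
}
```

If you want what the browser's "view source" would show, rawSource is probably the closer match; domSource is more useful when you care about the page after scripts have run.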
Have you tried mySite.asXml()? Or you can do mySite.getDocumentElement().toString()