HtmlUnit - Selecting Forms, CheckBoxes, TextFields, and Submit Buttons
I have been experimenting with HtmlUnit for a little while, using this website in particular because it has quite a few features I wanted to get used to. I have posted about it before, but that was mainly about grabbing information off the site, which ended up working. Now I want to fill in a form and submit it.
Current Test Code:
import com.gargoylesoftware.htmlunit.BrowserVersion
import com.gargoylesoftware.htmlunit.WebClient
def url = "http://www.hidemyass.com/proxy-list/"
client = new WebClient(BrowserVersion.FIREFOX_3)
client.javaScriptEnabled = false
page = client.getPage(url)
form = page.getFormByName("proxyform")
//get portInputField and set value
portField = form.getInputByName("p")
portField.setValueAttribute("80")
//select checkbox 1 & 2 from anonymity level
//click "Update Results"
//get new page url
//grab information
//save
The commented-out steps are where I am unsure of what to do. I went ahead and made an attempt, but would like input on what I should be doing.
Attempt:
def url = "http://www.hidemyass.com/proxy-list/"
page = client.getPage(url)
portField = page.getHtmlElementById("ports").setValueAttribute("80")
submitButton = page.getByXPath("/html/body//form//input[@type='image']")
page2 = submitButton.get(0).click()
println page2
The snippet above prints out: HtmlPage(http://www.hidemyass.com/proxy-list/search-1)@17168934
I'm looking to get a new page where I can then parse the information from the search. Any ideas?
I don't believe the language I am using should make too much of a difference; however, I am using Groovy.
EDIT
I managed to get what I wanted but it returns like so:
HtmlPage(http://www.hidemyass.com/proxy-list/search-1)@23713629
<?xml version="1.0" encoding="UTF-8"?><td>109.123.00.00</td>
Is there a way to get only the information that I'm looking for: <td>109.123.00.00</td>
Or do I just need to strip the info from it manually?
EDIT
.asText() solved my issue, but gave quite a few warnings regarding the CSS. Should I be worried?
Is there a way to get only the information that I'm looking for : 109.123.00.00 or do I just need to strip the info from it manually?
This should work:
// getElementByName matches the name attribute, not the tag, so look up the cell by XPath instead
def td = page2.getFirstByXPath("//td")
assert td.textContent == "109.123.00.00"
See the JavaDoc for HtmlPage for other ways to extract information from a page. Don't parse the page manually.
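If the result page contains more than one cell, the same idea extends to a list. The following is only a minimal sketch: the helper name cellTexts is made up for illustration, and it assumes the object passed in is the HtmlPage returned by click() (page2 in the question). It relies on getByXPath and textContent, which HtmlUnit's DOM nodes provide.

```groovy
// Sketch only: `page` is assumed to be the HtmlPage returned by click().
// cellTexts is a hypothetical helper name, not part of HtmlUnit.
def cellTexts(page) {
    // getByXPath returns a list of every node matching the expression;
    // textContent gives the text inside the node without any markup.
    page.getByXPath("//td").collect { it.textContent.trim() }
}

// Usage, assuming page2 is the result page from the question:
// cellTexts(page2).each { println it }
```

Because this reads the text straight from the DOM nodes, there is no XML declaration or tag markup to strip afterwards. A narrower XPath expression than "//td" would probably be needed on the real page to pick out only the proxy address cells.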
Side note: Since you are already using Groovy, you could also have a look at Geb, a popular Groovy-based web automation and testing tool that's more convenient to use than HtmlUnit.