How to screen scrape form results from a browser

I have a client who is using a third-party piece of web software. On one screen, my client fills out a form. Before submitting, he wants to run something that captures what he has entered and inserts it into a CSV or database. The CSV or database part is easy. Getting the data from a browser that is already started, showing a page served from another web server, is the part I don't know how to do.

How can I capture the contents of the HTML form? I would prefer to use C#, VB.NET, VBScript or similar, but I am really interested in anything. I would also prefer not to install custom software on the client workstation other than whatever screen-scraping software I write here. I would also prefer for the user to fill out the form and then run my script to gather the data, rather than having to run a custom browser instance.

Thanks!


Before the user submits the form, the data exists only in the browser. The browser is the only place you'll be able to get the data from.

You'll need something like a Browser Helper Object in Internet Explorer, or the equivalent in Firefox. You'll also have to restrict which browsers can be used, and you'll have to maintain this helper tool.

You'll do better to tell the customer "no", or find a better way to do what he really wants (like, maybe the third-party application needs to be able to save what is sent to it).


If you or your client don't mind using a Windows Forms application, you can add a WebBrowser control and then point it to the third-party web app. Then, you could try to access the elements of the web page (i.e., the form fields) through the .Document property of the control. I'm not sure whether you could access specific form field values, though.

Edit
I was able to do this using the approach above. I created a Windows Forms application, added a WebBrowser control (webBrowser1), and then loaded this HTML into it (with proper <html>, <head> tags and the like):

<form id="form1" method="post" action="test.htm">
  <input type="text" id="testText" name="testText" />
  <br />
  <input type="submit" value="Save" />
</form>

Note: I did this by saving it in an HTML file and using webBrowser1.Url = new Uri(@"c:\test.htm"); in my Form_Load event.

I was then able to access whatever I typed into testText by doing this:

HtmlDocument doc = webBrowser1.Document; // The HTML document currently loaded in the control
HtmlElement elem = doc.GetElementById("testText"); // The input element, looked up by its id
if (elem != null) MessageBox.Show(elem.GetAttribute("value")); // Shows what was typed into the field

I hope this helps you.


I decided to use JavaScript and add an IE favorite or a Firefox bookmark that achieves this. It retrieves the data from the form and sends it to an aspx page in the query string. The aspx page then writes the data to the database and displays a pop-up graphic showing whether the write was successful.

Here's a sample of the script:

javascript:(function(){var oForm=document.forms[0];var value=oForm.elements["name"].value;window.open("http://www.mydomain.com/page.aspx?data="+encodeURIComponent(value),"_blank","resizable,height=130,width=130");})();
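
If more than one field needs to be captured, the same idea can be extended to walk every field of the form. The following is only a rough sketch, not what I actually deployed: buildQuery is a hypothetical helper name, and it assumes the aspx page is changed to read one query-string parameter per field name instead of a single data parameter.

```javascript
// Build a query string from every named, enabled field of a form-like object.
// `form.elements` is assumed to be array-like, as it is for a real HTML form.
function buildQuery(form) {
  var pairs = [];
  for (var i = 0; i < form.elements.length; i++) {
    var field = form.elements[i];
    // Skip fields without a name, disabled fields, and buttons.
    if (!field.name || field.disabled) continue;
    if (field.type === "submit" || field.type === "button" || field.type === "reset") continue;
    // Only include checkboxes and radio buttons that are ticked.
    if ((field.type === "checkbox" || field.type === "radio") && !field.checked) continue;
    // Encode both sides so spaces, "&" and "=" survive the trip in the URL.
    pairs.push(encodeURIComponent(field.name) + "=" + encodeURIComponent(field.value));
  }
  return pairs.join("&");
}

// In the browser, assuming the same page.aspx as in the sample above:
// window.open("http://www.mydomain.com/page.aspx?" + buildQuery(document.forms[0]), "_blank");
```

To use it as a favorite it would have to be collapsed onto one line and prefixed with javascript:, like the sample. Keep in mind that browsers limit URL length, so a large form may need a POST to the aspx page rather than a query string.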

Everyone, thanks for the suggestions!!
