How do I bypass undefined JavaScript errors when scraping a web page with an HttpWebRequest?

I am trying to use an HttpWebRequest to scrape a login page's HTML and post data to that page's username and password controls, essentially to 'auto logon' the client. The problem I am running into is that the page at the URI I am requesting contains JavaScript and references an XML .config file that is not downloaded; it appears to be available only to the actual hosted site.

When I read the response, execution breaks in VS.NET and I get a "Microsoft JScript Runtime error: 'VerticalMenuConfig' is undefined".

Looking at the script tags where debugging has stopped I see the reference to "scripts/VerticalMenuConfig.xml".

If I hit continue, the page is all messed up when it is written to the response of my page, and I still cannot write to any of the input text boxes. I am following an example like this one, which automatically writes to the "Search" text box on Amazon; it is exactly what I need to do: http://www.worldofasp.net/tut/WebRequest/Working_with_HttpWebRequest_and_HttpWebResponse_in_ASPNET_114.aspx

Here is the code that breaks:

<script language="javascript" src="scripts/VerticalMenu.js"></script> 
<script> 
var vm = new VerticalMenu("tdVertMenu", "scripts/VerticalMenuConfig.xml", ""); 
vm.openSubMenu("clientLogin");

It breaks on the line: var vm = new VerticalMenu...

Is there a better way to do this?


It sounds like you have logged your web server in to your client's account and are expecting to serve your client the HTML from that server. That probably won't work, for several reasons: security restrictions, session tokens that your web server holds but the user does not, and relative URLs in the scraped HTML (like your problem with the XML file).
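To illustrate the relative URL point, here is a minimal sketch of how the browser resolves "scripts/VerticalMenuConfig.xml" depending on which page served the HTML. The host `myserver` and the page `autologin.aspx` are hypothetical stand-ins for your own site; `otherwebserver` stands in for the real login site.

```javascript
// Hypothetical URLs: the real login page, and your own page that re-serves its HTML.
const originalPage = "http://otherwebserver/login";
const proxiedPage = "http://myserver/autologin.aspx";

// The relative reference found in the scraped script block.
const relativeRef = "scripts/VerticalMenuConfig.xml";

// Resolve a relative reference against the URL of the page that contains it,
// the same way a browser does when it loads a script or XML file.
function resolve(ref, pageUrl) {
  return new URL(ref, pageUrl).href;
}

const onOriginal = resolve(relativeRef, originalPage);
const onProxy = resolve(relativeRef, proxiedPage);

console.log(onOriginal); // http://otherwebserver/scripts/VerticalMenuConfig.xml
console.log(onProxy);    // http://myserver/scripts/VerticalMenuConfig.xml
```

When the scraped HTML is written into your own response, the browser asks your server for the script and the XML file. Your server presumably does not have them, which would explain why VerticalMenu is undefined.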

You could serve up an auto-submitting form pointed at the other web server, but that form would have to contain the user's credentials in plain text!

<form id='f' method='post' action='http://otherwebserver/login'>
<input type='hidden' name='username' value='me' />
<input type='hidden' name='password' value='my password' />
<!-- any other fields that the login page expects -->
</form>
<script type='text/javascript'>
  document.getElementById('f').submit();
</script>
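If the credentials come from variables rather than literals, the values should be HTML-escaped before they go into the hidden inputs. Here is a rough sketch of generating the same form from a string template; `escapeAttr` and `buildAutoLoginForm` are hypothetical helper names, and the action URL and field names are the placeholders from the form above.

```javascript
// Escape a value so it is safe inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Build the auto-submitting form markup from an action URL and a map of field names to values.
function buildAutoLoginForm(action, fields) {
  const inputs = Object.entries(fields)
    .map(([name, value]) =>
      `<input type="hidden" name="${escapeAttr(name)}" value="${escapeAttr(value)}" />`)
    .join("\n");
  return [
    `<form id="f" method="post" action="${escapeAttr(action)}">`,
    inputs,
    `</form>`,
    `<script type="text/javascript">document.getElementById("f").submit();</script>`,
  ].join("\n");
}

// Placeholder credentials; the password deliberately contains quotes to show the escaping.
const html = buildAutoLoginForm("http://otherwebserver/login", {
  username: "me",
  password: 'my "password"',
});
console.log(html);
```

The credentials still reach the browser in plain text; escaping only keeps a quote or angle bracket in a value from breaking the markup.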


You're describing a web page that is already broken. You can't browse a broken page, whether you use an HttpClient or Chrome.
