
How can I use SOCKS with HtmlUnit?

Is it possible to use HtmlUnit through a SOCKS proxy? Could anyone please provide a code sample?

====

So I've dug through the WebClient sources, and here's the best way I can think of:

  1. Subclass MultiThreadedHttpConnectionManager so that it allows setting SOCKS info and, if that info is set, applies the SOCKS parameters before returning a connection.

  2. Subclass WebConnection: override createHttpClient so that it uses the manager from step 1, and add a method to get that manager directly, or at least the HTTP client (it is protected now, which is unfortunate).

  3. To use it: 1) create a WebClient instance, 2) create the subclassed WebConnection, 3) set it as the connection used by the WebClient, 4) access the connection's manager and use its methods to configure SOCKS.


All you need to do is set the appropriate system properties before creating your WebClient object. For example:

System.setProperty("socksProxyHost", "localhost"); // replace "localhost" with your proxy server
System.setProperty("socksProxyPort", "9999"); // replace "9999" with your proxy port number

WebClient client = new WebClient();

At this point, HttpClient (which is used by HtmlUnit under the covers) will pick up the settings and use the SOCKS proxy for all network communication.

UPDATE: I read your revised question (and your comment) and I think you're on the right track. The problem is that if you implement step 1 using the above system properties, then your code is not thread-safe (because those system properties are global). One solution is to synchronize on something, but of course this can introduce performance problems (may not matter to you).
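The "synchronize on something" idea could look roughly like the sketch below. It is not tested against HtmlUnit; SocksProperties and withSocksProxy are hypothetical names, and it assumes the JVM reads the two properties at the time each socket is connected. It only protects code that goes through this helper, and it serializes every call, which is the performance cost mentioned above.

```java
import java.util.function.Supplier;

// Hypothetical helper (not part of HtmlUnit or HttpClient): sets the global
// SOCKS system properties, runs an action, then restores the old values,
// all while holding a single global lock.
final class SocksProperties {
    private static final Object LOCK = new Object();

    static <T> T withSocksProxy(String host, int port, Supplier<T> action) {
        synchronized (LOCK) {
            // Remember the previous values so they can be restored afterwards.
            String oldHost = System.getProperty("socksProxyHost");
            String oldPort = System.getProperty("socksProxyPort");
            System.setProperty("socksProxyHost", host);
            System.setProperty("socksProxyPort", Integer.toString(port));
            try {
                // Any network work that needs the proxy must happen inside here.
                return action.get();
            } finally {
                restore("socksProxyHost", oldHost);
                restore("socksProxyPort", oldPort);
            }
        }
    }

    private static void restore(String key, String value) {
        if (value == null) {
            System.clearProperty(key);
        } else {
            System.setProperty(key, value);
        }
    }
}
```

The WebClient creation and the page requests would have to run inside the action passed to withSocksProxy. Anything that opens a socket outside the lock can still see the wrong proxy settings.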

If you really want to control this on a per-socket basis, then I think you will need to do something like the following:

  1. Create a custom ProtocolSocketFactory that passes a java.net.Proxy object to the Socket constructor (like in this example).
  2. Create a custom Protocol that uses this ProtocolSocketFactory.
  3. Apply this Protocol to the new connections in your custom connection manager using HttpConnection.setProtocol().
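The core of step 1 can be sketched with plain java.net classes. SocksSocketFactory is a hypothetical name, and this class does not implement HttpClient's ProtocolSocketFactory interface; a real implementation would wrap the same logic in that interface's createSocket methods. It only shows how a java.net.Proxy is passed to the Socket constructor.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.Socket;

// Hypothetical helper: creates sockets that are routed through one SOCKS proxy
// without touching any global system property.
final class SocksSocketFactory {
    private final Proxy proxy;

    SocksSocketFactory(String proxyHost, int proxyPort) {
        // createUnresolved avoids a DNS lookup while the factory is being built.
        this.proxy = new Proxy(Proxy.Type.SOCKS,
                InetSocketAddress.createUnresolved(proxyHost, proxyPort));
    }

    Proxy getProxy() {
        return proxy;
    }

    Socket createSocket(String host, int port) throws IOException {
        // The proxy is bound to this socket only.
        Socket socket = new Socket(proxy);
        try {
            socket.connect(new InetSocketAddress(host, port));
            return socket;
        } catch (IOException e) {
            socket.close(); // do not leak the socket if the connection fails
            throw e;
        }
    }
}
```

Because the proxy lives in the factory instance rather than in system properties, two factories pointing at different proxies can be used from different threads at the same time.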

I haven't actually tested this, but based on a quick glance at the HttpClient 3.1 source code, I think that's how it would be done. I would love to hear how you ultimately solve this problem :-). Good luck!


HtmlUnit uses HttpClient as the underlying connection library. I investigated this a little, but:

1- Couldn't find a way to configure HttpClient for SOCKS (except through the generic Java SOCKS mechanism described at http://java.sun.com/javase/6/docs/technotes/guides/net/proxies.html)
2- Don't have access to a public SOCKS proxy to test against