How can I use SOCKS with HtmlUnit?
Is it possible to use HtmlUnit through a SOCKS proxy? Could anyone please provide a code sample?
====
So I've dug through the WebClient sources; here's the best way I can think of:
1) Subclass MultiThreadedHttpConnectionManager so that it allows setting SOCKS info and, if that info is set, applies the SOCKS parameters before returning a connection.
2) Subclass WebConnection and rewrite createHttpClient so that it uses the manager from step 1. Also add a method to get that manager directly, or to get the HTTP client first (it is protected now, which is unfortunate).
To use it:
1) Create a WebClient instance
2) Create the subclassed WebConnection
3) Set it to be used by the WebClient
4) Access the connection's manager and use its methods to enable SOCKS
All you need to do is set the appropriate system properties before creating your WebClient object. For example:
System.setProperty("socksProxyHost", "localhost"); // replace "localhost" with your proxy server
System.setProperty("socksProxyPort", "9999"); // replace "9999" with your proxy port number
WebClient client = new WebClient();
At this point, HttpClient (which is used by HtmlUnit under the covers) will pick up the settings and use the SOCKS proxy for all network communication.
UPDATE: I read your revised question (and your comment) and I think you're on the right track. The problem is that if you implement step 1 using the above system properties, then your code is not thread-safe (because those system properties are global). One solution is to synchronize on something, but of course this can introduce performance problems (may not matter to you).
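As a rough illustration of the "synchronize on something" idea, here is a minimal sketch using only the JDK. The class name SocksProperties and the method withSocksProxy are hypothetical, not part of HtmlUnit or HttpClient. It assumes that every socket which should go through the proxy is actually opened inside the action passed in, and that all code touching these properties goes through the same lock.

```java
import java.util.function.Supplier;

// Hypothetical helper: serializes access to the global SOCKS system properties.
final class SocksProperties {
    private static final Object LOCK = new Object();

    private SocksProperties() {}

    // Sets the SOCKS properties, runs the action, then restores the old values.
    static <T> T withSocksProxy(String host, int port, Supplier<T> action) {
        synchronized (LOCK) {
            String oldHost = System.getProperty("socksProxyHost");
            String oldPort = System.getProperty("socksProxyPort");
            System.setProperty("socksProxyHost", host);
            System.setProperty("socksProxyPort", Integer.toString(port));
            try {
                return action.get();
            } finally {
                restore("socksProxyHost", oldHost);
                restore("socksProxyPort", oldPort);
            }
        }
    }

    // Puts a property back to its previous value, clearing it if it was unset.
    private static void restore(String key, String value) {
        if (value == null) {
            System.clearProperty(key);
        } else {
            System.setProperty(key, value);
        }
    }
}
```

Only one thread at a time can run inside withSocksProxy, which is where the performance cost mentioned above comes from.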
If you really want to control this on a per-socket basis, then I think you will need to do something like the following:
- Create a custom ProtocolSocketFactory that passes a java.net.Proxy object to the Socket constructor (like in this example).
- Create a custom Protocol that uses this ProtocolSocketFactory.
- Apply this Protocol to the new connections in your custom connection manager using HttpConnection.setProtocol().
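The core of the first step is plain JDK code, so it can be sketched without HttpClient on the classpath. The class and method names below (SocksSocketSketch, socksProxy, newUnconnectedSocket, open) are hypothetical. A custom ProtocolSocketFactory would presumably call something like open() from its createSocket methods; that wiring is an assumption and is not shown or tested here.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.Socket;

// Hypothetical sketch: per-socket SOCKS configuration via java.net.Proxy.
final class SocksSocketSketch {
    private SocksSocketSketch() {}

    // Describes a SOCKS proxy; the address is left unresolved until connect time.
    static Proxy socksProxy(String proxyHost, int proxyPort) {
        return new Proxy(Proxy.Type.SOCKS,
                InetSocketAddress.createUnresolved(proxyHost, proxyPort));
    }

    // Creates a socket bound to the given proxy without connecting it.
    static Socket newUnconnectedSocket(Proxy proxy) {
        return new Socket(proxy);
    }

    // Opens a connection to host:port through the proxy.
    static Socket open(Proxy proxy, String host, int port, int timeoutMillis)
            throws IOException {
        Socket socket = newUnconnectedSocket(proxy);
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return socket;
        } catch (IOException e) {
            socket.close(); // do not leak the socket on a failed connect
            throw e;
        }
    }
}
```

Because the proxy is attached to each Socket rather than to global system properties, different connections can use different proxies without any shared state.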
I haven't actually tested this, but based on a quick glance at the HttpClient 3.1 source code, I think that's how it would be done. I would love to hear how you ultimately solve this problem :-). Good luck!
HtmlUnit uses HttpClient as the underlying connection library. I investigated this a little, but:
1- I couldn't find a way to configure this in HttpClient (except by the generic Java SOCKS mechanism defined in http://java.sun.com/javase/6/docs/technotes/guides/net/proxies.html)
2- I do not have access to a public SOCKS proxy to test against