How to validate a bunch of proxies against a URL?
I have a list of 100 proxies. The URL I'm interested in is abc.com. I want to check how many of the proxies can successfully fetch this URL, and how long each one takes. I hope that makes sense. I'm a Python noob, so I'm looking for a code snippet. A helping hand is really appreciated :)
Proxies :
200.43.54.112
200.43.54.222
200.43.54.102
200.43.54.111
URL :
abc.com
Desired result :
Proxy          isGood  Time
200.43.54.112  n       23.12
200.43.54.222  n       12.34
200.43.54.102  y       11.09
200.43.54.111  y       8.85
P.S.: All of the above proxies use port 80 or 8080.
You can fetch URLs using urllib2 and measure the time taken with the time module. (Note that urllib2 is Python 2 only; a Python 3 version is sketched at the end of this answer.) Here's a simple example that does what you seem to want:
import urllib2
import time

def testProxies(url, proxies):
    # prepare the request
    req = urllib2.Request(url)
    # run the request through each proxy
    results = ["Proxy isGood Time"]
    for proxy in proxies:
        # route the request through this proxy; pass "host:port" if the
        # proxy isn't on port 80 (a bare host defaults to port 80)
        req.set_proxy(proxy, "http")
        # time it
        start = time.time()
        # try to open the URL
        try:
            # the timeout keeps a dead proxy from hanging the whole run
            urllib2.urlopen(req, timeout=10)
            # format the results for success
            results.append("%s y %.2f" % (proxy, time.time() - start))
        except urllib2.URLError:
            # format the results for failure
            results.append("%s n %.2f" % (proxy, time.time() - start))
    return results

testResults = testProxies("http://www.abc.com", ["200.43.54.112", "200.43.54.222",
                                                 "200.43.54.102", "200.43.54.111"])
for result in testResults:
    print result
The main points are creating the request with urllib2.Request(url) and using its set_proxy() method, which routes that request through the given proxy.
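If you're on Python 3, urllib2 no longer exists; here is a minimal sketch of the same approach using urllib.request with a ProxyHandler. The proxy ports and the 10-second timeout below are illustrative choices, not something from your question:

import time
import urllib.request

def test_proxies(url, proxies):
    results = ["Proxy isGood Time"]
    for proxy in proxies:
        # build an opener that sends HTTP traffic through this proxy
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({"http": "http://" + proxy}))
        start = time.time()
        try:
            # the timeout value is an arbitrary example
            opener.open(url, timeout=10)
            results.append("%s y %.2f" % (proxy, time.time() - start))
        except OSError:
            # urllib.error.URLError and socket timeouts are both
            # subclasses of OSError in Python 3
            results.append("%s n %.2f" % (proxy, time.time() - start))
    return results

# example ports only; use whichever of 80/8080 each proxy actually listens on
for line in test_proxies("http://www.abc.com",
                         ["200.43.54.112:80", "200.43.54.102:8080"]):
    print(line)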