Inexplicable Urllib2 problem between virtualenv's.
I have some test code (as a part of a webapp) that uses urllib2 to perform an operation I would usually perform via a browser:
- Log in to a remote website
- Move to another page
- Perform a POST by filling in a form
I've created 4 separate, clean virtualenvs (with --no-site-packages) on 3 different machines, all with different versions of python but the exact same packages (via pip requirements file), and the code only works on the two virtualenvs on my local development machine(2.6.1 and 2.7.2) - it won't work on either of my production VPSs
In the failing cases, I can log in successfully, move to the correct page but when I submit the form, the remote server replies telling me that there has been an error - it's an application server error page ('we couldn't complete your request') and not a webserver error.
- because I can successfully log in and maneuver to a second page, this doesn't seem to be a session or a cookie problem - it's particular to the final POST
- because I can perform the operation on a particular machine with the EXACT same headers and data, this doesn't seem to be a problem with what I am requesting/posting
- because I am trying the code on two separate VPS rented from different companies, this doesn't seem to be a problem with the VPS physical environment
- because the code works on 2 different python versions, I can't imagine it being an incompabilty problem
I'm completely lost at this stage as to why this wouldn't work. I've even 'turned-it-off-and-turn-it-on-again' because I just can't see what the problem could be.
I think it has to be something to do with the final POST coming from a VPS that the remote server doesn't like, but I can't figure out what that could be. I feel like there is something going on under the hood 开发者_高级运维of URLlib that is causing the remote server to dislike the reply.
EDIT I've installed the exact same Python version (2.6.1) on the VPS as is on my working local copy and it doesn't work remotely, so it must be something to do with originating from a VPS. How could this effect the Http request? Is it something lower level?
You might try setting the debuglevel=1 for urllib2 and see what it comes up with:
import urllib2
h=urllib2.HTTPHandler(debuglevel=1)
opener = urllib2.build_opener(h)
...
This is a total shot in the dark, but are your VPSs 64-bit and your home computer 32-bit, or vice versa? Maybe a difference in default sizes or accuracies of something could be freaking out the server.
Barring that, can you try to find out any information on the software stack the web server is using?
I had similar issues with urllib2 (working with Zimbra's REST api), in the end switched to pycurl with success.
PS for operations of login/navigate/post, I usually find Mechanize useful and easier to use. Maybe you can give it a show.
Well, it looks like I know why the problem was happening, but I'm not 100% the reason for it.
I simply had to make the server wait (time.sleep()) after it sent the 2nd request (Move to another page) before doing the 3rd request (Perform a POST by filling in a form).
I don't know is it because of a condition with the 3rd party server, or if it's some sort of odd issue with URLlib? The reason it seemed to work on my development machine is presumably because it was slower then the server at running the code?
精彩评论