Python script freezes infinitely during a socket connection
I have a simple Python script that updates the statuses of justin.tv streams in my database. It's a Django-based web application. This script worked perfectly before I moved it to my production server, but now it has issues with timing out or freezing. I've solved the timeout problem by adding try/except blocks and making the script retry, but I still can't figure out the freezing problem.
I know it freezes on the line streamOnline = manager.getStreamOnline(stream.name, LOG). That's the same point where the socket.timeout exception occurs. Sometimes, however, it just locks up forever. I just can't picture a scenario where Python would freeze indefinitely. Here is the code for the script that freezes. I'm linking website.networkmanagers below, as well as oauth and the justin.tv Python library that I'm using.
import sys, os, socket

LOG = False

def updateStreamInfo():
    # Set necessary paths
    honstreams = os.path.realpath(os.path.dirname(__file__) + "../../../")
    sys.path.append(honstreams)
    os.environ['DJANGO_SETTINGS_MODULE'] = 'settings'

    # Import necessary modules
    from website.models import Stream, StreamInfo
    from website.networkmanagers import get_manager, \
        NetworkManagerReturnedErrorException

    # Get all streams
    streams = Stream.objects.all()

    try:
        # Loop through them
        for stream in streams:
            skipstream = False
            print 'Checking %s...' % stream.name,

            # Get the appropriate network manager
            manager = get_manager(stream.network.name)

            # Try to get stream status up to 3 times
            for i in xrange(3):
                try:
                    streamOnline = manager.getStreamOnline(stream.name, LOG)
                    break
                except socket.error as e:
                    # Report the error, then retry
                    print 'Error: %s. Retrying...' % e
            else:
                # All 3 attempts failed, so skip this stream
                skipstream = True

            # If this stream should be skipped
            if skipstream:
                print 'Can\'t connect! Skipping %s' % stream.name
                continue

            # Skip if status has not changed
            if streamOnline == stream.online:
                print 'Skipping %s because the status has not changed' % \
                    stream.name
                continue

            # Save status
            stream.online = streamOnline
            stream.save()
            print 'Set %s to %s' % (stream.name, streamOnline)

    except NetworkManagerReturnedErrorException as e:
        print 'Stopped the status update loop:', e

if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "log":
        LOG = True
    if LOG:
        print "Logging enabled"
    updateStreamInfo()
networkmanagers.py
oauth.py
JtvClient.py

Example of the script freezing
foo@bar:/.../honstreams/honstreams# python website/scripts/updateStreamStatus.py
Checking angrytestie... Skipping angrytestie because the status has not changed
Checking chustream... Skipping chustream because the status has not changed
Checking cilantrogamer... Skipping cilantrogamer because the status has not changed
| <- caret sits here blinking infinitely
Interesting update
Every time it freezes and I send a keyboard interrupt, it's on the same line in socket.py:
root@husta:/home/honstreams/honstreams# python website/scripts/updateStreamStatus.py
Checking angrytestie... Skipping angrytestie because the status has not changed
Checking chustream... Skipping chustream because the status has not changed
^CChecking cilantrogamer...
Traceback (most recent call last):
  File "website/scripts/updateStreamStatus.py", line 64, in <module>
    updateStreamInfo()
  File "website/scripts/updateStreamStatus.py", line 31, in updateStreamInfo
    streamOnline = manager.getStreamOnline(stream.name, LOG)
  File "/home/honstreams/honstreams/website/networkmanagers.py", line 47, in getStreamOnline
    return self.getChannelLive(channelName, log)
  File "/home/honstreams/honstreams/website/networkmanagers.py", line 65, in getChannelLive
    response = client.get('/stream/list.json?channel=%s' % channelName)
  File "/home/honstreams/honstreams/website/JtvClient.py", line 51, in get
    return self._send_request(request, token)
  File "/home/honstreams/honstreams/website/JtvClient.py", line 90, in _send_request
    return conn.getresponse()
  File "/usr/lib/python2.6/httplib.py", line 986, in getresponse
    response.begin()
  File "/usr/lib/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.6/httplib.py", line 349, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.6/socket.py", line 397, in readline
    data = recv(1)
KeyboardInterrupt
Any thoughts?
Have you tried using another application to open that connection? Given that the issue only appears in production, perhaps you have a firewall problem there.
Down in JtvClient.py it uses httplib to handle the connection. Have you tried changing this to use httplib2 instead?
Other than that stab in the dark, I would add a lot of logging statements to this code in order to track what actually happens and where it gets stuck. Then I would make sure that the point where it gets stuck can time out on the socket (which usually involves either monkeypatching or forking the codebase) so that it fails instead of hanging.
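One way to make a stuck read fail instead of hang, without editing the library, is socket.setdefaulttimeout, which applies to sockets created after the call. The sketch below is written for Python 3 (the question runs Python 2.6, where the same socket functions exist). start_silent_server and read_with_default_timeout are hypothetical helper names, and the local server that never answers is only a stand-in for an unresponsive remote host.

```python
import socket


def start_silent_server():
    """Hypothetical helper: listen on a free local port and never answer."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    # Connections complete in the kernel backlog, but nothing ever replies.
    server.listen(5)
    return server


def read_with_default_timeout(port, seconds):
    """Hypothetical helper: report whether a read from a silent peer times out."""
    previous = socket.getdefaulttimeout()
    # Affects every socket created from now on, including ones made by libraries.
    socket.setdefaulttimeout(seconds)
    try:
        client = socket.create_connection(("127.0.0.1", port))
        try:
            # Same kind of call the traceback above is stuck in.
            client.recv(1)
            return "got data"
        except socket.timeout:
            return "timed out"
        finally:
            client.close()
    finally:
        # Restore the previous process-wide default.
        socket.setdefaulttimeout(previous)
```

Because the default is process-wide, it also changes the behaviour of any other code that creates sockets, so a per-connection timeout is usually the cleaner fix when the library allows it.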
You said:
I know it freezes on the line streamOnline = manager.getStreamOnline(stream.name, LOG). That's the same point where the socket.timeout exception occurs.
Wrong. It doesn't freeze on that line because that line is a function call which calls lots of other functions through several levels of other modules. So you do not yet know where the program freezes. Also, that line is NOT the point where the socket timeout occurs. The socket timeout will only occur on a low level socket operation like select or recv which is being called several times in the chain of activity triggered by getStreamOnline.
You need to trace your code in a debugger or add print statements to track down exactly where the hang occurs. It could possibly be an infinite loop in Python but is more likely to be a low-level call to an OS networking function. Until you find the source of the error, you can't do anything.
P.S. the keyboard interrupt is a reasonable clue that the problem is around line 90 in JtvClient.py, so put in some print statements and find out what happens. There may be a stupid loop in there that keeps calling getresponse, or you may be calling it with bad parameters or maybe the network server really is borked. Narrow it down to fewer possibilities.
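If attaching a debugger to a hung script is awkward, a watchdog that dumps the stack after a delay gives the same information as the keyboard interrupt did. This is a minimal sketch assuming Python 3.3 or later, where the standard faulthandler module is available (it does not exist in the Python 2.6 used in the question). run_with_watchdog, slow_call and demo are hypothetical names, and slow_call merely simulates a blocking network read.

```python
import faulthandler
import tempfile
import time


def run_with_watchdog(func, seconds, log_file):
    """Run func(); if it is still running after `seconds`, dump all thread stacks."""
    faulthandler.dump_traceback_later(seconds, file=log_file)
    try:
        return func()
    finally:
        # Cancel the pending dump once the call has returned.
        faulthandler.cancel_dump_traceback_later()


def slow_call():
    # Stand-in for a call that blocks on the network.
    time.sleep(0.5)
    return "done"


def demo():
    """Run the slow call under the watchdog and return the dumped text."""
    with tempfile.TemporaryFile(mode="w+") as log:
        run_with_watchdog(slow_call, 0.1, log)
        log.seek(0)
        return log.read()
```

The dump names the function and line each thread is executing when the timer fires, which narrows the hang down to a single call.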
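It turns out the HTTP connection in JtvClient.py isn't passed a timeout:
def _get_conn(self):
    return httplib.HTTPConnection("%s:%d" % (self.host, self.port))
Changing the last line to
    return httplib.HTTPConnection("%s:%d" % (self.host, self.port), timeout=10)
solved it. Without a timeout the socket blocks indefinitely in recv when the server never responds; with one, the read raises socket.timeout, which the retry loop already handles.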
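For reference, here is a rough Python 3 equivalent of the same fix, combined with a bounded retry. It is a sketch under assumptions, not the JtvClient code: get_status and start_silent_server are hypothetical names, http.client is the Python 3 name for httplib, and the silent local server only simulates a host that accepts the connection and then never replies.

```python
import http.client
import socket


def start_silent_server():
    """Hypothetical helper: listen on a free local port and never answer."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    # Connections complete in the kernel backlog, but nothing ever replies.
    server.listen(5)
    return server


def get_status(host, port, path, timeout=10, attempts=3):
    """Return the HTTP status code, or None if every attempt failed."""
    for _ in range(attempts):
        # The timeout covers connecting and each blocking read or write.
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        try:
            conn.request("GET", path)
            return conn.getresponse().status
        except (OSError, http.client.HTTPException):
            # socket.timeout is a subclass of OSError; try the next attempt.
            continue
        finally:
            conn.close()
    return None
```

Returning None after the last attempt lets the caller skip the stream explicitly instead of reusing a stale value.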