Python script freezes infinitely during a socket connection

I have a simple Python script that updates the statuses of justin.tv streams in my database. It's a Django-based web application. This script worked perfectly before I moved it to my production server, but now it has issues with timing out or freezing. I've solved the timeout problem by adding try/except blocks and making the script retry, but I still can't figure out the freezing problem.

I know it freezes on the line streamOnline = manager.getStreamOnline(stream.name, LOG). That's the same point where the socket.timeout exception occurs. Sometimes, however, it just locks up forever. I just can't picture a scenario where Python would freeze indefinitely. Here is the code for the script that freezes. I'm linking website.networkmanagers below, as well as oauth and the justin.tv Python library that I'm using.

import sys, os, socket

LOG = False

def updateStreamInfo():
    # Set necessary paths
    honstreams = os.path.realpath(os.path.dirname(__file__) + "../../../")
    sys.path.append(honstreams)
    os.environ['DJANGO_SETTINGS_MODULE'] = 'settings'

    # Import necessary modules
    from website.models import Stream, StreamInfo
    from website.networkmanagers import get_manager, \
                                        NetworkManagerReturnedErrorException

    # Get all streams
    streams = Stream.objects.all()

    try:
        # Loop through them
        for stream in streams:

            skipstream = False

            print 'Checking %s...' % stream.name,
            # Get the appropriate network manager
            manager = get_manager(stream.network.name)

            # Try to get stream status up to 3 times
            for i in xrange(3):
                try:
                    streamOnline = manager.getStreamOnline(stream.name, LOG)
                    break
                except socket.error as e:
                    # Report the error and retry
                    print 'Error: %s. Retrying...' % e
            else:
                # All 3 attempts failed, so skip this stream
                skipstream = True

            # If this stream should be skipped
            if(skipstream):
                print 'Can\'t connect! Skipping %s' % stream.name
                continue

            # Skip if status has not changed
            if streamOnline == stream.online:
                print 'Skipping %s because the status has not changed' % \
                      stream.name
                continue

            # Save status
            stream.online = streamOnline
            stream.save()

            print 'Set %s to %s' % (stream.name, streamOnline)

    except NetworkManagerReturnedErrorException as e:
        print 'Stopped the status update loop:', e

if(__name__ == "__main__"):
    if(len(sys.argv) > 1 and sys.argv[1] == "log"):
        LOG = True

    if(LOG): print "Logging enabled"

    updateStreamInfo()

networkmanagers.py

oauth.py

JtvClient.py

Example of the script freezing

foo@bar:/.../honstreams/honstreams# python website/scripts/updateStreamStatus.py

Checking angrytestie... Skipping angrytestie because the status has not changed

Checking chustream... Skipping chustream because the status has not changed

Checking cilantrogamer... Skipping cilantrogamer because the status has not changed

| <- caret sits here blinking infinitely


Interesting update

Every time it freezes and I send a keyboard interrupt, it's on the same line in socket.py:

root@husta:/home/honstreams/honstreams# python website/scripts/updateStreamStatus.py
Checking angrytestie... Skipping angrytestie because the status has not changed
Checking chustream... Skipping chustream because the status has not changed
^CChecking cilantrogamer...
Traceback (most recent call last):
  File "website/scripts/updateStreamStatus.py", line 64, in <module>
    updateStreamInfo()
  File "website/scripts/updateStreamStatus.py", line 31, in updateStreamInfo
    streamOnline = manager.getStreamOnline(stream.name, LOG)
  File "/home/honstreams/honstreams/website/networkmanagers.py", line 47, in getStreamOnline
    return self.getChannelLive(channelName, log)
  File "/home/honstreams/honstreams/website/networkmanagers.py", line 65, in getChannelLive
    response = client.get('/stream/list.json?channel=%s' % channelName)
  File "/home/honstreams/honstreams/website/JtvClient.py", line 51, in get
    return self._send_request(request, token)
  File "/home/honstreams/honstreams/website/JtvClient.py", line 90, in _send_request
    return conn.getresponse()
  File "/usr/lib/python2.6/httplib.py", line 986, in getresponse
    response.begin()
  File "/usr/lib/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.6/httplib.py", line 349, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.6/socket.py", line 397, in readline
    data = recv(1)
KeyboardInterrupt

Any thoughts?


Have you tried using another application to open that connection? Given that it's only an issue in production, perhaps you have some firewall issues there.


Down in JtvClient.py it uses httplib to handle the connection. Have you tried changing this to use httplib2 instead?

Other than that stab in the dark, I would add a lot of logging statements to this code in order to track what actually happens and where it gets stuck. Then I would make sure that the point where it gets stuck can timeout on the socket (which usually involves either monkeypatching or forking the codebase) so that stuff fails instead of hanging.
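
If editing the library is not an option, one way to get a timeout onto every socket is the process-wide default in the standard socket module. This is only a sketch: the 10-second value and the helper name install_default_timeout are assumptions for illustration, not something from the question's code.

```python
import socket

# Assumption: 10 seconds is an arbitrary illustrative value.
DEFAULT_TIMEOUT_SECONDS = 10.0


def install_default_timeout(seconds=DEFAULT_TIMEOUT_SECONDS):
    """Give every socket created from now on a timeout (hypothetical helper)."""
    socket.setdefaulttimeout(seconds)
    return socket.getdefaulttimeout()


# Call this once at startup, before any connection is opened.
install_default_timeout()
```

Sockets that are given an explicit timeout still use their own value; only sockets created without one pick up the default.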

You said:

I know it freezes on the line streamOnline = manager.getStreamOnline(stream.name, LOG). That's the same point where the socket.timeout exception occurs.

Wrong. It doesn't freeze on that line because that line is a function call which calls lots of other functions through several levels of other modules. So you do not yet know where the program freezes. Also, that line is NOT the point where the socket timeout occurs. The socket timeout will only occur on a low level socket operation like select or recv which is being called several times in the chain of activity triggered by getStreamOnline.

You need to trace your code in a debugger or add print statements to track down exactly where the hang occurs. It could possibly be an infinite loop in Python but is more likely to be a low-level call to an OS networking function. Until you find the source of the error, you can't do anything.

P.S. the keyboard interrupt is a reasonable clue that the problem is around line 90 in JtvClient.py, so put in some print statements and find out what happens. There may be a stupid loop in there that keeps calling getresponse, or you may be calling it with bad parameters or maybe the network server really is borked. Narrow it down to fewer possibilities.


It turns out the HTTP connection in JtvClient.py isn't passed a timeout:

def _get_conn(self):
    return httplib.HTTPConnection("%s:%d" % (self.host, self.port))

Changed the last line to

return httplib.HTTPConnection("%s:%d" % (self.host, self.port), timeout=10)

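That solved it. Without a timeout, the recv() call blocks forever when the server accepts the connection but never sends a response.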
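
For anyone who wants to see the difference locally, here is a small sketch that talks to a server which accepts connections but never answers. The helper names start_silent_server and fetch_status and the 1-second timeout are assumptions for the demo; the import fallback is only there so it runs on both Python 2 and Python 3.

```python
import socket

try:
    import httplib  # Python 2, as in the question
except ImportError:
    import http.client as httplib  # Python 3 name for the same module


def start_silent_server():
    """Listen on a free local port and never send a reply (hypothetical helper)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    return server


def fetch_status(host, port, timeout):
    """Return the HTTP status, or the string 'timed out' if no response arrives."""
    conn = httplib.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/")
        return conn.getresponse().status
    except socket.timeout:
        return "timed out"
    finally:
        conn.close()


server = start_silent_server()
host, port = server.getsockname()
result = fetch_status(host, port, timeout=1)
server.close()
print(result)
```

With the timeout argument removed, the same getresponse() call would sit in recv() just like the traceback in the question.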
