开发者

Need some assistance with Python threading/queue

import threading
import Queue
import urllib2
import time

class ThreadURL(threading.Thread):

    def __init__(self, queue):
        threading.Thread.__init__(self)

        self.queue = queue

    def run(self):
        while True:
            host = self.queue.get()
            sock = urllib2.urlopen(host)
            data = sock.read()

            self.queue.task_done()

hosts = ['http://www.google.com', 'http://www.yahoo.com', 'http://www.facebook.com', 'http://stackoverflow.com']
start = time.time()

def main():
    queue = Queue.Queue()

    for i in range(len(hosts)):
        t = ThreadURL(queue)
        t.start()

    for host in hosts:
        queue.put(host)

    queue.join()

if __name__ == '__main__':
    main()
    print 'Elapsed time: {0}'.format(time.time() - start)

I've been trying to get my head around how to perform Threading and after a few tutorials, I've come up with the above.

What it's supposed to do is:

  1. I开发者_如何学JAVAnitialiase the queue
  2. Create my Thread pool and then queue up the list of hosts
  3. My ThreadURL class should then begin work once a host is in the queue and read the website data
  4. The program should finish

What I want to know first off is, am I doing this correctly? Is this the best way to handle threads?

Secondly, my program fails to exit. It prints out the Elapsed time line and then hangs there. I have to kill my terminal for it to go away. I'm assuming this is due to my incorrect use of queue.join() ?


Your code looks fine and is quite clean.

The reason your application still "hangs" is that the worker threads are still running, waiting for the main application to put something in the queue, even though your main thread is finished.

The simplest way to fix this is to mark the threads as daemons, by doing t.daemon = True before your call to start. This way, the threads will not block the program stopping.


looks fine. yann is right about the daemon suggestion. that will fix your hang. my only question is why use the queue at all? you're not doing any cross thread communication, so it seems like you could just send the host info as an arg to ThreadURL init() and drop the queue.

nothing wrong with it, just wondering.


One thing, in the thread run function, the while True loop, if some exception happened, the task_done() may not be called however the get() has already been called. Thus the queue.join() may never end.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜