Python: threading + lock slows my app down considerably
Say I have a function that writes to a file. I also have a function that loops repeatedly, reading from said file. I have both of these functions running in separate threads. (Actually I am reading/writing registers via MDIO, which is why I can't have both threads executing concurrently, only one or the other, but for the sake of simplicity let's just say it's a file.)
Now, when I run the write function in isolation it executes fairly quickly. However, when I run it threaded and have it acquire a lock first, it seems to run extremely slowly. Is this because the second thread (the read function) is polling to acquire the lock? Is there any way to get around this?
I am currently just using a simple RLock, but am open to any change that would increase performance.
Edit: As an example, here is a basic sketch of what's going on. The read thread is basically always running, but occasionally a separate thread will make a call to load. If I benchmark load by running it from the command prompt, running it in a thread is at least 3x slower.
write thread:
import usbmpc  # functions I made which access DLL functions for hardware, etc.

def load(self, lock):
    lock.acquire()
    f = open('file.txt', 'r')
    data = f.readlines()
    for x in data:
        usbmpc.write(x)
    lock.release()
read thread:
import usbmpc

def read(self, lock):
    addr = START_ADDR
    while True:
        lock.acquire()
        data = usbmpc.read(addr)
        lock.release()
        addr += 4
        if addr > BUF_SIZE: addr = START_ADDR
Do you use threading on a multicore machine?
If the answer is yes, then unless your Python version is 3.2+ you are going to suffer reduced performance when running threaded applications.
David Beazley has put considerable effort into figuring out what is going on with the GIL on multicore machines, and he has made it easy for the rest of us to understand too. Check his website and the resources there. You might also want to see his presentation at PyCon 2010; it is rather interesting.
To make a long story short: in Python 3.2, Antoine Pitrou wrote a new GIL that performs the same on single-core and multicore machines. In previous versions, the more cores/threads you have, the greater the performance loss...
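Whichever version you are on, CPython also lets you tune how often the interpreter considers switching threads, which can reduce needless contention in a tight loop like your reader. A minimal sketch (assuming CPython; the values shown are illustrative, not recommendations):

import sys

# Python 3.2+ (new GIL): switch interval in seconds, default 0.005
sys.setswitchinterval(0.01)

# Python 2.x (old GIL): check every N bytecode instructions, default 100
# sys.setcheckinterval(10000)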
hope it helps :)
Why aren't you acquiring the lock in the writer for the duration of each write only? You're currently locking for the entire duration of the load function, so the reader never gets in until load is completely done.
Secondly, you should be acquiring the lock with a context manager (the with statement). Your current code is not exception safe: if a write raises, the lock is never released:
def load(lock):
    for x in data:
        with lock:
            whatever.write(x)
The same goes for your reader: use a context manager to hold the lock.
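For example, a sketch of the reader rewritten that way (reusing the usbmpc, START_ADDR and BUF_SIZE names from the question):

def read(lock):
    addr = START_ADDR
    while True:
        with lock:
            # lock is held only for the single hardware read,
            # and is released even if usbmpc.read() raises
            data = usbmpc.read(addr)
        addr += 4
        if addr > BUF_SIZE:
            addr = START_ADDR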
Thirdly, don't use an RLock. You know you don't need one: at no point does your read/write code need to reacquire a lock it already holds, so don't give it the opportunity; you will only be masking bugs.
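Concretely, wherever the lock is created before the threads are started, that just means:

import threading

lock = threading.Lock()   # a plain, non-reentrant lock
# instead of: lock = threading.RLock()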
The real answer is in several of the comments to your question: the GIL is causing some contention (assuming it isn't actually your misuse of locking). The Python threading module is fantastic, but the GIL sometimes is not, and neither are the complex, widely misunderstood behaviours it produces. It's worth mentioning that, contrary to mainstream belief, throwing threads at a problem is not the panacea people take it to be. It usually isn't the solution.
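You can see that contention for yourself with David Beazley's classic countdown benchmark, sketched minimally here (the loop count is arbitrary). On a pre-3.2 CPython on a multicore machine, the threaded run is typically slower than the sequential one, even though it does the same total work:

import threading
import time

def countdown(n):
    # pure-Python, CPU-bound work: the thread holds the GIL throughout
    while n > 0:
        n -= 1

N = 10000000

start = time.time()
countdown(N)
countdown(N)
print('sequential: %.2fs' % (time.time() - start))

start = time.time()
t1 = threading.Thread(target=countdown, args=(N,))
t2 = threading.Thread(target=countdown, args=(N,))
t1.start(); t2.start()
t1.join(); t2.join()
print('threaded:   %.2fs' % (time.time() - start))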