multithreading performance issue [duplicate]
Possible Duplicate: Multithreading performance overhead
I have a piece of code that runs slower with multiple threads than with a single thread.
Output from one thread:
Batch 0 finished in 0.0970576110595
Batch 1 finished in 0.712632355587
Batch 2 finished in 2.16707853982
Batch 3 finished in 5.13259954359
Batch 4 finished in 9.54205263734
Total running time is approximately 17 seconds.
Output using multiple threads:
Thread 0 finished in 60.4911733611
Thread 1 finished in 62.5297083217
Thread 2 finished in 65.5614617838
Thread 3 finished in 66.8199233683
Thread 4 finished in 66.8426577103
Total running time is 66 seconds.
What each process does is take 100 lines of text, split them into tokens, remove stopwords and generate some patterns from them using an algorithm. Does anyone have experience with this, or know a way to help me identify what went wrong?
@mac's answer probably covers the why, but we'll need to see some of your code to give any more help. Something you may want to try is to use multiprocessing instead of threading, but then you have data access issues...
Threading in Python (CPython, at least) is only really any good for I/O-bound work, because the global interpreter lock (GIL) means only one thread per Python process can execute Python bytecode at a time.
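To see this effect for yourself, here is a minimal sketch that times the same CPU-bound work sequentially and with threads. The `count_down` loop is only a hypothetical stand-in for your tokenising and pattern generation, and the job count and loop size are arbitrary assumptions.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop (hypothetical stand-in for the real work)."""
    while n > 0:
        n -= 1


def run_sequential(jobs, n):
    """Run the jobs one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(jobs):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(jobs, n):
    """Run one thread per job and return the elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    jobs, n = 5, 1_000_000  # arbitrary sizes, chosen only to keep the demo short
    print("sequential:", run_sequential(jobs, n))
    print("threaded:  ", run_threaded(jobs, n))
```

On a standard CPython build with the GIL, the threaded time is usually no better than the sequential one, and can be worse because the threads contend for the lock. Exact numbers depend on your machine and Python version.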
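Multiprocessing is useful for making use of extra cores, but processes do not share memory, so you have to pass data between them explicitly, and as soon as you share state you start to have to worry about locks and race conditions and everything else.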
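One way to sidestep most of the shared-state problems is to hand each worker process its own batch and only collect the results. A rough sketch, assuming each batch is independent of the others; `process_batch`, `run_batches` and the `STOPWORDS` set are made-up names for illustration, not your actual code:

```python
from concurrent.futures import ProcessPoolExecutor

# Hypothetical stopword list; substitute whatever the real code uses.
STOPWORDS = frozenset({"a", "an", "the", "is", "of", "and", "to", "in"})


def process_batch(lines):
    """Tokenise each line and drop stopwords; returns one token list per line."""
    return [
        [tok for tok in line.lower().split() if tok not in STOPWORDS]
        for line in lines
    ]


def run_batches(batches, workers=None):
    """Process each batch in a separate worker process, keeping input order."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_batch, batches))


if __name__ == "__main__":
    # Five made-up batches of 100 lines each, mirroring the question's setup.
    batches = [
        [f"The pattern of batch {i} is in line {j}" for j in range(100)]
        for i in range(5)
    ]
    results = run_batches(batches)
    print(len(results), "batches processed")
```

Each batch is pickled and sent to a worker, and the result is sent back the same way, so nothing is shared and no locks are needed. The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the main module.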