How to improve the performance?
I have prepared a project, a software application. It is complete and works fine, except that it runs very slowly. I have taken several chunks of the code and optimized them.
I tried psyco: I installed it and added two lines at the top of my code:
import psyco
psyco.full()
I don't know whether this is the right way to use psyco. If it is wrong, please tell me how to use it, because I added these lines and saw no improvement.
I have profiled the code and I know which lines take the time, but they can't be optimized further and are unavoidable.
I also thought of rewriting the code in C using some Python package, but I have always had a very bad experience with additional Python packages that are not part of the standard distribution.
I am using Python 2.6 on Windows Vista. Please tell me some methods for increasing the speed of execution of the whole code significantly, at least 5x.
I haven't organized my code into functions, though there are a few functions in between. There is no main.
Yes, as a few people suggested, mine is an IO-bound problem: I need to call the code some 500 times, and each call involves opening and closing at least 2 files.
When I open a .pm file, it has two columns and I need only the first one, so I copy the entire first column into a list and pass it to a function to get a row number, then open another file to get the elements of that row into a list.
That is the task I need to do. I guess loading the elements of the first column into the list is the time-consuming part. Any idea how to fix this?
How can I improve the performance for IO-bound bottlenecks?
Looking for help desperately
You could get a lot better performance if you could switch to binary file formats. Most of your code is doing parsing and string manipulation. You're doing a lot of converting strings to floats, which is slower than you think.
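As a rough sketch of what switching to a binary format could look like: the text file is parsed once, the first column is saved as raw doubles with the standard array module, and later runs load it back without any string-to-float conversion. The function names and the sample file contents are made up for illustration, and the raw format depends on the machine's byte order, so it is only meant for files written and read on the same platform.

```python
import os
import tempfile
from array import array

def text_column_to_binary(text_path, binary_path):
    """Parse the first column of a text file once and store it as raw doubles."""
    values = array("d")
    with open(text_path) as source:
        for line in source:
            fields = line.split()
            if fields:  # skip blank lines
                values.append(float(fields[0]))
    with open(binary_path, "wb") as target:
        values.tofile(target)
    return len(values)

def load_binary_column(binary_path):
    """Load the doubles back without any string parsing."""
    values = array("d")
    count = os.path.getsize(binary_path) // values.itemsize
    with open(binary_path, "rb") as source:
        values.fromfile(source, count)
    return values

# Small demonstration with a made-up two-column file.
work_dir = tempfile.mkdtemp()
text_path = os.path.join(work_dir, "sample.pm")
binary_path = os.path.join(work_dir, "sample.bin")
with open(text_path, "w") as handle:
    handle.write("0.10 a\n0.25 b\n0.40 c\n")

count = text_column_to_binary(text_path, binary_path)
loaded = load_binary_column(binary_path)
```

The conversion cost is paid once; each of the 500 calls would then only pay for load_binary_column.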
You are unlikely to see a 5x performance difference by just tweaking the code around.
First you should look at improving your algorithm: are you using the best data structures for the job? Perhaps using a dict or a set in the right place can speed your code up a lot.
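For the lookup described in the question (find the row number of a value in the first column), a dict might be the right structure. The sketch below is only an illustration: index_first_column is a hypothetical helper, and it assumes the value you search for is written exactly the same way as in the file, because the keys are the raw strings.

```python
def index_first_column(lines):
    """Map each first-column value to the row number where it first appears."""
    index = {}
    for row_number, line in enumerate(lines):
        fields = line.split()
        if fields and fields[0] not in index:
            index[fields[0]] = row_number
    return index

# Made-up sample rows standing in for the lines of a .pm file.
sample = ["0.10 a", "0.25 b", "0.40 c"]
index = index_first_column(sample)

# A dict lookup takes roughly constant time, while list.index scans the list.
row = index["0.25"]
```

Building the dict costs one pass over the file, so it pays off when the same file is searched more than once.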
Writing a C module is not all that hard, and is another option if you can find no way to improve the Python code. Usually you would expect more than a 5x speed up by using C code.
Maybe your problem is IO bound. Then you need to look at ways to improve the performance of the IO.
If you want more help here, you'll probably have to show some code or at least describe what your program does.
UPDATE: Looks like you are opening and closing lots of files, which tends to be painfully slow on Windows.
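If the same files are opened again on each of the 500 calls, one possible way around it is to keep their contents in memory after the first read. This is a minimal sketch under that assumption; read_lines_cached and the demo file are invented for illustration, and it only helps if the files do not change between calls and fit in memory.

```python
import os
import tempfile

_lines_cache = {}  # path -> list of lines, filled on first use

def read_lines_cached(path):
    """Read a file from disk the first time, and from memory afterwards."""
    if path not in _lines_cache:
        with open(path) as handle:
            _lines_cache[path] = handle.read().splitlines()
    return _lines_cache[path]

# Demonstration with a made-up file.
descriptor, demo_path = tempfile.mkstemp(suffix=".pm")
os.close(descriptor)
with open(demo_path, "w") as out:
    out.write("0.10 a\n0.25 b\n")

first = read_lines_cached(demo_path)   # reads from disk
os.remove(demo_path)
second = read_lines_cached(demo_path)  # served from memory, file is not reopened
```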
Using psyco is as simple as importing it and calling psyco.full(), so your psyco usage is right.
If you are trying to build a Python module using C/C++, have a look at boost::python.
You should really post your code for further analysis.
To optimize your code for speed you simply have to profile it and see where the problem is. Guessing does not help. Once you know where, the most bang for your buck usually comes from the following, in descending order: improving the algorithm, using more appropriate data structures, removing resource bottlenecks (IO, memory, CPU), reducing memory allocation, and reducing context switching (processes and subroutines).
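In case it helps, profiling can be done with the standard cProfile and pstats modules. The snippet below is only a sketch: parse_lines and its input are stand-ins for whatever part of your program you suspect, and the import fallback is there so the same code runs on both Python 2 and Python 3.

```python
import cProfile
import pstats

try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3

def parse_lines(lines):
    """Stand-in workload: convert the first column of each line to float."""
    return [float(line.split()[0]) for line in lines]

sample_lines = ["%d.5 x" % i for i in range(10000)]

profiler = cProfile.Profile()
parsed = profiler.runcall(parse_lines, sample_lines)

# Collect the report in a string, sorted by cumulative time per function.
report_stream = StringIO()
stats = pstats.Stats(profiler, stream=report_stream)
stats.sort_stats("cumulative").print_stats(10)
report = report_stream.getvalue()
print(report)
```

The functions at the top of the report are the ones worth looking at first.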
Here's one opportunity for optimization: you're calling get_list twice, with very similar arguments:
join_cost_index_end[index] = get_list(file, float(abs1), fout)
join_cost_index_strt[index] = get_list(file, float(abs2), fout)
That means that most of the work in get_list is being done twice for no good reason. Rewrite it so that get_list is being called once, and have it return both index_end and index_strt at the same time.
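The body of get_list is not shown here, so the following is only a guess at the shape such a rewrite could take: a hypothetical find_rows helper that walks the first column once and returns the row numbers for both values together. The column data is made up.

```python
def find_rows(first_column, targets):
    """Scan the column once and return the row number of each target (None if absent)."""
    wanted = set(targets)
    found = {}
    for row_number, value in enumerate(first_column):
        if value in wanted and value not in found:
            found[value] = row_number
            if len(found) == len(wanted):
                break  # both values located, no need to read further
    return [found.get(target) for target in targets]

# Made-up first column; in the real program this would come from the file.
column = [0.1, 0.25, 0.4, 0.55]
index_end, index_strt = find_rows(column, [0.4, 0.25])
```

Both results come out of a single pass, instead of one full pass per call.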
Why not just try using Cython? You should get much better performance without changing any of the code. With a little bit of modification it should help even more.