Python Profile Pitfalls
I am a beginner just starting to profile my code and was confused why the elapsed time given by cProfile was so off from the time given by using time.time().
# Python 2.7.2
import cProfile
def f(n):
    G = (i for i in xrange(n))
    sum = 0
    for i in G:
        sum += i
num = 10**6
cProfile.run('f(num)')
This gives 1000004 function calls in 2.648 seconds
Yet with time.time(), I get 0.218000173569 seconds
import time
x = time.time()
f(num)
print time.time() - x
From what I have read, I guess this may be because of the overhead of cProfile. Are there any general tips for when cProfile timing is likely to be very off, or ways to get more accurate timing?
The point of profiling is to find out what parts of your program are taking the most time, and thus need the most attention. If 90% of the time is being used by one function, you should be looking there to see how you can make that function more efficient. It doesn't matter whether the entire run takes 10 seconds or 1000.
Perhaps the most important piece of information the profiler gives you is how many times something is called. This is useful because it helps you find places where you are calling things unnecessarily often, especially if you have nested loops or many functions that call other functions. The profiler helps you track this stuff down.
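As a rough illustration of reading call counts, here is a minimal sketch that profiles a function shaped like the one in the question and sorts the report by number of calls. It assumes Python 3 (range and print() instead of the question's Python 2 syntax), and the names total, gen, stream and report are only illustrative.

```python
import cProfile
import io
import pstats


def f(n):
    # Same shape as the function in the question: a generator feeding a loop.
    gen = (i for i in range(n))
    total = 0
    for i in gen:
        total += i
    return total


profiler = cProfile.Profile()
profiler.enable()
f(10**5)
profiler.disable()

# Sort by call count so the most frequently called entries come first,
# and capture the report in a string instead of printing it directly.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("calls").print_stats(5)
report = stream.getvalue()
print(report)
```

In the report, the generator expression should show up with roughly one call per item, which is where nearly all of the function calls counted in the question come from.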
The profiling overhead is unavoidable, and large. But it is much easier to let the profiler do what it does than to insert your own timings and print statements all over the place.
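To get a feel for how large the overhead is on your own machine, one option is to time the same call with and without the profiler. This is only a sketch, again assuming Python 3; time.perf_counter() is used as the wall clock, and the variable names plain and profiled are made up for the example. The exact numbers will vary from run to run.

```python
import cProfile
import time


def f(n):
    # A generator-driven loop: every item costs one profiled call.
    gen = (i for i in range(n))
    total = 0
    for i in gen:
        total += i
    return total


n = 10**5

# Plain run, timed with a high-resolution wall clock.
start = time.perf_counter()
f(n)
plain = time.perf_counter() - start

# Same call, but with the profiler's hooks active.
profiler = cProfile.Profile()
start = time.perf_counter()
profiler.runcall(f, n)
profiled = time.perf_counter() - start

print("plain:    %.4f s" % plain)
print("profiled: %.4f s" % profiled)
```

Code that makes many tiny calls, like this generator loop, tends to show the biggest gap, because the profiler does its bookkeeping on every call and return.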
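Note that time.time() gives you elapsed (wall-clock) time, which includes whatever else the machine was doing and isn't what you want if you care about how much CPU your code used. As far as I know, cProfile's default timer is also a wall-clock timer, so the gap between 2.648 and 0.218 seconds is mostly per-call profiling overhead rather than a CPU-versus-elapsed difference.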
Maybe you can try the Unix time program:
➜ sandbox /usr/bin/time -p python profiler.py
real 0.17
user 0.14
sys 0.01
The CPU time should be user + sys, which here is 0.14 + 0.01 = 0.15 seconds.
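If you would rather get a similar user + sys figure from inside the script, something like the following may work. It is a sketch that assumes Python 3.3 or later, where time.process_time() reports the sum of user and system CPU time for the current process; it is not available in the Python 2.7 used in the question. The names cpu and wall are just for illustration.

```python
import time


def f(n):
    gen = (i for i in range(n))
    total = 0
    for i in gen:
        total += i
    return total


wall_start = time.perf_counter()   # elapsed (wall-clock) time
cpu_start = time.process_time()    # user + sys CPU time of this process

result = f(10**5)

cpu = time.process_time() - cpu_start
wall = time.perf_counter() - wall_start

print("cpu  (user + sys): %.4f s" % cpu)
print("wall (elapsed):    %.4f s" % wall)
```

Unlike /usr/bin/time, this covers only the timed call, so interpreter startup is not included in either number.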