How can I speed up array generations in python?
I'm thinking I need to use numpy or some other library to fill these arrays fast enough, but I don't know much about it. Right now this operation takes about 1 second on a quad-core Intel PC, but I need it to be as fast as possible. Any help is greatly appreciated. Thanks!
import cv

class TestClass:
    def __init__(self):
        w = 960
        h = 540
        self.offx = cv.CreateMat(h, w, cv.CV_32FC1)
        self.offy = cv.CreateMat(h, w, cv.CV_32FC1)
        for y in range(h):
            for x in range(w):
                self.offx[y,x] = x
                self.offy[y,x] = y
My eight-year-old (slow) computer is able to create a list of lists the same size as your matrix in 127 milliseconds.
C:\Documents and Settings\gdk\Desktop>python -m timeit "[[x for x in range(960)] for y in range(540)]"
10 loops, best of 3: 127 msec per loop
I don't know what the cv module is and how it creates matrices. But maybe this is the cause of the slow code.
Numpy may be faster. Creating an array of ones (Python int):
C:\Documents and Settings\gdk\Desktop>python -m timeit -s "from numpy import ones" "ones((960, 540), int)"
100 loops, best of 3: 6.54 msec per loop
You can use the timeit module to compare the timings for creating matrices with different modules and see whether there is a benefit to changing.
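The same comparison can also be run from a script instead of the command line. This is only a sketch in Python 3: the function names `nested_lists` and `numpy_ones` are made up for illustration, and the sizes are the ones from the question. The numbers it prints depend on the machine.

```python
import timeit

import numpy as np

W, H = 960, 540  # sizes taken from the question


def nested_lists():
    # Pure-Python list of lists, as in the first timing above
    return [[x for x in range(W)] for y in range(H)]


def numpy_ones():
    # NumPy allocation, as in the second timing above
    return np.ones((W, H), int)


# Run each candidate 10 times per repeat and report the best repeat
t_lists = min(timeit.repeat(nested_lists, number=10, repeat=3)) / 10
t_numpy = min(timeit.repeat(numpy_ones, number=10, repeat=3)) / 10

print("nested lists: %.2f ms per call" % (t_lists * 1e3))
print("numpy.ones:   %.2f ms per call" % (t_numpy * 1e3))
```

Taking the minimum over the repeats mirrors the "best of 3" figure that `python -m timeit` reports.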
You're generating a half million integers and creating over a million references while you're at it. I'd just be happy it only takes 1 second.
If you're doing this a lot, you should think about ways to cache the results.
Also, being on a quad-core anything doesn't help in a case like this. You're performing a serial operation that can only execute on one core at a time, and even if you threaded it, CPython can only execute one pure-Python thread at a time because of the Global Interpreter Lock.
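One possible way to cache the result, sketched in Python 3 with NumPy rather than the `cv` matrices from the question. The helper name `offset_grids` is hypothetical, and `functools.lru_cache` is just one option; a plain module-level variable would work as well.

```python
from functools import lru_cache

import numpy as np


@lru_cache(maxsize=None)
def offset_grids(w, h):
    """Hypothetical helper: build the x/y offset grids once per (w, h)."""
    offx, offy = np.meshgrid(np.arange(w, dtype=np.float32),
                             np.arange(h, dtype=np.float32))
    # Mark the arrays read-only so callers cannot modify the shared cached copy
    offx.flags.writeable = False
    offy.flags.writeable = False
    return offx, offy


first = offset_grids(960, 540)    # computed on the first call
second = offset_grids(960, 540)   # returned from the cache
print(first[0] is second[0])
```

Because every caller receives the same array objects, anything that needs to modify the grids should work on a copy.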
The NumPy equivalent of what you did with OpenCV in Python is:
import numpy as np
offsetx, offsety = np.meshgrid(range(960),range(540))
If you are using Python, learning the different functions of numpy will help you tremendously. OpenCV functions can work directly with numpy arrays as well, and numpy's syntax in Python is much nicer than OpenCV's.
Here are the timings of the two versions on my i7:
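As a quick check that the vectorised call fills the arrays the same way as the double loop, here is a small sketch in Python 3. It assumes plain NumPy arrays stand in for the `cv` matrices, and it passes `float32` inputs so the result matches the `CV_32FC1` type used in the question (with `range` inputs, `meshgrid` returns integer arrays).

```python
import numpy as np

w, h = 960, 540

# Vectorised version; float32 inputs give float32 grids of shape (h, w)
offx, offy = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))

# Reference: the double loop from the question, written against NumPy arrays
ref_x = np.empty((h, w), np.float32)
ref_y = np.empty((h, w), np.float32)
for y in range(h):
    for x in range(w):
        ref_x[y, x] = x
        ref_y[y, x] = y

print(offx.shape, offx.dtype)
print(np.array_equal(offx, ref_x) and np.array_equal(offy, ref_y))
```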
time python test.py
real 0m0.654s
user 0m0.640s
sys 0m0.010s
My version:
time python test2.py
real 0m0.075s
user 0m0.060s
sys 0m0.020s
If you are creating the same matrix over and over, it may be faster to initialise it using cv.SetData()
Well, you can at least use xrange instead of range (in Python 2). range creates an entire list of all those numbers, whereas xrange generates them one by one. Since you're only using them one at a time, you don't need a list of them.
I didn't fully understand what you were trying to achieve, but here are two concrete examples and benchmarks which might help you. They both do the same thing: fill a 960x540 image (array) with red.
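For readers on Python 3: `xrange` no longer exists there, and `range` itself is lazy in the way described above. The sketch below only illustrates the difference in memory between a lazy range and a materialised list; the variable names are made up for the example.

```python
import sys

n = 960 * 540

lazy = range(n)          # Python 3: lazy, like Python 2's xrange
eager = list(range(n))   # materialised list, like Python 2's range

print(sys.getsizeof(lazy), "bytes for the range object")
print(sys.getsizeof(eager), "bytes for the list (not counting the ints it holds)")
```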
slow.py uses for loops to fill array
import cv2
import numpy as np
width, height = 960, 540
image = np.zeros((height, width, 3), np.uint8)
# Fill array with red
for y in range(height):
    for x in range(width):
        image[y, x] = (0, 0, 255)
cv2.imwrite('red.jpg', image)
Running time
$ time python slow.py
real 0m2.240s
user 0m2.172s
sys 0m0.040s
fast.py uses numpy to fill array
import cv2
import numpy as np
width, height = 960, 540
image = np.zeros((height, width, 3), np.uint8)
# Fill array with red
image[:] = (0, 0, 255)
cv2.imwrite('red.jpg', image)
Running time
$ time python fast.py
real 0m0.134s
user 0m0.084s
sys 0m0.024s
Using numpy instead of for loops is almost 17x faster here (2.240 s vs 0.134 s).