Find the closest vector
Recently I wrote an algorithm to quantize an RGB image. Every pixel is represented by an (R,G,B) vector, and the quantization codebook is a set of 3-dimensional vectors. Every pixel of the image needs to be mapped to (that is, "replaced by") the codebook vector closest in terms of Euclidean distance (more exactly, squared Euclidean). I did it as follows:
from numpy import sqrt, sum, zeros, argmin

class EuclideanMetric(DistanceMetric):
    def __call__(self, x, y):
        d = x - y
        return sqrt(sum(d * d, -1))

class Quantizer(object):
    def __init__(self, codebook, distanceMetric = EuclideanMetric()):
        self._codebook = codebook
        self._distMetric = distanceMetric

    def quantize(self, imageArray):
        quantizedRaster = zeros(imageArray.shape)
        X = quantizedRaster.shape[0]
        Y = quantizedRaster.shape[1]

        for i in xrange(0, X):
            print i
            for j in xrange(0, Y):
                # distance from this pixel to every codebook vector
                dist = self._distMetric(imageArray[i,j], self._codebook)
                code = argmin(dist)
                quantizedRaster[i,j] = self._codebook[code]

        return quantizedRaster
...and it performs awfully: almost 800 seconds on my Pentium Core Duo 2.2 GHz with 4 GB of memory, for an image of 2600*2700 pixels :(
Is there a way to optimize this somewhat? Maybe a different algorithm or some Python-specific optimizations.
UPD: I tried using the squared Euclidean distance and it still takes an enormous amount of time.
One simple optimization is to drop the sqrt call. sqrt(x) is monotonic in x, and since you don't need the actual distance, just the minimum one, compare the squared distances instead. That should help a bit, since sqrt is expensive.
This trick is used a lot when working with distances. For instance, if you have a distance threshold, you can use threshold^2 and drop the sqrt in the distance calculation. Really, the sqrt is only necessary when absolute distance is needed. For relative distances, drop the sqrt.
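A minimal sketch of such a metric (the class name SquaredEuclideanMetric is just for illustration; it can be passed to the Quantizer from the question wherever EuclideanMetric was used):

import numpy as np

class SquaredEuclideanMetric(object):
    """Same argmin as EuclideanMetric, but without the sqrt."""
    def __call__(self, x, y):
        d = x - y
        return np.sum(d * d, -1)   # squared distances; the ordering is unchanged

# e.g. quantizer = Quantizer(codebook, SquaredEuclideanMetric())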
Update: an algorithmic change is probably needed then. Right now you are comparing every codebook vector to every pixel. It would speed things up to reduce the number of distance calculations.
You might do better using a kd-tree for this, which will reduce the search for each pixel from O(codebook size) to O(log codebook size). I've never done this in Python, but some googling gave an implementation that might work here.
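As a rough sketch of what that could look like with scipy.spatial.cKDTree (the function name quantize_kdtree is just for illustration; imageArray is assumed to be an (H, W, 3) array and codebook a (K, 3) array):

import numpy as np
from scipy.spatial import cKDTree

def quantize_kdtree(imageArray, codebook):
    # Build the tree once over the K codebook vectors.
    tree = cKDTree(codebook)
    # Flatten the image to (H*W, 3) and query all pixels in one call.
    pixels = imageArray.reshape(-1, 3)
    _, codes = tree.query(pixels)          # index of the nearest codebook vector per pixel
    return np.asarray(codebook)[codes].reshape(imageArray.shape)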
You could use the vector quantization function vq from scipy.cluster.vq.
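A minimal sketch (assuming, as in the question, that imageArray is an (H, W, 3) array and codebook a (K, 3) array; quantize_vq is just an illustrative name):

import numpy as np
from scipy.cluster.vq import vq

def quantize_vq(imageArray, codebook):
    # vq wants 2-D float observations: one row per pixel.
    pixels = imageArray.reshape(-1, 3).astype(np.float64)
    book = np.asarray(codebook, dtype=np.float64)
    codes, _ = vq(pixels, book)            # nearest codebook index per pixel
    return book[codes].reshape(imageArray.shape)

This avoids the per-pixel Python loop entirely.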
If X is very large, you're printing i quite a lot, which can really hurt performance. For a less specific answer, read on.
To find out where the bottleneck in your process is, I suggest a timing decorator, something along the lines of
from functools import wraps
import time

def time_this(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        finish = time.time()
        elapsed = (finish - start) * 1000
        print '{0}: {1} ms'.format(func.__name__, elapsed)
        return result
    return wrapper
I found this somewhere once upon a time and have always used it to figure out where my code is slow. You can break your algorithm down into a series of separate functions, then decorate each one with this decorator to see how long it takes to run. Then it's a matter of moving statements between functions to see which ones are responsible for the time. Mainly you're looking for two things: 1) statements that take a long time to execute, or 2) statements that don't necessarily take long on their own, but are executed so many times that even a very small improvement has a large effect on the overall performance.
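A minimal usage sketch (slow_sum is just a stand-in for one step of the real algorithm):

@time_this
def slow_sum(n):
    total = 0
    for i in xrange(n):
        total += i
    return total

slow_sum(10 ** 6)   # prints the elapsed time, e.g. "slow_sum: ... ms"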
Good luck!