Process RGBA data efficiently using Python?
I'm trying to process an RGBA buffer (a list of chars) and run "unpremultiply" on each pixel. The algorithm is color_out = color * 255 / alpha.
This is what I came up with:
def rgba_unpremultiply(data):
    for i in range(0, len(data), 4):
        a = ord(data[i+3])
        if a != 0:
            data[i] = chr(255*ord(data[i])/a)
            data[i+1] = chr(255*ord(data[i+1])/a)
            data[i+2] = chr(255*ord(data[i+2])/a)
    return data
It works, but it is a major performance bottleneck.
Besides writing a C module, what are my options for optimizing this particular function?
This is exactly the kind of code NumPy is great for.
import numpy

def rgba_unpremultiply(data):
    a = numpy.frombuffer(data, dtype=numpy.uint8)  # Treat the byte string as an array of bytes
    a = a.astype(numpy.uint32)  # Copy into an array of uints, since temporary values need to be larger than a byte
    alpha = a[3::4]  # Every 4th element starting from index 3
    alpha = numpy.where(alpha == 0, 255, alpha)  # Don't modify colors where alpha is 0
    a[0::4] = a[0::4] * 255 // alpha  # Operates on entire slices of the array instead of looping over each element
    a[1::4] = a[1::4] * 255 // alpha
    a[2::4] = a[2::4] * 255 // alpha
    return a.astype(numpy.uint8).tobytes()  # Cast back to bytes
How big is data? Assuming this is Python 2.x, try using xrange instead of range so that you don't have to allocate a large list of indices up front.
You could convert all the data to integers for working with them so you're not constantly converting to and from characters.
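As a rough illustration of the "work with integers" idea, here is a minimal sketch assuming Python 3, where a bytearray already behaves like a mutable sequence of integers. The min() clamp is my own assumption and is not part of the original algorithm.

```python
def rgba_unpremultiply(data):
    """Return un-premultiplied RGBA bytes; `data` is any bytes-like RGBA buffer."""
    buf = bytearray(data)  # mutable sequence of ints in 0..255, so no ord()/chr() calls
    for i in range(0, len(buf), 4):
        a = buf[i + 3]
        if a != 0:
            # Assumption: clamp to 255 in case a color is larger than its alpha,
            # which would otherwise raise ValueError on assignment.
            buf[i] = min(255, 255 * buf[i] // a)
            buf[i + 1] = min(255, 255 * buf[i + 1] // a)
            buf[i + 2] = min(255, 255 * buf[i + 2] // a)
    return bytes(buf)


# One half-transparent pixel and one fully transparent pixel.
print(list(rgba_unpremultiply(bytes([64, 32, 0, 128, 10, 20, 30, 0]))))
# → [127, 63, 0, 128, 10, 20, 30, 0]
```

This still loops in Python, so it probably only removes the conversion overhead rather than the loop overhead.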
Look into using numpy to vectorize this. I suspect that simply storing the data as integers and using a numpy array will greatly improve the performance.
And another relatively simple thing you could do is write a little Cython:
http://wiki.cython.org/examples/mandelbrot
Basically Cython will compile your above function into C code with just a few lines of type hints. It greatly reduces the barrier to writing a C extension.
I don't have a concrete answer, but some useful pointers might be:
- Python's array module
- numpy
- OpenCV if you have actual image data
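For the first pointer, a sketch of what the array module route might look like, assuming Python 3 and a buffer whose length is a multiple of 4. The function name rgba_unpremultiply_array and the clamp to 255 are my own additions for illustration.

```python
from array import array


def rgba_unpremultiply_array(data):
    """Un-premultiply RGBA bytes using only the standard-library array module."""
    px = array('B', data)   # 'B' = unsigned char, one item per byte
    alpha = px[3::4]        # copy of every 4th item, starting at index 3
    for c in range(3):      # R, G and B channels
        channel = px[c::4]
        px[c::4] = array('B', (
            # Leave the color alone when alpha is 0; clamp is an assumption.
            min(255, 255 * v // a) if a else v
            for v, a in zip(channel, alpha)
        ))
    return px.tobytes()


print(list(rgba_unpremultiply_array(bytes([64, 32, 0, 128, 10, 20, 30, 0]))))
# → [127, 63, 0, 128, 10, 20, 30, 0]
```

The per-item arithmetic still runs in Python, so I would not expect this to approach numpy, but it avoids the ord()/chr() round trips without any third-party dependency.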
There are some minor things you can do, but I do not think they will improve performance much.
Anyway, here are some hints:
def rgba_unpremultiply(data):
    # xrange() is more performant than range(): it does not build the whole list up front
    for i in xrange(0, len(data), 4):
        a = ord(data[i+3])
        if a != 0:
            # Not sure about this, but maybe (c << 8) - c is faster than c*255
            # So maybe you can arrange this to do that
            # Check for performance improvement
            data[i] = chr(((ord(data[i]) << 8) - ord(data[i]))/a)
            data[i+1] = chr(255*ord(data[i+1])/a)
            data[i+2] = chr(255*ord(data[i+2])/a)
    return data
I've just run a quick benchmark of << vs *, and there seems to be no noticeable difference, but you can probably do a better evaluation in your own project.
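If you want to repeat that kind of comparison yourself, something along these lines should do, assuming Python 3 and the standard timeit module. The helper names mul and shift and the sample values 200 and 128 are arbitrary choices of mine.

```python
import timeit


def mul(c, a):
    return c * 255 // a


def shift(c, a):
    return ((c << 8) - c) // a


# Both forms must agree for every byte value and every non-zero alpha.
assert all(mul(c, a) == shift(c, a) for c in range(256) for a in range(1, 256))

# Variables are defined in `setup` so the expressions are not constant-folded.
t_mul = timeit.timeit("c * 255 // a", setup="c, a = 200, 128", number=200_000)
t_shift = timeit.timeit("((c << 8) - c) // a", setup="c, a = 200, 128", number=200_000)
print(f"multiply: {t_mul:.3f}s  shift: {t_shift:.3f}s")
```

The absolute numbers depend on the machine and interpreter, so only the relative difference is worth looking at.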
Anyway, a C module may be a good option, even if the problem does not seem to be "language related".