Optimize Game-of-Life iteration over 80x60 RGB pixel array
Okay, so I've got a piece of Python code which really needs optimizing.
- It's a Game-of-Life iteration over a small (80x60-pixel) image and extracts the RGB values from it.
- currently using nested for-loops; I'd rather swap out those for loops for the faster
map()
c function, but if I do that I can't figure out how I can get the x,y values, nor the local values defined out of the scope of the functions I'd need to define. - would using
map()
be any faster than this current set of for loops? How could I use it and still get x,y? - I currently use pygame Surfaces, and I've tried the
surfarray/pixelarray
modules, but since I'm changing/getting every pixel, it's a lot slower thanSurface.get_at()/set_at()
. - Also, slightly irrelevant... do you think this could be made quicker if Python wasn't traversing a list of numbers but just incrementing a number, like in other languages? Why doesn't python include a normal for() as well as their foreach()?
- The amount of conditionals there probably makes things slower too, right? The slowest part is checking for neighbours (where it builds the list n)... I replaced that whole bit with slice access on a 2D array but it doesn't work properly.
Redacted version of code:
xr = xrange(80)
yr = xrange(60)
# surface is an instance of pygame.Surface
get_at = surface.get_at()
set_at = surface.set_at()
for x in xr:
# ....
for y in yr:
# ...
pixelR = get_at((x,y))[0]
pixelG = get_at((x,y))[1]
pixelB = get_at((x,y))[2]
# ... more complex stuff here which changes R,G,B values independently of each other
set_at((x,y),(pixelR,pixelG,pixelB))
Full version of the function:
# xr, yr = xrange(80), xrange(60)
def live(surface,xr,yr):
randint = random.randint
set_at = surface.set_at
get_at = surface.get_at
perfect = perfectNeighbours #
minN = minNeighbours # All global variables that're defined in a config file.
maxN = maxNeighbours #
pos = actual # actual = (80,60)
n = []
append = n.append
NEIGHBOURS = 0
for y in yr: # going height-first for aesthetic reasons.
decay = randint(1,maxDecay)
growth = randint(1,maxGrowth)
for x in xr:
r, g, b, a = get_at((x,y))
del n[:]
NEIGHBOURS = 0
if x>0 and y>0 and x<pos[0]-1 and y<pos[1]-1:
append(get_at((x-1,y-1))[1])
append(get_at((x+1,y-1))[1])
append(get_at((x,y-1))[1])
append(get_at((x-1,y))[1])
append(get_at((x+1,y))[1])
append(get_at((x-1,y+1))[1])
append(get_at((x+1,y+1))[1])
append(get_at((x,y+1))[1])
for a in n:
if a > 63:
NEIGHBOURS += 1
if NEIGHBOURS == 0 and (r,g,b) == (0,0,0): pass
else:
if NEIGHBOURS < minN or NEIGHBOURS > maxN:
g = 0
b = 0
elif NEIGHBOURS==perfect:
g += growth
if g > 255:
g = 255
b += growth
if b > growth: b = growth
else:
if g > 10: r = g-10
if g > 200: b = g-100
if r > growth: g = r
g -= decay
if g < 0:
开发者_运维百科 g = 0
b = 0
r -= 1
if r < 0:
r = 0
set_at((x,y),(r,g,b))
What's making your code slow is probably not the loops, they are incredibly fast.
What slows done your code are the number of function calls. For example
pixelR = get_at((x,y))[0]
pixelG = get_at((x,y))[1]
pixelB = get_at((x,y))[2]
is a lot slower than (about 3 times I guess)
r, g, b, a = get_at((x,y))
Every get_at
, set_at
call locks the surface, therefore it's faster to directly access the pixels using the available methods. The one that seems most reasonable is Surface.get_buffer
.
Using map
doesn't work in your example, because you need the indexes. With as few as 80 and 60 numbers it might even be faster to use range()
instead of xrange()
.
map(do_stuff, ((x, y) for x in xrange(80) for y in xrange(60)))
where do_stuff
would presumably be defined like so:
def do_stuff(coords):
r, g, b, a = get_at(coords)
# ... whatever you need to do with those ...
set_at(coords, (r, g, b))
You could alternatively use a list comprehension instead of a generator expression as the second argument to map
(replace ((x, y) ...)
with [(x, y) ...]
) and use range
instead of xrange
. I'd say that it's not very likely to have a significant effect on performance, though.
Edit: Note that gs is certainly right about the for
loops not being the main thing in need of optimisation in your code... Cutting down on superfluous calls to get_at
is more important. In fact, I'm not sure if replacing the loops with map
will actually improve performance here at all... Having said that, I find the map
version more readable (perhaps because of my FP background...), so here you go anyway. ;-)
Since you are reading and rewriting every pixel, I think you can get the best speed improvement by not using a Surface
.
I suggest first taking your 80x60 image and converting it to a plain bitmap file with 32-bit pixels. Then read the pixel data into a python array
object. Now you can walk over the array
object, reading values, calculating new values, and poking the new values into place with maximum speed. When done, save your new bitmap image, and then convert it to a Surface
.
You could also use 24-bit pixels, but that should be slower. 32-bit pixels means one pixel is one 32-bit integer value, which makes the array of pixels much easier to index. 24-bit packed pixels means each pixel is 3 bytes, which is much more annoying to index into.
I believe you will gain much more speed out of this approach than by trying to avoid the use of for
. If you try this, please post something here to let us know how well it worked or didn't. Good luck.
EDIT: I thought that an array
has only a single index. I'm not sure how you managed to get two indexes to work. I was expecting you to do something like this:
def __i(x, y):
assert(0 <= x < 80)
assert(0 <= y < 60)
i = (y*80 + x) * 4
return i
def red(x, y):
return __a[__i(x, y)]
def green(x, y):
return __a[__i(x, y) + 1]
def blue(x, y):
return __a[__i(x, y) + 2]
def rgb(x, y):
i = __i(x, y)
return __a[i], __a[i + 1], __a[i + 2]
def set_rgb(x, y, r, g, b):
i = __i(x, y)
_a[i] = r
_a[i + 1] = g
_a[i + 2] = b
# example:
r, g, b = rgb(23, 33)
Since a Python array
can only hold a single type, you will want to set the type to "unsigned byte" and then index like I showed.
Where of course __a
is the actual array
variable.
If none of this is helpful, try converting your bitmap into a list, or perhaps three lists. You can use nested lists to get 2D addressing.
I hope this helps. If it is not helpful, then I am not understanding what you are doing; if you explain more I'll try to improve the answer.
精彩评论