What is the fastest way in Python to build a C array from a list of tuples of floats?
The context: my Python code passes arrays of 2D vertices to OpenGL.
I tested two approaches, one with ctypes and one with struct, the latter being more than twice as fast.
from random import random
points = [(random(), random()) for _ in xrange(1000)]
from ctypes import c_float
def array_ctypes(points):
    n = len(points)
    return n, (c_float*(2*n))(*[u for point in points for u in point])
from struct import pack
def array_struct(points):
    n = len(points)
    return n, pack("f"*2*n, *[u for point in points for u in point])
Any other alternative? Any hint on how to accelerate such code (and yes, this is one bottleneck of my code)?
You can pass numpy arrays to PyOpenGL without incurring any overhead. (The data attribute of the numpy array is a buffer that points to the underlying C data structure, containing the same information as the array you're building.)
import numpy as np
def array_numpy(points):
    n = len(points)
    return n, np.array(points, dtype=np.float32)
On my computer, this is about 40% faster than the struct-based approach.
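For instance, the returned array can be handed straight to the client-side vertex-array API (a minimal sketch, not part of the original answer; it assumes PyOpenGL and an active GL context):

from OpenGL.GL import (glEnableClientState, glVertexPointer, glDrawArrays,
                       GL_VERTEX_ARRAY, GL_FLOAT, GL_POINTS)

n, arr = array_numpy(points)
glEnableClientState(GL_VERTEX_ARRAY)
glVertexPointer(2, GL_FLOAT, 0, arr)   # PyOpenGL reads the numpy buffer directly
glDrawArrays(GL_POINTS, 0, n)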
You could try Cython. For me, this gives:
function       usec per loop
               Python   Cython
array_ctypes     1370     1220
array_struct      384      249
array_numpy       336      339
So Numpy only gives a 15% benefit on my hardware (an old laptop running Windows XP), whereas Cython gives about 35% (without adding any extra dependency to your distributed code).
If you can loosen your requirement that each point is a tuple of floats, and simply make 'points' a flattened list of floats:
def array_struct_flat(points):
    n = len(points)
    return pack(
        "f"*n,
        *[
            coord
            for coord in points
        ]
    )
points = [random() for _ in xrange(1000 * 2)]
then the resulting output is the same, but the timing goes down further:
function            usec per loop
                    Python   Cython
array_struct_flat               157
Cython might be capable of substantially better than this too, if someone smarter than me wanted to add static type declarations to the code. (Running 'cython -a test.pyx' is invaluable for this: it produces an HTML file showing where slow plain Python remains in your code (yellow) versus Python that has been converted to pure C (white). That's why I spread the code above across so many lines: the coloring is done per line, so it helps to spread it out like that.)
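For the curious, static typing might look something like this (a hypothetical, untested sketch in a .pyx file; array_struct_flat_typed is a name I made up, and it assumes the flattened-list input):

from array import array

def array_struct_flat_typed(list points):
    # points is a flattened list of n floats (n = 2 * number of vertices)
    cdef Py_ssize_t n = len(points)
    cdef Py_ssize_t i
    out = array('f', [0.0]) * n      # preallocate n single-precision floats
    cdef float[::1] view = out       # typed memoryview over the array's buffer
    for i in range(n):
        view[i] = points[i]
    return out.tostring()            # same bytes as the pack() result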
Full Cython instructions are here: http://docs.cython.org/src/quickstart/build.html
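For example, a minimal setup.py along those lines (assuming the code lives in test.pyx, as in the 'cython -a' example above):

from distutils.core import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("test.pyx"))

Build it in place with 'python setup.py build_ext --inplace'.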
Cython might produce similar performance benefits across your whole codebase, and in ideal conditions, with proper static typing applied, can improve speed by factors of ten or a hundred.
There's another idea I stumbled across. I don't have time to profile it right now, but in case someone else does:
# untested, but I'm fairly confident it runs
# using 'flattened points' list, i.e. a list of n*2 floats
points = [random() for _ in xrange(1000 * 2)]
c_array = (c_float * len(points))()   # create the array, as yet unpopulated
c_array[:] = points                   # populate it via slice assignment
That is, first we create the ctypes array but don't populate it. Then we populate it using the slice notation. People smarter than I tell me that assigning to a slice like this may help performance. It allows us to pass a list or iterable directly on the RHS of the assignment, without having to use the *iterable syntax, which would perform some intermediate wrangling of the iterable. I suspect that this is what happens in the depths of creating pyglet's Batches.
Presumably you could just create c_array once, then just reassign to it (the final line in the above code) every time the points list changes.
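In other words, something like this (a hypothetical sketch; update_points is my name for it):

from ctypes import c_float

n = 1000
c_array = (c_float * (2 * n))()    # allocate once, up front

def update_points(flat_points):
    # flat_points must be a flattened list of exactly 2*n floats
    c_array[:] = flat_points       # refill in place via slice assignment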
There is probably an alternative formulation which accepts the original definition of points (a list of (x, y) tuples). Something like this:
# very untested, likely contains errors
# using a list of n tuples of two floats
from itertools import chain
points = [(random(), random()) for _ in xrange(1000)]
c_array = (c_float * (2 * len(points)))()
c_array[:] = list(chain.from_iterable(points))   # slice assignment needs a sized sequence
If performance is an issue, you do not want to use ctypes arrays with the star operation (e.g., (ctypes.c_float * size)(*t)).
In my test, pack is fastest, followed by the use of the array module with a cast of the address (or using the from_buffer function).
import timeit

repeat = 100
setup = ("from struct import pack; from random import random; import numpy; "
         "from array import array; import ctypes; "
         "t = [random() for _ in range(2 * 1000)];")

# array + cast of the buffer address
print(timeit.timeit(stmt="v = array('f', t); addr, count = v.buffer_info(); x = ctypes.cast(addr, ctypes.POINTER(ctypes.c_float))", setup=setup, number=repeat))
# array + from_buffer
print(timeit.timeit(stmt="v = array('f', t); a = (ctypes.c_float * len(v)).from_buffer(v)", setup=setup, number=repeat))
# ctypes with the star operation
print(timeit.timeit(stmt="x = (ctypes.c_float * len(t))(*t)", setup=setup, number=repeat))
# struct.pack
print(timeit.timeit(stmt="x = pack('f' * len(t), *t)", setup=setup, number=repeat))
# ctypes with slice assignment (Jonathan Hartley's approach)
print(timeit.timeit(stmt="x = (ctypes.c_float * len(t))(); x[:] = t", setup=setup, number=repeat))
# numpy
print(timeit.timeit(stmt="x = numpy.array(t, numpy.float32).data", setup=setup, number=repeat))
The array.array approach is slightly faster than Jonathan Hartley's slice-assignment approach in my test, while the numpy approach runs at about half the speed:
python3 convert.py
0.004665990360081196
0.004661010578274727
0.026358536444604397
0.0028003649786114693
0.005843495950102806
0.009067213162779808
The net winner is pack.
You can use array (notice also the generator expression instead of the list comprehension):
from array import array
array("f", (u for point in points for u in point)).tostring()   # renamed .tobytes() in Python 3
Another optimization would be to keep the points flattened from the beginning.
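For instance (a small sketch; array_flat is a made-up name):

from array import array
from random import random

points_flat = [random() for _ in xrange(1000 * 2)]   # flattened from the start

def array_flat(points_flat):
    n = len(points_flat) // 2
    return n, array("f", points_flat).tostring()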