开发者

Python data structure for sparse list of x/y co-ordinates

Consider list of x/y co-ordinates and a byte 'count'. x/y will have a range of perhaps 0 to 5000 which is 25 million cells.

However the data will be quite sparsely populated, there will be at most a few thousand entries and the majority of co-ordinates will have zero entries.

The structure will be occasionally looked up/added to (e.g. if there is something in x=5 and y=10 then ++) but more frequently converted into a list of x/y/count (sorting is not important)

The fastest data 开发者_运维百科structure is for lookup is obviously a 2d array, but you're looking at 24 MB of memory or so and the iteration to output a list could be expensive. For disk storage you could implement gif style compression where a 0 byte followed by another byte indicates x empty cells and anything else is a cell value - but this doesn't help the memory situation.

A dictionary of dictionary's would probably be a good balance between lookup/iteration speed and memory usage.

Are there any other suitable data structures I should be considering (either built in to Python, existing libraries or more general data structures?


A dictionary keyed by a point (ie a 2-tuple) sound good to me. It's O(1) like an array, and significantly more compact. As long as you never need to do range queries or the like, it should be fine.

# increment
p = (x, y)
counts[p] = counts.get(p, 0) + 1

# list
for (p, count) in counts.iteritems():
    x, y = p
    print x, y, count


scipy has a range of different sparse arrays

There are seven available sparse matrix types:
csc_matrix: Compressed Sparse Column format
csr_matrix: Compressed Sparse Row format
bsr_matrix: Block Sparse Row format
lil_matrix: List of Lists format
dok_matrix: Dictionary of Keys format
coo_matrix: COOrdinate format (aka IJV, triplet format)
dia_matrix: DIAgonal format


This should be similar to working with sparse matrices of the size of your data range, there's plenty of stuff to chew on here http://en.wikipedia.org/wiki/Sparse_matrix

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜