Python data structure for sparse list of x/y co-ordinates
Consider a list of x/y co-ordinates and a byte 'count'. x and y will each have a range of perhaps 0 to 5000, which gives 25 million cells.
However the data will be quite sparsely populated, there will be at most a few thousand entries and the majority of co-ordinates will have zero entries.
The structure will occasionally be looked up or added to (e.g. if there is something at x=5 and y=10 then increment it), but more frequently it will be converted into a list of x/y/count triples (sorting is not important).
The fastest data structure for lookup is obviously a 2d array, but you're looking at 24 MB of memory or so, and iterating over it to output a list could be expensive. For disk storage you could implement GIF-style run-length compression, where a 0 byte followed by another byte indicates that many empty cells and anything else is a cell value - but this doesn't help the memory situation.
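The run-length idea above can be sketched as follows. This is a minimal illustration, not a full GIF codec: it assumes a row of byte-sized cell values, flattened one row at a time, with zero runs capped at 255 so the count fits in a byte.

```python
def rle_encode(row):
    """Encode a row of cell values: a 0 byte followed by a count byte
    stands for that many empty cells; non-zero values pass through."""
    out = []
    i = 0
    while i < len(row):
        if row[i] == 0:
            run = 0
            while i < len(row) and row[i] == 0 and run < 255:
                run += 1
                i += 1
            out.extend([0, run])
        else:
            out.append(row[i])
            i += 1
    return out

def rle_decode(data):
    """Expand the encoding back into the original row of cell values."""
    out = []
    i = 0
    while i < len(data):
        if data[i] == 0:
            out.extend([0] * data[i + 1])
            i += 2
        else:
            out.append(data[i])
            i += 1
    return out
```

As the question notes, this compresses well on disk for sparse rows but does nothing for in-memory lookup cost.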
A dictionary of dictionaries would probably be a good balance between lookup/iteration speed and memory usage.
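A rough sketch of that dict-of-dicts approach, assuming an outer dict keyed by x and an inner dict keyed by y (the helper names here are just for illustration):

```python
from collections import defaultdict

# Outer dict keyed by x; each value is an inner dict mapping y -> count.
grid = defaultdict(dict)

def increment(x, y):
    # Only populated cells consume memory.
    grid[x][y] = grid[x].get(y, 0) + 1

def to_triples():
    # Flatten into the x/y/count list the question asks for.
    return [(x, y, count)
            for x, row in grid.items()
            for y, count in row.items()]

increment(5, 10)
increment(5, 10)
```

Memory scales with the number of populated cells plus one inner dict per distinct x, rather than with the full 5000x5000 grid.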
Are there any other suitable data structures I should be considering (either built in to Python, existing libraries, or more general data structures)?
A dictionary keyed by a point (i.e. a 2-tuple) sounds good to me. It's O(1) like an array, and significantly more compact. As long as you never need to do range queries or the like, it should be fine.
# increment
p = (x, y)
counts[p] = counts.get(p, 0) + 1

# list
for (x, y), count in counts.items():
    print(x, y, count)
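The same pattern can be written a little more compactly with collections.Counter, which returns 0 for missing keys, so the increment needs no .get():

```python
from collections import Counter

# Counter is a dict subclass; missing keys count as zero.
counts = Counter()
counts[(5, 10)] += 1
counts[(5, 10)] += 1
counts[(7, 3)] += 1

# Same iteration as with a plain dict.
for (x, y), count in counts.items():
    print(x, y, count)
```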
SciPy has a range of different sparse arrays.
There are seven available sparse matrix types:
csc_matrix: Compressed Sparse Column format
csr_matrix: Compressed Sparse Row format
bsr_matrix: Block Sparse Row format
lil_matrix: List of Lists format
dok_matrix: Dictionary of Keys format
coo_matrix: COOrdinate format (aka IJV, triplet format)
dia_matrix: DIAgonal format
Working with these should be similar to working with sparse matrices at the size of your data range; there's plenty of stuff to chew on here: http://en.wikipedia.org/wiki/Sparse_matrix
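Of the formats listed, dok_matrix (Dictionary of Keys) is probably the closest fit for this access pattern, since it supports efficient incremental updates. A small sketch, assuming scipy is installed:

```python
from scipy.sparse import dok_matrix

# Memory scales with the number of non-zero entries,
# not the full 5001 x 5001 grid.
grid = dok_matrix((5001, 5001), dtype=int)

grid[5, 10] += 1
grid[5, 10] += 1

# Convert to COO (triplet) format to read out x/y/count triples.
coo = grid.tocoo()
triples = list(zip(coo.row, coo.col, coo.data))
```

For a plain counting problem a dict keyed by (x, y) is simpler, but the SciPy route pays off if you later need matrix operations or conversion to CSR/CSC for fast arithmetic.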