Summarizing a dictionary of arrays in Python

2023-01-10 02:54 问答作者：

I got the following dictionary:

mydict = {
  'foo': [1,19,2,3,24,52,2,6],          # sum: 109
  'bar': [50,5,9,7,66,3,2,44],          # sum: 186
  'another': [1,2,3,4,5,6,7,8],         # sum:  36
  'entry': [0,0,0,2,99,4,33,55],        # sum: 193
  'onemore': [21,22,23,24,25,26,27,28]  # sum: 196
}

I need to efficiently filter out and sort the top x entries by the sum of the array.

For example, the Top 3 sorted and filtered list for the example above would be

sorted_filtered_dict = {
  'onemore': [21,22,23,24,25,26,27,28], # sum: 196
  '开发者_StackOverflow中文版entry': [0,0,0,2,99,4,33,55],        # sum: 193
  'bar': [50,5,9,7,66,3,2,44]           # sum: 186
}

I'm fairly new to Python, and tried it myself with chaining a sum and filter function on a lambda function, but struggled with the actual syntax.

It's easy to do with a sort:

sorted(mydict.iteritems(), key=lambda tup: sum(tup[1]), reverse=True)[:3]

This is reasonable if the ratio is similar to this one (3 / 5). If it's larger, you'll want to avoid the sort (O(n log n)), since top 3 can be done in O(n). For instance, using heapq, the heap module:

heapq.nlargest(3, mydict.iteritems(), key=lambda tup: sum(tup[1]))

This is O(n + 3 log n), since assembly the initial heap is O(n) and re-heapifying is O(log n).

EDIT: If you're using Python 2.7 or later, you can easily convert to a OrderedDict (equivalent version for Python 2.4+):

OrderedDict(heapq.nlargest(3, mydict.iteritems(), key=lambda tup: sum(tup[1])))

OrderedDict has the same API as dict, but remembers insertion order.

For such a small slice it's not worth using islice

sorted(mydict.iteritems(), key=lambda (k,v): sum(v), reverse=True)[:3]

继续阅读：algorithm arrays python

Summarizing a dictionary of arrays in Python

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？