Most efficient way to add new keys or append to old keys in a dictionary during iteration in Python?

Here's a common situation when compiling data in dictionaries from different sources:

Say you have a dictionary that stores lists of things, such as things I like:

likes = {
    'colors': ['blue','red','purple'],
    'foods': ['apples', 'oranges']
}

and a second dictionary with some related values in it:

favorites = {
    'colors':'yellow',
    'desserts':'ice cream'
}

You then want to iterate over the "favorites" object and either append the items in that object to the list with the appropriate key in the "likes" dictionary or add a new key to it with the value being a list containing the value in "favorites".

There are several ways to do this:

for key in favorites:
    if key in likes:
        likes[key].append(favorites[key])
    else:
        likes[key] = [favorites[key]]

or

for key in favorites:
    try:
        likes[key].append(favorites[key])
    except KeyError:
        likes[key] = [favorites[key]]

And many more as well...

I generally use the first syntax because it feels more pythonic, but if there are other, better ways, I'd love to know what they are. Thanks!
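One pitfall worth noting when creating the new list in either loop: list(value) iterates over its argument, so a string is split into characters, while [value] wraps the whole string in a one-element list. A minimal sketch, using one of the values from "favorites" above:

```python
value = 'ice cream'

as_chars = list(value)   # iterates over the string: one element per character
as_item = [value]        # a one-element list holding the whole string

print(as_chars[:3])  # ['i', 'c', 'e']
print(as_item)       # ['ice cream']
```

For this use case the bracket form is almost certainly the one wanted.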


Use collections.defaultdict, where the default value is a new list instance.

>>> import collections
>>> mydict = collections.defaultdict(list)

In this way calling .append(...) will always succeed, because in case of a non-existing key append will be called on a fresh empty list.

You can also initialize the defaultdict from an existing dict, in case you get the dict likes from another source, like so:

>>> mydict = collections.defaultdict(list, likes)

Note that using list as the default_factory attribute of a defaultdict is also discussed as an example in the documentation.
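As a rough sketch of how this behaves in practice (the 'drinks' key is made up for illustration and is not part of the question): appending to a missing key works, but merely reading a missing key also inserts it, and dict() gives back a plain dictionary if needed.

```python
import collections

likes = {'colors': ['blue', 'red', 'purple'], 'foods': ['apples', 'oranges']}
mydict = collections.defaultdict(list, likes)

# Appending to a missing key works: the factory creates [] first.
mydict['desserts'].append('ice cream')

# Merely reading a missing key also inserts it ('drinks' is a hypothetical key).
_ = mydict['drinks']
has_drinks = 'drinks' in mydict   # True

# dict(...) makes a plain-dict copy; the copy is shallow, so the lists are shared.
plain = dict(mydict)
```

Note that the original likes dict does not gain the new keys, since defaultdict(list, likes) copies the top-level mapping; the existing lists themselves are shared between the two.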


Use collections.defaultdict:

import collections

likes = collections.defaultdict(list)

for key, value in favorites.items():
    likes[key].append(value)

The first argument to defaultdict is a factory for creating values for unknown keys on demand. list is such a function: it creates empty lists.

And iterating over .items() will save you from using the key to get the value.


Besides defaultdict, the regular dict offers one possibility (that might look a bit strange): dict.setdefault(k[, d]):

for key, val in favorites.items():
    likes.setdefault(key, []).append(val)
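For completeness, here is a self-contained run of the setdefault approach on the two dictionaries from the question; the expected result assumes Python 3.7+, where dicts keep insertion order.

```python
likes = {
    'colors': ['blue', 'red', 'purple'],
    'foods': ['apples', 'oranges'],
}
favorites = {
    'colors': 'yellow',
    'desserts': 'ice cream',
}

for key, val in favorites.items():
    # setdefault returns the existing list, or stores and returns the new [].
    likes.setdefault(key, []).append(val)

print(likes)
# {'colors': ['blue', 'red', 'purple', 'yellow'],
#  'foods': ['apples', 'oranges'],
#  'desserts': ['ice cream']}
```

One small cost of this form: the [] argument is built on every call, even when the key already exists and the new list is thrown away.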

Thank you for the +20 in rep -- I went from 1989 to 2009 in 30 seconds. Let's remember it is 20 years since the Wall fell in Europe..


>>> from collections import defaultdict
>>> d = defaultdict(list, likes)
>>> d
defaultdict(<class 'list'>, {'colors': ['blue', 'red', 'purple'], 'foods': ['apples', 'oranges']})
>>> for i, j in favorites.items():
...     d[i].append(j)
...
>>> d
defaultdict(<class 'list'>, {'desserts': ['ice cream'], 'colors': ['blue', 'red', 'purple', 'yellow'], 'foods': ['apples', 'oranges']})


All of the answers so far suggest defaultdict, but I'm not sure that's the best way to go about it. Handing a defaultdict to code that expects a plain dict can be bad, because a simple lookup of a missing key silently inserts it. (See: How do I make a defaultdict safe for unexpecting clients?) I'm personally torn on the matter. (I actually found this question looking for an answer to "which is better, dict.get() or defaultdict".) Someone in the other thread said that you don't want a defaultdict if you don't want this behavior all the time, and that might be true. Maybe using defaultdict for the convenience is the wrong way to go about it. I think there are two needs being conflated here:

"I want a dict whose default values are empty lists." to which defaultdict(list) is the correct solution.

and

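"I want to append to the list at this key if it exists and create a list if it does not exist." to which my_dict.setdefault('foo', []).append(value) is the answer. (my_dict.get('foo', []) with append() is not enough on its own, because get() never stores the new list in the dict.)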

What do you guys think?
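One caveat about the dict.get() route, sketched below with a throwaway dict: get() hands back the fallback list without storing it, so an append on that list is lost, whereas setdefault() stores the list under the key first.

```python
my_dict = {}

# get() returns the fallback list but never stores it in the dict,
# so this append lands on a temporary list that is discarded.
my_dict.get('foo', []).append('bar')
after_get = dict(my_dict)   # still {}

# setdefault() stores the fallback under the key before returning it.
my_dict.setdefault('foo', []).append('bar')

print(after_get)  # {}
print(my_dict)    # {'foo': ['bar']}
```

So for a plain dict, setdefault() (or an explicit "if key in dict" check) covers the second need without changing the dict's type.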
