
Multi-level defaultdict with variable depth?

I have a large list like:

[A][B1][C1]=1
[A][B1][C2]=2
[A][B2]=3
[D][E][F][G]=4

I want to build a multi-level dict like:

A
--B1
-----C1=1
-----C2=2
--B2=3
D
--E
----F
------G=4

I know that if I use a recursive defaultdict I can write table[A][B1][C1]=1, table[A][B2]=3, but this works only if I hardcode those insert statements.

While parsing the list, I don't know beforehand how many []'s I need in order to call table[key1][key2][...].


You can do it without even defining a class:

from collections import defaultdict

nested_dict = lambda: defaultdict(nested_dict)
nest = nested_dict()

nest[0][1][2][3][4][5] = 6
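
For the variable-depth case in the question, where the number of keys is not known in advance, one option is to walk the key path in a loop. This is only a minimal sketch: it assumes each input record has already been parsed into a list of keys plus a value, and `set_path` and `records` are hypothetical names that are not part of the answer above.

```python
from collections import defaultdict

nested_dict = lambda: defaultdict(nested_dict)

def set_path(table, keys, value):
    """Assign value at table[keys[0]][keys[1]]...[keys[-1]]."""
    node = table
    for key in keys[:-1]:
        node = node[key]  # missing intermediate levels are created on access
    node[keys[-1]] = value

# Hypothetical parsed form of the list in the question
records = [
    (["A", "B1", "C1"], 1),
    (["A", "B1", "C2"], 2),
    (["A", "B2"], 3),
    (["D", "E", "F", "G"], 4),
]

table = nested_dict()
for keys, value in records:
    set_path(table, keys, value)
```

This works as long as no key path is a prefix of another one, since a node here holds either a value or a sub-dictionary, not both.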


Your example says that at any level there can be a value, and also a dictionary of sub-elements. Such a structure is called a tree, and many implementations are available. This is one:

from collections import defaultdict
class Tree(defaultdict):
    def __init__(self, value=None):
        super(Tree, self).__init__(Tree)
        self.value = value

root = Tree()
root.value = 1
root['a']['b'].value = 3
print(root.value)
print(root['a']['b'].value)
print(root['c']['d']['f'].value)

Outputs:

1
3
None

You could do something similar by writing the input in JSON and using json.load to read it as a structure of nested dictionaries.
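
As a rough illustration of the JSON route, the sketch below parses a string with json.loads; json.load does the same for an open file. The JSON text is a hypothetical encoding of the data in the question, not something the asker provided.

```python
import json

# Hypothetical JSON encoding of the data in the question
text = '{"A": {"B1": {"C1": 1, "C2": 2}, "B2": 3}, "D": {"E": {"F": {"G": 4}}}}'

table = json.loads(text)  # json.load(fp) does the same for an open file
print(table["D"]["E"]["F"]["G"])
```

The result consists of plain dicts, so looking up a missing key raises KeyError instead of creating a new level.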


I think the simplest implementation of a recursive dictionary is this. Only leaf nodes can contain values.

# Define recursive dictionary
from collections import defaultdict
tree = lambda: defaultdict(tree)

Usage:

# Create instance
mydict = tree()

mydict['a'] = 1
mydict['b']['a'] = 2
mydict['c']
mydict['d']['a']['b'] = 0

# Print
import prettyprint
prettyprint.pp(mydict)

Output:

{
  "a": 1,
  "b": {
    "a": 2
  },
  "c": {},
  "d": {
    "a": {
      "b": 0
    }
  }
}
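
If the third-party prettyprint module is not available, similar output can probably be produced with the standard library, because defaultdict is a dict subclass and json.dumps serializes it directly. A sketch under that assumption, rebuilding the same mydict:

```python
import json
from collections import defaultdict

tree = lambda: defaultdict(tree)

mydict = tree()
mydict['a'] = 1
mydict['b']['a'] = 2
mydict['c']                 # merely reading a missing key creates an empty subtree
mydict['d']['a']['b'] = 0

text = json.dumps(mydict, indent=2)
print(text)
```

Note that JSON object keys are always strings, so non-string keys such as integers would be converted on output.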


I'd do it with a subclass of dict that defines __missing__:

>>> class NestedDict(dict):
...     def __missing__(self, key):
...         self[key] = NestedDict()
...         return self[key]
...
>>> table = NestedDict()
>>> table['A']['B1']['C1'] = 1
>>> table
{'A': {'B1': {'C1': 1}}}

A plain defaultdict(defaultdict) nests only one level, because the inner defaultdict is created without a factory of its own. The class above does the same thing that defaultdict does, but since it is a named class (NestedDict), it can reference itself as missing keys are encountered. It is also possible to subclass defaultdict and override __init__.
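
The last point, subclassing defaultdict and overriding __init__, could look roughly like this. It is a sketch rather than a definitive implementation, and the class name `NestedDefaultDict` is made up for illustration.

```python
from collections import defaultdict

class NestedDefaultDict(defaultdict):
    """defaultdict whose missing keys are filled with more NestedDefaultDicts."""

    def __init__(self, *args, **kwargs):
        # Pass the class itself as the default factory; any remaining
        # arguments initialise the contents the same way dict() would.
        super().__init__(NestedDefaultDict, *args, **kwargs)

table = NestedDefaultDict()
table['A']['B1']['C1'] = 1
table['D']['E']['F']['G'] = 4
```

Because the constructor signature differs from defaultdict's, methods that re-invoke the constructor with the factory as first argument (copy(), for example) may need extra care.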


This is equivalent to the lambda version, but avoids lambda notation and is perhaps easier to read:

from collections import defaultdict

def dict_factory():
    return defaultdict(dict_factory)

your_dict = dict_factory()

Also, from the comments: to add values from an existing dict, you can call update on any level:

your_dict[0][1][2].update({"some_key":"some_value"})


Dan O'Huiginn posted a very nice solution on his journal in 2010:

http://ohuiginn.net/mt/2010/07/nested_dictionaries_in_python.html

>>> class NestedDict(dict):
...     def __getitem__(self, key):
...         if key in self: return self.get(key)
...         return self.setdefault(key, NestedDict())


>>> eggs = NestedDict()
>>> eggs[1][2][3][4][5]
{}
>>> eggs
{1: {2: {3: {4: {5: {}}}}}}


You may achieve this with a recursive defaultdict.

from collections import defaultdict

def tree():
    def the_tree():
        return defaultdict(the_tree)
    return the_tree()

It is important to protect the default factory name, the_tree here, in a closure ("private" local function scope). Avoid the one-liner lambda version, which is fragile because the lambda looks up its own name at call time (late binding); implement this with a def instead.

The accepted answer, which uses a lambda, has a flaw: instances rely on the nested_dict name still existing in an outer scope. If for whatever reason the factory name cannot be resolved (e.g. it was rebound or deleted), then pre-existing instances will also become subtly broken:

>>> nested_dict = lambda: defaultdict(nested_dict)
>>> nest = nested_dict()
>>> nest[0][1][2][3][4][6] = 7
>>> del nested_dict
>>> nest[8][9] = 10
# NameError: name 'nested_dict' is not defined
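
By contrast, the def-based tree() keeps working after the public name is removed, because the factory the_tree is held in a closure cell and not looked up by name in the outer scope. A short sketch of the same experiment:

```python
from collections import defaultdict

def tree():
    def the_tree():
        return defaultdict(the_tree)
    return the_tree()

nest = tree()
nest[0][1][2] = 3
del tree             # remove the public name

nest[8][9] = 10      # still works: the_tree lives on in the closure
```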


To add to @Hugo's answer, here is a version with a maximum depth:

from collections import defaultdict

l = lambda x: defaultdict(lambda: l(x - 1)) if x > 0 else defaultdict(dict)
arr = l(2)
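
To make the depth limit concrete, here is the same idea written with a def (the name `limited` is hypothetical). With an argument of 2, three key levels are created automatically and the innermost one is a plain dict, so going any deeper raises KeyError.

```python
from collections import defaultdict

def limited(depth):
    """Nested defaultdict that auto-creates depth + 1 levels, the last a plain dict."""
    if depth > 0:
        return defaultdict(lambda: limited(depth - 1))
    return defaultdict(dict)  # the last auto-created level holds plain dicts

arr = limited(2)
arr['a']['b']['c']['d'] = 1   # three keys are auto-created, the fourth is assigned

try:
    arr['a']['b']['c']['x']['y'] = 2   # 'x' is missing from a plain dict
except KeyError as exc:
    print('too deep:', exc)
```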


A slightly different possibility that allows regular dictionary initialization:

from collections import defaultdict

def superdict(arg=()):
    update = lambda obj, arg: obj.update(arg) or obj
    return update(defaultdict(superdict), arg)

Example:

>>> d = {"a":1}
>>> sd = superdict(d)
>>> sd["b"]["c"] = 2

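One thing to be aware of with superdict is that dicts nested inside the initial argument stay plain dicts, so only the top level gains the auto-creating behaviour. The sketch below shows this and adds a recursive variant; `deep_superdict` is a hypothetical helper, not part of the answer.

```python
from collections import defaultdict

def superdict(arg=()):
    update = lambda obj, arg: obj.update(arg) or obj
    return update(defaultdict(superdict), arg)

def deep_superdict(mapping):
    """Hypothetical variant: convert nested plain dicts as well."""
    result = superdict()
    for key, value in mapping.items():
        result[key] = deep_superdict(value) if isinstance(value, dict) else value
    return result

shallow = superdict({"a": {"x": 1}})    # shallow["a"] is still a plain dict
deep = deep_superdict({"a": {"x": 1}})  # deep["a"] auto-creates missing keys
deep["a"]["y"]["z"] = 2
```
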

You could use NestedDict from the third-party ndicts package.

from ndicts.ndicts import NestedDict

nd = NestedDict()
nd[0, 1, 2, 3, 4, 5] = 6

The result as a dictionary:

>>> nd.to_dict()
{0: {1: {2: {3: {4: {5: 6}}}}}}

To install ndicts:

pip install ndicts