Compound dictionary keys
I have a particular case where using compound dictionary keys would make a task easier. I have a working solution, but feel it is inelegant. How would you do it?
context = {
    'database': {
        'port': 9990,
        'users': ['number2', 'dr_evil']
    },
    'admins': ['number2@virtucon.com', 'dr_evil@virtucon.com'],
    'domain.name': 'virtucon.com'
}

def getitem(key, context):
    if hasattr(key, 'upper') and key in context:
        return context[key]
    keys = key if hasattr(key, 'pop') else key.split('.')
    k = keys.pop(0)
    if keys:
        try:
            return getitem(keys, context[k])
        except KeyError, e:
            raise KeyError(key)
    if hasattr(context, 'count'):
        k = int(k)
    return context[k]

if __name__ == "__main__":
    print getitem('database', context)
    print getitem('database.port', context)
    print getitem('database.users.0', context)
    print getitem('admins', context)
    print getitem('domain.name', context)
    try:
        getitem('database.nosuchkey', context)
    except KeyError, e:
        print "Error:", e
Thanks.
>>> def getitem(context, key):
...     try:
...         return context[key]
...     except KeyError:
...         pass
...     cur, _, rest = key.partition('.')
...     rest = int(rest) if rest.isdigit() else rest
...     return getitem(context[cur], rest)
...
>>> getitem(context, 'admins.0')
'number2@virtucon.com'
>>> getitem(context, 'database.users.0')
'number2'
>>> getitem(context, 'database.users.1')
'dr_evil'
I've changed the order of the arguments because that is how most Python functions work; cf. getattr, operator.getitem, etc.
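The container-first argument order mentioned above can be seen in operator.getitem itself. A minimal sketch, written for Python 3 (unlike the Python 2 snippets in this thread) and assuming the path has already been split into separate keys; the helper name get_path is hypothetical and not part of this answer:

```python
from functools import reduce
import operator

context = {
    'database': {'port': 9990, 'users': ['number2', 'dr_evil']},
    'admins': ['number2@virtucon.com', 'dr_evil@virtucon.com'],
}

def get_path(context, path):
    """Follow each key in `path` in turn: container first, key second."""
    # operator.getitem(obj, key) is obj[key], the same argument
    # order as getattr(obj, name).
    return reduce(operator.getitem, path, context)

print(get_path(context, ['database', 'users', 1]))
print(get_path(context, ['admins', 0]))
```

Because the keys are passed as a list, list indices can stay integers and no string parsing is needed.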
The accepted solution (as well as my first attempt) failed due to the ambiguity inherent in the specs: '.' may be "just a separator" or a part of the actual key string. Consider, for example, that key may be 'a.b.c.d.e.f' and the actual key to use at the current level is 'a.b.c.d', with 'e.f' left over for the next-most-indented level. Also, the spec is ambiguous in another sense: if more than one dot-joined prefix of 'key' is present, which one should be used?
Assume the intention is to try every such feasible prefix: this would possibly produce multiple solutions but we can arbitrarily return the first solution found in this case.
def getitem(key, context):
    stk = [(key.split('.'), context)]
    while stk:
        kl, ctx = stk.pop()
        if not kl: return ctx
        if kl[0].isdigit():
            ik = int(kl[0])
            try: stk.append((kl[1:], ctx[ik]))
            except LookupError: pass
        for i in range(1, len(kl) + 1):
            k = '.'.join(kl[:i])
            if k in ctx: stk.append((kl[i:], ctx[k]))
    raise KeyError(key)
I was originally trying to avoid all try/excepts (as well as recursion and introspection via hasattr, isinstance, etc), but one snuck back in: it's hard to check if an integer is an acceptable index/key into what might be either a dict or a list, without either some introspection to distinguish the cases, or (and it looks simpler here) a try/except, so I went for the latter, simplicity being always near the top of my concerns. Anyway...
I believe variants on this approach (where all the "possible continuation-context pairs" that might still be feasible at any point are kept around) are the only working way to deal with the ambiguities I've explained above (of course, one might choose to collect all possible solutions, arbitrarily pick one of them according to whatever heuristic criterion is desired, or maybe raise if the ambiguity is biting so there are multiple solutions, etc, etc, but these are minor variants of this general idea).
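One of the variants mentioned above, collecting every solution and raising when there is more than one, might look like the following. This is only a sketch for Python 3. The names iter_items and getitem_strict are hypothetical, and unlike the code above it uses an isinstance check so that string prefixes are only looked up in dicts:

```python
def iter_items(key, context):
    """Yield every value `key` can resolve to, trying each dot-joined prefix."""
    stack = [(key.split('.'), context)]
    while stack:
        parts, ctx = stack.pop()
        if not parts:
            yield ctx
            continue
        # A numeric part may be a list index (or an integer dict key).
        if parts[0].isdigit():
            try:
                stack.append((parts[1:], ctx[int(parts[0])]))
            except (LookupError, TypeError):
                pass
        # Assumption of this sketch: only dicts hold string keys.
        if isinstance(ctx, dict):
            for i in range(1, len(parts) + 1):
                prefix = '.'.join(parts[:i])
                if prefix in ctx:
                    stack.append((parts[i:], ctx[prefix]))

def getitem_strict(key, context):
    """Return the unique match; raise if the key is missing or ambiguous."""
    found = list(iter_items(key, context))
    if not found:
        raise KeyError(key)
    if len(found) > 1:
        raise ValueError('ambiguous key: %r' % key)
    return found[0]

context = {
    'database': {'port': 9990, 'users': ['number2', 'dr_evil']},
    'domain.name': 'virtucon.com',
}
ambiguous = {'a': {'b': 1}, 'a.b': 2}

print(getitem_strict('database.users.0', context))
print(sorted(iter_items('a.b', ambiguous)))  # both readings of 'a.b'
```

Here 'a.b' matches both the nested value and the literal dotted key, so getitem_strict('a.b', ambiguous) raises ValueError instead of silently picking one.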
The following code works. It checks for the special case of a single key having a period in it. Then, it splits the key apart. For each subkey, it tries to fetch the value from a list-like context, then it tries from a dictionary-type context, then it gives up.
This code also shows how to use unittest/nose, which is highly recommended. Test with "nosetests mysource.py".
Lastly, consider using Python's built-in ConfigParser class, which is really useful for this type of configuration task: http://docs.python.org/library/configparser.html
#!/usr/bin/env python
from nose.tools import eq_, raises
context = {
    'database': {
        'port': 9990,
        'users': ['number2', 'dr_evil']
    },
    'admins': ['number2@virtucon.com', 'dr_evil@virtucon.com'],
    'domain.name': 'virtucon.com'
}

def getitem(key, context):
    if isinstance(context, dict) and context.has_key(key):
        return context[key]
    for key in key.split('.'):
        try:
            context = context[int(key)]
            continue
        except ValueError:
            pass
        if isinstance(context, dict) and context.has_key(key):
            context = context[key]
            continue
        raise KeyError, key
    return context

def test_getitem():
    eq_( getitem('database', context), {'port': 9990, 'users': ['number2', 'dr_evil']} )
    eq_( getitem('database.port', context), 9990 )
    eq_( getitem('database.users.0', context), 'number2' )
    eq_( getitem('admins', context), ['number2@virtucon.com', 'dr_evil@virtucon.com'] )
    eq_( getitem('domain.name', context), 'virtucon.com' )

@raises(KeyError)
def test_getitem_error():
    getitem('database.nosuchkey', context)
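As for the ConfigParser suggestion, here is a rough sketch of how similar data could be read from INI-style text. It assumes Python 3, where the module is named configparser (ConfigParser in Python 2). The INI layout is made up for illustration, and since the parser has no list type, the comma-separated users value is split by hand:

```python
import configparser

# Hypothetical INI text standing in for a real configuration file.
INI_TEXT = """
[database]
port = 9990
users = number2, dr_evil

[domain]
name = virtucon.com
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)

# Values are stored as strings; typed getters convert on request.
port = config.getint('database', 'port')
# No built-in list type, so this sketch splits a comma-separated value.
users = [u.strip() for u in config.get('database', 'users').split(',')]
name = config['domain']['name']

print(port, users, name)
```

Note that this gives two levels only (section and option), so deeper nesting such as database.users.0 would still need its own convention.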
As the key to getitem must be a string (or a list which is passed in the recursive call), I've come up with the following:
def getitem(key, context, first=True):
    if not isinstance(key, basestring) and not isinstance(key, list) and first:
        raise TypeError("Compound key must be a string.")
    if isinstance(key, basestring):
        if key in context:
            return context[key]
        else:
            keys = key.split('.')
    else:
        keys = key
    k = keys.pop(0)
    # recurse only while there are sub-keys left to resolve
    if keys:
        try:
            return getitem(keys, context[k], False)
        except KeyError, e:
            raise KeyError(key)
    # is it a sequence type
    if hasattr(context, '__getitem__') and not hasattr(context, 'keys'):
        # then the index must be an integer
        k = int(k)
    return context[k]
I am on the fence as to whether this is an improvement.
I'm leaving my original solution for posterity:
CONTEXT = {
    "database": {
        "port": 9990,
        "users": ["number2", "dr_evil"]},
    "admins": ["number2@virtucon.com", "dr_evil@virtucon.com"],
    "domain": {"name": "virtucon.com"}}

def getitem(context, *keys):
    node = context
    for key in keys:
        node = node[key]
    return node

if __name__ == "__main__":
    print getitem(CONTEXT, "database")
    print getitem(CONTEXT, "database", "port")
    print getitem(CONTEXT, "database", "users", 0)
    print getitem(CONTEXT, "admins")
    print getitem(CONTEXT, "domain", "name")
    try:
        getitem(CONTEXT, "database", "nosuchkey")
    except KeyError, e:
        print "Error:", e
But here's a version that implements an approach similar to the getitem interface suggested by doublep. I am specifically not handling dotted keys, but rather forcing the keys into separate nested structures because that seems cleaner to me:
CONTEXT = {
    "database": {
        "port": 9990,
        "users": ["number2", "dr_evil"]},
    "admins": ["number2@virtucon.com", "dr_evil@virtucon.com"],
    "domain": {"name": "virtucon.com"}}

if __name__ == "__main__":
    print CONTEXT["database"]
    print CONTEXT["database"]["port"]
    print CONTEXT["database"]["users"][0]
    print CONTEXT["admins"]
    print CONTEXT["domain"]["name"]
    try:
        CONTEXT["database"]["nosuchkey"]
    except KeyError, e:
        print "Error:", e
You might notice that what I've really done here is eliminate all ceremony regarding accessing the data structure. The output of this script is the same as the original except that it does not contain a dotted key. This seems like a more natural approach to me but if you really wanted to be able to handle dotted keys, you could do something like this I suppose:
CONTEXT = {
    "database": {
        "port": 9990,
        "users": ["number2", "dr_evil"]},
    "admins": ["number2@virtucon.com", "dr_evil@virtucon.com"],
    "domain": {"name": "virtucon.com"}}

def getitem(context, dotted_key):
    keys = dotted_key.split(".")
    value = context
    for key in keys:
        try:
            value = value[key]
        except TypeError:
            # lists reject string indices, so retry with an integer
            value = value[int(key)]
    return value

if __name__ == "__main__":
    print getitem(CONTEXT, "database")
    print getitem(CONTEXT, "database.port")
    print getitem(CONTEXT, "database.users.0")
    print getitem(CONTEXT, "admins")
    print getitem(CONTEXT, "domain.name")
    try:
        getitem(CONTEXT, "database.nosuchkey")
    except KeyError, e:
        print "Error:", e
I'm not sure what the advantage of this type of approach would be though.