Python Most Efficient Way to Search a List
Bear with me as I am very new to Python. Basically I am looking for the most efficient way to search through a multi-dimensional list. So say I have the following list:
fruit = [
    [banana, 6],
    [apple, 5],
    [banana, 9],
    [apple, 10],
    [pear, 2],
]
And I wanted the outcome of my function to produce: Apple: 15, Banana: 15, Pear: 2. What would be the most efficient way to do this?
That is not in any way a search...
What you want is
import collections

def count(items):
    data = collections.defaultdict(int)
    for kind, count in items:
        data[kind] += count
    return data
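As a rough usage sketch (assuming the fruit names are plain strings, which the question does not actually say), the totals can be formatted the way the question asks. The function is repeated here only so the snippet runs on its own, and the names amount and summary are made up for illustration:

```python
import collections

def count(items):
    # Same accumulation as above, repeated so this sketch is self-contained.
    data = collections.defaultdict(int)
    for kind, amount in items:
        data[kind] += amount
    return data

# Assumption: names are strings; the question leaves their type open.
fruit = [['banana', 6], ['apple', 5], ['banana', 9], ['apple', 10], ['pear', 2]]
totals = count(fruit)

# Sort by name and capitalize to match the format requested in the question.
summary = ', '.join('%s: %d' % (kind.capitalize(), totals[kind])
                    for kind in sorted(totals))
print(summary)  # Apple: 15, Banana: 15, Pear: 2
```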
fruit = [['banana', 6], ['apple', 5], ['banana', 9], ['apple', 10], ['pear', 2]]

def fruit_count():
    f = {}  # built fresh on each call so repeated calls don't double-count
    for name, qty in fruit:
        if name not in f:
            f[name] = qty
        else:
            f[name] += qty
    return f

# fruit_count() returns {'apple': 15, 'banana': 15, 'pear': 2}
Use a collections.defaultdict to accumulate, and iterate through the list.
import collections

accum = collections.defaultdict(int)
for e in fruit:
    accum[e[0]] += e[1]
myHash = {}
fruit = [
    ['banana', 6],
    ['apple', 5],
    ['banana', 9],
    ['apple', 10],
    ['pear', 2],
]
for i in fruit:
    if i[0] not in myHash:
        myHash[i[0]] = 0
    myHash[i[0]] += i[1]
# sorted() makes the output order deterministic
for i in sorted(myHash):
    print('%s %s' % (i, myHash[i]))
would print
apple 15
banana 15
pear 2
Edit
I didn't know about defaultdict in Python. That is a much better way.
I'm unsure what type apple and banana are, so I just made them empty classes and used their class names for identification. One approach to this problem is to use the dictionary method setdefault(), which first checks whether a given key is already in the dictionary. If it is, setdefault() simply returns the associated value; if it is not, it inserts the key with a default value before returning that value.
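To make that behaviour concrete, here is a minimal sketch of setdefault() on its own. The key 'apple' and the variable names are just placeholders for illustration:

```python
totals = {}

# Key is missing: setdefault() stores the default and returns it.
first = totals.setdefault('apple', 0)    # first == 0, totals == {'apple': 0}

# Key is present: the stored value is returned and the default is ignored.
totals['apple'] = 5
second = totals.setdefault('apple', 0)   # second == 5, totals unchanged

print(first)
print(second)
```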
To use it more efficiently for this problem by avoiding multiple dictionary key lookups, the count associated with each key needs to be stored in something mutable, since plain integers in Python are immutable. The trick is to store the numeric count in a one-element list, which can be changed in place. The first function in the code below shows how this can be done.
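The difference between the two storage choices might be sketched like this. The sample data and the names plain and boxed are made up for the comparison, and whether the saved lookup matters in practice would need measuring:

```python
pairs = [['banana', 6], ['apple', 5], ['banana', 9]]

# Plain integers: the new total has to be assigned back to the key,
# which costs a second lookup after the one done by setdefault().
plain = {}
for kind, amount in pairs:
    plain[kind] = plain.setdefault(kind, 0) + amount

# One-element lists: setdefault() returns the list object itself,
# so it can be updated in place with no second key lookup.
boxed = {}
for kind, amount in pairs:
    boxed.setdefault(kind, [0])[0] += amount

# plain == {'banana': 15, 'apple': 5}
# boxed == {'banana': [15], 'apple': [5]}
```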
Note that the collections module in the Python standard library has a dictionary subclass called defaultdict (added in Python 2.5) that could have been used instead. It effectively performs the setdefault() operation for you whenever a non-existent key is first accessed. It also makes storing the count in a list for efficiency unnecessary, and updating it is slightly simpler.
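A small sketch of that automatic default, including one side effect worth knowing about. The keys 'pear' and 'kiwi' are arbitrary examples:

```python
import collections

totals = collections.defaultdict(int)

# 'pear' is not present yet, so int() supplies 0 before 2 is added.
totals['pear'] += 2

# Caveat: merely reading a missing key also inserts it with the default value.
unused = totals['kiwi']

print(sorted(totals.items()))  # [('kiwi', 0), ('pear', 2)]
```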
In Python 2.7 another dictionary subclass, Counter, was added to the collections module. Using it would probably be the best solution, since it was designed for exactly this kind of application. The code below shows how to do it all three ways (and sorts the list of totals created).
class apple: pass
class banana: pass
class pear: pass

fruit = [
    [banana, 6],
    [apple, 5],
    [banana, 9],
    [apple, 10],
    [pear, 2],
]

# ---- using regular dictionary
def tally(items):
    totals = dict()
    for kind, count in items:
        # setdefault() returns the one-element list, which is updated in place
        totals.setdefault(kind, [0])[0] += count
    return sorted([key.__name__, total[0]] for key, total in totals.items())

print(tally(fruit))
# [['apple', 15], ['banana', 15], ['pear', 2]]

import collections

# ---- using collections.defaultdict dict subclass
def tally(items):
    totals = collections.defaultdict(int)  # requires Python 2.5+
    for kind, count in items:
        totals[kind] += count
    return sorted([key.__name__, total] for key, total in totals.items())

print(tally(fruit))
# [['apple', 15], ['banana', 15], ['pear', 2]]

# ---- using collections.Counter dict subclass
def tally(items):
    totals = collections.Counter()  # requires Python 2.7+
    for kind, count in items:
        totals[kind] += count
    return sorted([key.__name__, total] for key, total in totals.items())

print(tally(fruit))
# [['apple', 15], ['banana', 15], ['pear', 2]]
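For what it's worth, Counter also has an update() method that adds to existing counts, and unlike defaultdict it does not insert a key when a missing one is merely read. A short sketch, assuming string names rather than the placeholder classes used above ('kiwi' is an arbitrary missing key):

```python
import collections

# Assumption: names are plain strings here, not the empty classes used above.
fruit = [['banana', 6], ['apple', 5], ['banana', 9], ['apple', 10], ['pear', 2]]

totals = collections.Counter()
for kind, amount in fruit:
    # Counter.update() adds to existing counts instead of replacing them.
    totals.update({kind: amount})

# Reading a missing key returns 0 but does not create an entry.
missing = totals['kiwi']

print(sorted(totals.items()))  # [('apple', 15), ('banana', 15), ('pear', 2)]
```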