Python: Count lines and differentiate between them
I'm using an application that gives a timed output based on how many times something is done in a minute. I wish to manually take the output (copy and paste) and have my program count how many times it is done each minute.
An example output is this:
13:48 An event happened.
13:48 Another event happened.
13:49 A new event happened.
13:49 A random event happened.
13:49 An event happened.
So, the program would need to understand that 2 things happened at 13:48, and 3 at 13:49. I'm not sure how the information would be stored, but I need to average them after, to determine an average of how often it happens. Sorry for being so complicated!
You could just use the time as a key for a dictionary and point it to a list of event messages. The length of that value would give you the number of events, while still letting you get at the specific events themselves:
>>> from pprint import pprint
>>> from collections import defaultdict
>>> events = defaultdict(list)
>>> with open('log.txt') as f:
...     for line in f:
...         time, message = line.strip().split(None, 1)
...         events[time].append(message)
...
>>> pprint(dict(events)) # pprint handles defaultdicts poorly
{'13:48': ['An event happened.', 'Another event happened.'],
 '13:49': ['A new event happened.',
           'A random event happened.',
           'An event happened.']}
If you want to be extra fancy, you could parse the time into a time object.
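As a rough sketch of the time-object idea, the timestamp can be parsed with datetime.strptime before it is used as the dictionary key. This assumes every non-blank line starts with a 24-hour HH:MM timestamp, as in the sample output; the function name group_events and the sample list are only illustrative.

```python
from collections import defaultdict
from datetime import datetime


def group_events(lines):
    """Group event messages by minute, keyed by datetime.time objects."""
    events = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines left over from copy and paste
        stamp, message = line.strip().split(None, 1)
        # Assumption: timestamps look like HH:MM (24-hour clock).
        minute = datetime.strptime(stamp, '%H:%M').time()
        events[minute].append(message)
    return events


sample = [
    '13:48 An event happened.',
    '13:48 Another event happened.',
    '13:49 A new event happened.',
    '13:49 A random event happened.',
    '13:49 An event happened.',
]
events = group_events(sample)
for minute in sorted(events):
    print(minute.strftime('%H:%M'), len(events[minute]))
```

Time objects sort chronologically and can be compared directly, which may help if you later want to find minutes in which nothing happened.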
Edit: Take into account Mike Graham's suggestions.
If you just want a count of how many events happen each minute, you don't really need Python; you can do it from bash (uniq -c only merges adjacent identical lines, so this relies on the output being in time order):
cut -d ' ' -f1 filename | uniq -c
gives
2 13:48
3 13:49
If you don't need to know what happened, only how many times:
$ python3.1 -c'from collections import Counter
import fileinput
c = Counter(line.split(None, 1)[0] for line in fileinput.input() if line.strip())
print(c)' events.txt
Output:
Counter({'13:49': 3, '13:48': 2})
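The question also asks for an average, which can be taken from the same Counter. Below is a minimal sketch for Python 3, assuming "average" means total events divided by the number of minutes that appear in the output; minutes with no events at all are not counted. The function name average_per_minute is made up for illustration.

```python
from collections import Counter


def average_per_minute(lines):
    """Return the mean number of events per minute that appears in lines."""
    # The first whitespace-separated field is the HH:MM timestamp.
    counts = Counter(line.split(None, 1)[0] for line in lines if line.strip())
    if not counts:
        return 0.0  # no events, so avoid dividing by zero
    return sum(counts.values()) / len(counts)


sample = [
    '13:48 An event happened.',
    '13:48 Another event happened.',
    '13:49 A new event happened.',
    '13:49 A random event happened.',
    '13:49 An event happened.',
]
print(average_per_minute(sample))
```

If quiet minutes should pull the average down, divide by the number of minutes between the first and last timestamp instead of len(counts).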
You can also use the groupby function from the itertools module, with the time as the grouping key.
>>> import itertools
>>> from operator import itemgetter
>>> lines = (line.strip().split(None, 1) for line in open('log.txt'))
>>> for key, group in itertools.groupby(lines, key=itemgetter(0)):
...     print('%s - %s' % (key, list(map(itemgetter(1), group))))
...
13:48 - ['An event happened.', 'Another event happened.']
13:49 - ['A new event happened.', 'A random event happened.', 'An event happened.']
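One thing to keep in mind is that groupby only merges consecutive items with the same key, so it works here because the output is already in time order. The sketch below sorts first in case the pasted lines are out of order, and returns counts rather than messages. It assumes zero-padded HH:MM timestamps, so sorting them as strings matches chronological order; count_per_minute is just an illustrative name.

```python
from itertools import groupby
from operator import itemgetter


def count_per_minute(lines):
    """Return a list of (timestamp, number of events) pairs in time order."""
    pairs = (line.strip().split(None, 1) for line in lines if line.strip())
    # Sort by timestamp so that equal keys are adjacent before grouping.
    ordered = sorted(pairs, key=itemgetter(0))
    return [(key, sum(1 for _ in group))
            for key, group in groupby(ordered, key=itemgetter(0))]


shuffled = [
    '13:49 A new event happened.',
    '13:48 An event happened.',
    '13:49 A random event happened.',
    '13:48 Another event happened.',
    '13:49 An event happened.',
]
print(count_per_minute(shuffled))
```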
awk '{_[$1]++}END{for(i in _) print i,_[i]}' filename