Python: Count lines and differentiate between them

2022-12-24 09:31

I'm using an application that gives timestamped output each time something is done. I want to take that output manually (copy and paste) and have my program count how many times it is done in each minute.

An example output is this:

13:48 An event happened.
13:48 Another event happened.
13:49 A new event happened.
13:49 A random event happened.
13:49 An event happened.

So, the program would need to understand that 2 things happened at 13:48, and 3 at 13:49. I'm not sure how the information would be stored, but I need to average them after, to determine an average of how often it happens. Sorry for being so complicated!

You could just use the time as a key for a dictionary and point it to a list of event messages. The length of that value would give you the number of events, while still letting you get at the specific events themselves:

>>> from pprint import pprint
>>> from collections import defaultdict
>>> events = defaultdict(list)
>>> with open('log.txt') as f:
...     for line in f:
...         time, message = line.strip().split(None, 1)
...         events[time].append(message)
... 
>>> pprint(dict(events)) # pprint handles defaultdicts poorly
{'13:48': ['An event happened.', 'Another event happened.'],
 '13:49': ['A new event happened.',
           'A random event happened.',
           'An event happened.']}

If you want to be extra fancy, you could parse the time into a time object.
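One reason to parse the stamps is that you can then measure the span of the log and count quiet minutes as zeros when averaging. A rough sketch, assuming stamps in HH:MM form that do not cross midnight; the helper parse_minute and the extra 13:52 entry are made up for illustration and are not part of the original log:

```python
from datetime import datetime, timedelta

# Per-minute counts; 13:48 and 13:49 come from the question,
# 13:52 is a hypothetical entry added to create a gap.
counts = {"13:48": 2, "13:49": 3, "13:52": 1}

def parse_minute(stamp):
    # "%H:%M" matches stamps such as "13:48". The result is a datetime
    # (date defaults to 1900-01-01); call .time() on it for a time object.
    return datetime.strptime(stamp, "%H:%M")

minutes = sorted(parse_minute(stamp) for stamp in counts)

# Whole minutes covered by the log, including the first and the last.
span = int((minutes[-1] - minutes[0]) / timedelta(minutes=1)) + 1

# Average events per minute, with empty minutes counted as zero.
average = sum(counts.values()) / span

print(span)
print(average)
```

Datetime values are kept here instead of plain time objects because time objects cannot be subtracted from each other.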

Edit: Take into account Mike Graham's suggestions.

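If you just want a count of how many events happen each minute, then you don't really need Python; you can do it from bash. Note that uniq -c only collapses adjacent identical lines, so this relies on the log being in time order: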

 cut -d ' ' -f1 filename | uniq -c

gives

  2 13:48
  3 13:49

If you don't need to know what happened, only how many times, then:

$ python3.1 -c'from collections import Counter
import fileinput
c = Counter(line.split(None, 1)[0] for line in fileinput.input() if line.strip())
print(c)' events.txt

Output:

Counter({'13:49': 3, '13:48': 2})

You can also use the groupby function from the itertools module, with the time as the grouping key.

>>> import itertools
>>> from operator import itemgetter
>>> lines = (line.strip().split(None, 1) for line in open('log.txt'))
>>> for key, group in itertools.groupby(lines, key=itemgetter(0)):
...     print('%s - %s' % (key, list(map(itemgetter(1), group))))
... 
13:48 - ['An event happened.', 'Another event happened.']
13:49 - ['A new event happened.', 'A random event happened.', 'An event happened.']
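One thing to keep in mind with groupby is that it only groups consecutive items with the same key, so it works above because the log is already in time order. A small sketch of what happens otherwise, and of sorting first as a workaround; the shuffled lines below are made up for illustration:

```python
from itertools import groupby
from operator import itemgetter

# Illustrative lines that are NOT in chronological order.
lines = [
    "13:49 A new event happened.",
    "13:48 An event happened.",
    "13:49 A random event happened.",
]
pairs = [line.split(None, 1) for line in lines]

# Without sorting, '13:49' produces two separate groups
# because its lines are not adjacent.
unsorted_keys = [key for key, _ in groupby(pairs, key=itemgetter(0))]

# list.sort is stable, so messages keep their order within a minute.
pairs.sort(key=itemgetter(0))
grouped = {key: [message for _, message in group]
           for key, group in groupby(pairs, key=itemgetter(0))}

print(unsorted_keys)
print(grouped)
```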

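Or with awk, counting lines per timestamp in an associative array (the order in which the keys are printed is not guaranteed):

 awk '{_[$1]++}END{for(i in _) print i,_[i]}' filename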
