
How can I optimize this code?

I'm developing a logger daemon for Squid that stores its logs in a MongoDB database, but I'm seeing too much CPU utilization. How can I optimize this code?


from sys import stdin

from pymongo import Connection

connection = Connection()
db = connection.squid
logs = db.logs
buffer = []
a = 'timestamp'
b = 'resp_time'
c = 'src_ip'
d = 'cache_status'
e = 'reply_size'
f = 'req_method'
g = 'req_url'
h = 'username'
i = 'dst_ip'
j = 'mime_type'
L = 'L'

while True:
    l = stdin.readline()
    if l[0] == L:
        l = l[1:].split()
        buffer.append({
            a: float(l[0]),
            b: int(l[1]),
            c: l[2],
            d: l[3],
            e: int(l[4]),
            f: l[5],
            g: l[6],
            h: l[7],
            i: l[8],
            j: l[9]
            }
        )
    if len(buffer) == 1000:
        logs.insert(buffer)
        buffer = []

    if not l:
        break

connection.disconnect()


This might be a better question for a Python profiler. There are a few built-in Python profiling modules, such as cProfile; the Python documentation describes them in more detail.
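As a rough illustration of the profiling suggestion, here is a minimal sketch that runs the parsing step under cProfile and returns the report. The parse_line helper, the FIELDS tuple, and the SAMPLE line are made up for this example (they mirror the field order in the question but are not the asker's code), and the database insert is left out on purpose.

```python
import cProfile
import io
import pstats

# Field names in the order the question's code reads them.
FIELDS = ('timestamp', 'resp_time', 'src_ip', 'cache_status', 'reply_size',
          'req_method', 'req_url', 'username', 'dst_ip', 'mime_type')

# Made-up sample line; real Squid output depends on the configured logformat.
SAMPLE = ('L1300000000.123 45 10.0.0.1 TCP_MISS/200 1234 GET '
          'http://example.com/ - 93.184.216.34 text/html\n')


def parse_line(line):
    """Hypothetical helper: turn one 'L...' log line into a dict."""
    parts = line[1:].split()
    record = dict(zip(FIELDS, parts))
    record['timestamp'] = float(record['timestamp'])
    record['resp_time'] = int(record['resp_time'])
    record['reply_size'] = int(record['reply_size'])
    return record


def profile_parsing(lines):
    """Parse every line under cProfile; return (records, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    records = [parse_line(line) for line in lines]
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Show the ten entries with the highest cumulative time.
    stats.sort_stats('cumulative').print_stats(10)
    return records, out.getvalue()


if __name__ == '__main__':
    _, report = profile_parsing([SAMPLE] * 100000)
    print(report)
```

To profile the whole daemon instead of one function, running the script as python -m cProfile -s cumulative followed by the script name and feeding it a saved log file on stdin should give a similar table.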


I suspect readline() itself may be what is driving up CPU utilization. Try running the same code with the readline() call replaced by a constant buffer that you provide, and then try running it with the database inserts commented out. That will establish which of the two is the culprit.
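One way to run that experiment, sketched under some assumptions: the input comes from an in-memory buffer instead of stdin, no database is touched, and the reading and parsing stages are timed separately. The SAMPLE line and the function names are invented for the illustration.

```python
import io
import time

# Made-up sample line in the field order the question's code expects.
SAMPLE = ('L1300000000.123 45 10.0.0.1 TCP_MISS/200 1234 GET '
          'http://example.com/ - 93.184.216.34 text/html\n')


def read_only(stream):
    """Call readline() until EOF without parsing; return the line count."""
    count = 0
    while True:
        line = stream.readline()
        if not line:          # empty string means end of input
            break
        count += 1
    return count


def parse_only(lines):
    """Parse lines that are already in memory; return how many were parsed."""
    count = 0
    for line in lines:
        if line.startswith('L'):
            fields = line[1:].split()
            # Same conversions as the question's loop, results discarded.
            float(fields[0])
            int(fields[1])
            int(fields[4])
            count += 1
    return count


def timed(func, *args):
    """Return (result, elapsed seconds) for one call of func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == '__main__':
    n = 200000
    _, t_read = timed(read_only, io.StringIO(SAMPLE * n))
    _, t_parse = timed(parse_only, [SAMPLE] * n)
    print('readline only: %.3fs' % t_read)
    print('parsing only:  %.3fs' % t_parse)
```

If neither stage accounts for the load, the inserts are the next thing to time in the same way.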


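The CPU usage comes from that busy while True loop. How many lines per minute are you getting? Move the

if len(buffer) == 1000:
    logs.insert(buffer)
    buffer = []

check so that it sits right after the buffer.append call, inside the same if block. That way it only runs when a line has actually been added to the buffer.

I can tell you more once you let me know how many insertions you are getting so far.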
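Here is a sketch of what the loop could look like with the check moved after the append. It is not the asker's code: insert_batch is a hypothetical callable standing in for logs.insert, the loop iterates over the stream so it stops cleanly at EOF, and it also flushes any leftover records at the end, which the original loop does not do. With a current PyMongo release you would presumably pass the collection's insert_many method instead, since Connection and insert are no longer available there.

```python
import io

FIELDS = ('timestamp', 'resp_time', 'src_ip', 'cache_status', 'reply_size',
          'req_method', 'req_url', 'username', 'dst_ip', 'mime_type')

# Made-up sample line in the field order the question's code expects.
SAMPLE = ('L1300000000.123 45 10.0.0.1 TCP_MISS/200 1234 GET '
          'http://example.com/ - 93.184.216.34 text/html\n')


def run(stream, insert_batch, batch_size=1000):
    """Read log lines from stream and hand them to insert_batch in batches.

    insert_batch is a hypothetical stand-in for the database insert call.
    Returns the total number of records passed on.
    """
    buffer = []
    total = 0
    for line in stream:                  # iteration ends at EOF
        if not line.startswith('L'):
            continue                     # skip anything that is not a log line
        parts = line[1:].split()
        record = dict(zip(FIELDS, parts))
        record['timestamp'] = float(record['timestamp'])
        record['resp_time'] = int(record['resp_time'])
        record['reply_size'] = int(record['reply_size'])
        buffer.append(record)
        # The size check runs only after a record was actually appended.
        if len(buffer) >= batch_size:
            insert_batch(buffer)
            total += len(buffer)
            buffer = []
    if buffer:                           # flush the final partial batch
        insert_batch(buffer)
        total += len(buffer)
    return total


if __name__ == '__main__':
    batches = []
    count = run(io.StringIO(SAMPLE * 2500), batches.append)
    print(count, [len(b) for b in batches])
```

In the daemon itself the call would be something like run(sys.stdin, logs.insert), assuming logs is the collection from the question.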
