Faster parsing of a file for all list elements and appending into new files based on the list elements
I am trying to parse a log file that has a thread ID in every line. Any number of threads can be configured, and all of them write to the same log file. I am parsing the log file and creating a new file per thread so I can check each thread later.
Below I am capturing the thread IDs in a list. The code below does the job, but it feels inefficient. Is there anything faster?

import os

sThdiD = ["abc", "cde\"efg"]
folderpath = "newdir"
os.system("mkdir " + folderpath)  # create the output directory
for line in open(filetoopen):
    for i in sThdiD:
        if i in line:
            # Reopens the output file for every matching line.
            open(folderpath + "/" + i + ".log", "a+").write(line)
Assuming you can fit the whole log file into memory, I'd keep a dictionary mapping thread IDs to lines written by that thread, and then write out whole files at the end.
thread_map = {}  # keys are thread IDs; values are lists of log lines
for line in open(filetoopen):
    for i in sThdiD:
        if i in line:
            if i not in thread_map:
                thread_map[i] = []
            thread_map[i].append(line)

for key in thread_map:
    f = open(folderpath + "/" + key + ".log", "w")
    for line in thread_map[key]:
        f.write(line)
    f.close()
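As a side note, the same pass can be written a bit more idiomatically with collections.defaultdict (so you don't need the "is the key present" check) and with blocks (so the files are closed even on error). A minimal sketch, assuming the same sThdiD, filetoopen, and folderpath names as above:

from collections import defaultdict

thread_map = defaultdict(list)  # thread ID -> list of log lines
with open(filetoopen) as in_file:
    for line in in_file:
        for i in sThdiD:
            if i in line:
                thread_map[i].append(line)

for key, lines in thread_map.items():
    with open(folderpath + "/" + key + ".log", "w") as f:
        f.writelines(lines)  # lines already end with newlines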
If you can't keep the whole log file in memory, try a multi-pass solution that rereads the log once per thread ID and writes one output file at a time.
in_file = open(filetoopen)
for i in sThdiD:
    in_file.seek(0)  # Rewind to read from the beginning on each pass.
    out_file = open(folderpath + "/" + i + ".log", "w")
    for line in in_file:
        if i in line:
            out_file.write(line)
    out_file.close()
in_file.close()
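If neither variant fits (too many lines to buffer in memory, and too many thread IDs for repeated passes to be cheap), one single-pass alternative is to keep one open output handle per thread ID, so each line is written as soon as it is matched. A sketch under the same assumptions about sThdiD, filetoopen, and folderpath:

handles = {}  # thread ID -> open output file object
try:
    with open(filetoopen) as in_file:
        for line in in_file:
            for i in sThdiD:
                if i in line:
                    if i not in handles:
                        # Open each output file once, on first use.
                        handles[i] = open(folderpath + "/" + i + ".log", "w")
                    handles[i].write(line)
finally:
    for f in handles.values():
        f.close()

This reads the input exactly once at the cost of one file descriptor per thread ID, so with a very large number of threads you could run into the OS limit on open files.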