How to parse text
I want to read a file and search for any line that contains a session ID (e.g. 12345). If a line matches, I want to print all the lines after it until a newline (an empty line) is encountered. After that, how can I associate all these lines with the session ID if I need to parse them further? I want to do this in Python.
Thanks
This answers the first part of your question:
with open('myfile.txt') as f:
    for line in f:
        # Print every line that contains the session ID.
        if '12345' in line:
            print(line.rstrip('\n'))
I didn't understand what else you were asking for. Can you clarify what you mean by "how can i associate all these lines to the session ID if i need to further parse these lines"?
I am going to assume that your log file is formatted like
session 321: abc de
 567 89 abd ec

session 12345: ghi lm
 763 98 dba ce
and that what you want to do is find the appropriate session and all following lines until you see a blank line.
import collections
import re

sessionData = collections.defaultdict(list)
lookfor = [12345, 13981]
newSession = re.compile(r'session (\d+):')

with open('my_log_file.txt', 'r') as inf:
    session = None
    for ln in inf:
        ln = ln.rstrip()
        if len(ln):
            match = newSession.match(ln)
            if match:
                # group(1) is the captured number; group(0) would be the whole match
                s = int(match.group(1))
                if s in lookfor:
                    session = s
            if session:
                print(ln)
                sessionData[session].append(ln)
        else:
            # a blank line ends the current session
            session = None
sessionData is now a session-keyed dict; for each session, it contains a list of all related lines. Using the above sample data, sessionData would look like
{ 12345: ["session 12345: ghi lm", " 763 98 dba ce"] }
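For the "further parse these lines" part, here is a minimal sketch, assuming the same log format as above and that splitting each line on whitespace is the kind of parsing you need. The names SAMPLE_LOG, collect_sessions and tokenize are made up for this example, and the log is an in-memory string rather than a file so the snippet runs on its own. Unlike the code above, this version strips the "session N:" prefix from the first line of each session.

```python
import re

# Hypothetical sample log, in the format assumed above.
SAMPLE_LOG = """\
session 321: abc de
 567 89 abd ec

session 12345: ghi lm
 763 98 dba ce
"""

SESSION_RE = re.compile(r'session (\d+):\s*(.*)')


def collect_sessions(lines, wanted):
    """Group the lines of each wanted session (up to the next blank line) by ID."""
    data = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            current = None  # a blank line ends the current session
            continue
        match = SESSION_RE.match(line)
        if match:
            sid = int(match.group(1))
            current = sid if sid in wanted else None
            if current is not None:
                # keep only the text after the "session N:" prefix
                data.setdefault(current, []).append(match.group(2))
            continue
        if current is not None:
            data[current].append(line)
    return data


def tokenize(session_lines):
    """Split every collected line into whitespace-separated fields."""
    return [ln.split() for ln in session_lines]


sessions = collect_sessions(SAMPLE_LOG.splitlines(), {12345, 13981})
parsed = {sid: tokenize(lines) for sid, lines in sessions.items()}
print(parsed)
```

Because everything is keyed by the session ID, any later parsing step can look up parsed[12345] and work on that session's fields alone. To read from a real file, pass the open file object to collect_sessions instead of SAMPLE_LOG.splitlines().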