Search for a specific value in a specific column with Python

I've got a tab-delimited text file, and I'm trying to figure out how to search for a value in a specific column of it.

I think I need to use the csv module, but I have been unsuccessful so far. Can someone point me in the right direction?

Thanks!

**Update** Thanks for everyone's replies. I know I could probably use awk for this, but purely for practice I am trying to finish it in Python.

I am getting the following error now: if row.split(' ')[int(searchcolumn)] == searchquery: IndexError: list index out of range

And here is the snippet of my code:

#open the directory and find all the files
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        f=open(file, 'r')
        lines=f.readlines()
        for line in lines:
            #the first 4 lines of the file are crap, skip them
            if linescounter > startfromline:
                with open(file) as infile:
                    for row in infile:
                        if row.split(' ')[int(searchcolumn)] == searchquery:
                            rfile = open(resultsfile, 'a')
                            rfile.writelines(line) 
                            rfile.write("\r\n")
                            print "Writing line -> " + line
                            resultscounter += 1
        linescounter += 1
        f.close()

I am taking both searchcolumn and searchquery as raw_input from the user. I'm guessing the reason I am getting the "list index out of range" error now is that the file is not being parsed correctly?

Thanks again.


You can also use the sniffer (example taken from http://docs.python.org/library/csv.html)

import csv

csvfile = open("example.csv", "rb")
dialect = csv.Sniffer().sniff(csvfile.read(1024))
csvfile.seek(0)
reader = csv.reader(csvfile, dialect)
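For anyone on Python 3, here is a rough sketch of the same sniffing idea (the snippet above is Python 2 style). The helper name `sniff_and_read`, the candidate delimiter string, and the sample text are all made up for illustration. Passing `delimiters` to `sniff` restricts the guess to characters you consider plausible, which may help when ordinary letters happen to appear equally often on every line.

```python
import csv
import io

def sniff_and_read(text, candidates="\t,;"):
    """Guess the delimiter from the start of `text`, then parse every row.

    `candidates` is an assumed set of plausible delimiters; adjust as needed.
    """
    dialect = csv.Sniffer().sniff(text[:1024], delimiters=candidates)
    reader = csv.reader(io.StringIO(text), dialect)
    return dialect.delimiter, list(reader)

# Hypothetical tab-delimited sample standing in for a real file's contents.
sample = "name\tcolor\tcount\napple\tred\t3\nplum\tpurple\t7\n"
delimiter, rows = sniff_and_read(sample)
```

If you already know the file is tab-delimited, sniffing is probably unnecessary and setting the delimiter explicitly is simpler.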


Yes, you'll want to use the csv module, and you'll want to set delimiter to '\t':

import csv
spamReader = csv.reader(open('spam.csv', 'rb'), delimiter='\t')

After that you should be able to iterate:

for row in spamReader:
   print row[n]
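Putting that together for the original question, a minimal sketch in Python 3 (the snippets above are Python 2) might look like the following. The function name `find_rows`, its `skip` parameter, and the in-memory sample data are assumptions for illustration, not anything from the asker's code.

```python
import csv
import io

def find_rows(lines, column, query, skip=0):
    """Yield rows whose value in `column` (0-based) equals `query`.

    `lines` is any iterable of text lines, for example an open file.
    The first `skip` lines are ignored, and rows too short to contain
    `column` are skipped instead of raising IndexError.
    """
    reader = csv.reader(lines, delimiter="\t")
    for index, row in enumerate(reader):
        if index < skip:
            continue
        if len(row) > column and row[column] == query:
            yield row

# In-memory stand-in for a real file opened with open(path, newline="").
data = io.StringIO("junk header\nid\tname\n1\tspam\n2\teggs\n3\tspam\n")
matches = list(find_rows(data, column=1, query="spam", skip=2))
```

Since the column number comes from user input as a string, it would need converting with `int()` before being passed in.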


This prints all rows in filename with 'myvalue' in the fourth tab-delimited column:

with open(filename) as infile:
    for row in infile:
        if row.split('\t')[3] == 'myvalue':
            print row

Replace 3, 'myvalue', and print as appropriate.
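Two caveats with plain `split`, sketched below in Python 3 (the snippet above is Python 2). First, each line read from a file keeps its trailing newline, so if the column you test is the last one, the field is `'myvalue\n'` and the comparison fails unless you strip it. Second, splitting on `' '` instead of `'\t'` leaves a tab-delimited line as a single field, which is the likely cause of the IndexError in the question. The function name `matching_lines` and the sample lines are made up for illustration.

```python
def matching_lines(lines, column, value):
    """Return the lines whose tab-separated field `column` (0-based) equals `value`."""
    found = []
    for line in lines:
        # Strip the line ending first so the last field compares cleanly.
        fields = line.rstrip("\r\n").split("\t")
        # Guard against short lines rather than raising IndexError.
        if len(fields) > column and fields[column] == value:
            found.append(line)
    return found

# Hypothetical sample lines, as they would come from iterating over a file.
lines = ["a\tb\tc\tmyvalue\n", "a\tb\tc\tother\n", "too\tshort\n"]
result = matching_lines(lines, 3, "myvalue")

# What goes wrong without the fixes above:
last_field_unstripped = lines[0].split("\t")[3]   # still carries the newline
fields_split_on_space = lines[0].split(" ")       # no spaces, so one big field
```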
