Once again, how to get nested loops to work in Python
Can someone help me with this nested loop? It has the same problem as "Loops not working - Strings (Python)", but now it uses a csv reader, which doesn't have a csv.readline() function.
import csv
import sys, re
import codecs
reload(sys)
sys.setdefaultencoding('utf-8')
reader = csv.reader(open("reference.txt"), delimiter = "\t")
reader2 = csv.reader(open("current.txt"), delimiter = "\t")
for line in reader:
    for line2 in reader2:
        if line[0] == line2[1]:
            print line2[0] + '\t' + line[0]
            print line[1]
        else:
            print line[0]
            print line[1]
The purpose of this code is to find the lines in the reference text (i.e. reader) that coincide with lines in the current text file (i.e. reader2), and then print the serial number from reference.txt.
reference.txt looks like this (the space between the serial no. and the sentence is a tab):
S00001LP this is a nested problem
S00002LP that cannot be solved
S00003LP and it's pissing me off
S00004LP badly
current.txt looks like this (the space between the 1st and 2nd sentence is a tab):
this is a nested problem wakaraa pii ney bay tam
and i really can't solve it shuu ipp faa luiip
so i come to seek help from stackoverflow lakjsd sdiiije
seriously it is crazy because such foo bar bar foo
problems don't happen in other languages whaloemver ahjd
and it's pissing me off gaga ooo mama
badly wahahahah
The required output will look something like this:
S00001LP this is a nested problem wakaraa pii ney bay tam
and i really can't solve it shuu ipp faa luiip
so i come to seek help from stackoverflow lakjsd sdiiije
seriously it is crazy because such foo bar bar foo
problems don't happen in other languages whaloemver ahjd
S00003LP and it's pissing me off gaga ooo mama
S00004LP badly wahahahah
You can only read from a stream once. Your inner loop consumes the whole second file during the first iteration of the outer loop, so the later iterations of the outer loop have nothing left to read.
Try changing this:
reader = csv.reader(open("reference.txt"), delimiter = "\t")
reader2 = csv.reader(open("current.txt"), delimiter = "\t")
to this:
reader = list(csv.reader(open("reference.txt"), delimiter = "\t"))
reader2 = list(csv.reader(open("current.txt"), delimiter = "\t"))
The list() call will read the file in its entirety, creating an in-memory list from it, which you can then iterate over as many times as you like.
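To see the difference, here is a minimal sketch written for Python 3 (where print is a function, unlike the Python 2 code in the question). The in-memory `data` string is a made-up stand-in for current.txt, and `io.StringIO` is used only so the example runs without any files on disk.

```python
import csv
import io

# Hypothetical tab-separated content standing in for current.txt.
data = "a\tx\nb\ty\n"

# A csv.reader is a one-shot iterator: the second pass finds nothing.
rows = csv.reader(io.StringIO(data), delimiter="\t")
first_pass = [row for row in rows]
second_pass = [row for row in rows]

# Wrapping the reader in list() keeps the rows in memory,
# so they can be iterated over as often as needed.
rows_list = list(csv.reader(io.StringIO(data), delimiter="\t"))
third_pass = [row for row in rows_list]
fourth_pass = [row for row in rows_list]

print(len(first_pass), len(second_pass))
print(len(third_pass), len(fourth_pass))
```

The first print shows a full first pass and an empty second pass; the second print shows two full passes over the list.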
A better solution would be to store your reference data in a dictionary so that you don't have to loop over it for every line in your data.
One approach is to create a dictionary mapping your keys to serial numbers:
serials = dict(map(reversed, reader))
for line in reader2:
    serial = serials.get(line[0])
    if serial is not None:
        print serial
This will be much faster than a nested loop.
The first line creates the dictionary mapping keys to serial numbers. Since the dictionary constructor expects an iterable of (key, value) pairs while your file actually contains (value, key) pairs, we have to swap the two entries in each record; this is what map(reversed, ...) does.
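Putting this together, a rough Python 3 sketch of the dictionary approach that produces output in the shape the question asks for might look like the following. The two `io.StringIO` objects are hypothetical stand-ins for reference.txt and current.txt (shortened to a few rows), and the `output` list is an illustrative name, not part of the original code. It assumes every reference row has exactly two tab-separated fields.

```python
import csv
import io

# Hypothetical in-memory stand-ins for reference.txt and current.txt.
reference = io.StringIO(
    "S00001LP\tthis is a nested problem\n"
    "S00003LP\tand it's pissing me off\n"
)
current = io.StringIO(
    "this is a nested problem\twakaraa pii ney bay tam\n"
    "and i really can't solve it\tshuu ipp faa luiip\n"
    "and it's pissing me off\tgaga ooo mama\n"
)

# Each reference row is [serial, sentence]; reversing it gives
# (sentence, serial), so the dictionary maps sentence -> serial.
serials = dict(map(reversed, csv.reader(reference, delimiter="\t")))

output = []
for row in csv.reader(current, delimiter="\t"):
    serial = serials.get(row[0])
    if serial is not None:
        # Prefix matching lines with their serial number.
        row = [serial] + row
    output.append("\t".join(row))

print("\n".join(output))
```

Each line of current.txt is read once and looked up in the dictionary, so the reference data is never looped over again.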