
Problem with nested for loops

I have to read two CSV files, combine their rows, and write the result to a third CSV file. The first CSV file has five rows, with the user name in the first column (25 columns in total). The second CSV file has five rows, with the user name in the first column and the user ID in the second column (only 2 columns).

The third CSV file will contain the user name, the user ID, and the remaining 24 columns of the first file.

data = open(os.path.join("c:\\transales","AccountID+ContactID-source1.csv"),"rb").read().replace(";",",").replace("\0","")
data2 = open(os.path.join("c:\\transales","AccountID+ContactID-source2.csv"),"rb").read().replace(";",",").replace("\0","")

i = 0
j = 0
Info_Client_source1=StringIO.StringIO(data)
Info_Client_source2=StringIO.StringIO(data2)


for line in csv.reader(Info_Client_source1):
    name= line[1]
    i=i+1
    print "i= ",i
    for line2 in csv.reader(Info_Client_source2):
        print "j = :",j
        j=j+1
        if line[1] == line2[2]:
            continue

the result:

i=  1
j = : 0
j = : 1
j = : 2
j = : 3
j = : 4
j = : 5
j = : 6
i=  2
i=  3
i=  4
i=  5
i=  6
i=  7

Why does the second for loop do nothing from i=2 onward? I expected i=2 with j=0 to 6, then i=3 with j=0 to 6, and so on.


It's because you read off the entire contents of your StringIO object in the first pass, leaving the cursor at the end of the string. On the second pass, there's nothing left to read, so you end up with an empty reader.
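To make the exhaustion visible, here is a minimal sketch written for Python 3 (the code in this thread is Python 2). The sample data is made up and held in an in-memory `io.StringIO` buffer instead of the real files:

```python
import csv
import io

# Hypothetical sample data standing in for the second CSV file.
buf = io.StringIO("alice,1\nbob,2\n")

first_pass = list(csv.reader(buf))   # consumes the buffer up to its end
second_pass = list(csv.reader(buf))  # the cursor is already at the end

print(first_pass)   # [['alice', '1'], ['bob', '2']]
print(second_pass)  # []
```

The second reader is a brand new object, yet it yields nothing, because the position lives in the underlying buffer and not in the reader.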

Also, it's probably not a great idea to call csv.reader() for every inner iteration of your loop. Let me rephrase your code and then explain my changes:

import csv
import os

# Open the files directly; this assumes they are ';'-delimited and contain no NUL bytes.
data = open(os.path.join("c:\\transales","AccountID+ContactID-source1.csv"),"rb")
data2 = open(os.path.join("c:\\transales","AccountID+ContactID-source2.csv"),"rb")

source1 = csv.reader(data, delimiter=";")
source2 = csv.reader(data2, delimiter=";")

i = 0
j = 0
for line in source1:
    name= line[1]
    i=i+1
    print "i= ",i
    data2.seek(0)  # rewind the file so source2 starts from the first row again
    for line2 in source2:
        print "j = :",j
        j=j+1
        if line[1] == line2[2]:  # column indices kept from the question
            continue

Changes:

  • I've removed the extraneous step of creating a StringIO object; you can just pass a standard file handle to csv.reader() and it'll work fine. (If there's a reason for creating those StringIO objects, feel free to add that back in...)
  • I've moved the initialization of the readers outside the for loop. While it'd be alright for source1 to be initialized in the outer loop, having source2 initialized in the inner loop is pretty inefficient.
  • Most importantly, calling data2.seek(0) resets the cursor on the underlying file handle, which will allow you to read from data2 repeatedly.
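The rewind pattern from the last bullet can be sketched in isolation. This is a Python 3 illustration with made-up in-memory data (`outer_buf` and `inner_buf` are hypothetical stand-ins for the two files), not the asker's actual setup:

```python
import csv
import io

# Hypothetical in-memory stand-ins for the two files.
outer_buf = io.StringIO("alice;x\nbob;y\n")
inner_buf = io.StringIO("alice;1\nbob;2\ncarol;3\n")

outer = csv.reader(outer_buf, delimiter=";")
inner = csv.reader(inner_buf, delimiter=";")

pairs = []
for i, row in enumerate(outer):
    inner_buf.seek(0)  # rewind before every inner pass
    for j, row2 in enumerate(inner):
        pairs.append((i, j))

print(len(pairs))  # 6: two outer rows times three inner rows
```

Without the `seek(0)` call, only the first outer row would see any inner rows.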

Here's a similar question on SO, which might better illustrate the idea:

StackOverflow: Reading from CSVs in Python repeatedly?

Hope it helps. :)


Because once the csv.reader reaches the end of the file, it is exhausted: iterating over it again yields no more rows.

For a small dataset like this you can easily read your data into a list or dictionary and iterate over that.
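A minimal sketch of that idea, assuming Python 3 and made-up sample rows (the column layout is simplified, with the user name in column 0 of both sources):

```python
import csv
import io

# Hypothetical sample data; the real files would be opened with open().
source1 = io.StringIO("alice;a1;a2\nbob;b1;b2\n")
source2 = io.StringIO("bob;102\nalice;101\n")

# Materialize both readers once; lists can be iterated any number of times.
rows1 = list(csv.reader(source1, delimiter=";"))
rows2 = list(csv.reader(source2, delimiter=";"))

matches = []
for row1 in rows1:
    for row2 in rows2:  # runs in full for every row1
        if row1[0] == row2[0]:
            matches.append((row1[0], row2[1]))

print(matches)  # [('alice', '101'), ('bob', '102')]
```

Because `rows2` is a plain list, the inner loop restarts from the first row on every outer iteration with no need to rewind anything.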


More pythonic would be:

import csv
import os

filename1 = os.path.join('c:\\transales', 'AccountID+ContactID-source1.csv')
filename2 = os.path.join('c:\\transales', 'AccountID+ContactID-source2.csv')

with open(filename1, 'rb') as file1, open(filename2, 'rb') as file2:

    csv1 = csv.reader(file1, delimiter=';')
    csv2 = csv.reader(file2, delimiter=';')

    # Map each user name to the remaining columns of the first file.
    lookup = { line[0] : line[1:] for line in csv1 }
    # Prepend user name and user id to those columns, in the order of the second file.
    joined = [ [uname, uid] + lookup[uname] for (uname, uid) in csv2 ]

print joined

(assuming Python version 2.7)

BTW: first column has index 0, not 1.
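The question also asks for the result to be written to a third CSV file. One possible way to add that step, sketched for Python 3 with made-up data and an in-memory `out` buffer in place of a real output file (the `;` output delimiter is an assumption):

```python
import csv
import io

# Hypothetical sample data standing in for the two source files.
file1 = io.StringIO("alice;a1;a2\nbob;b1;b2\n")
file2 = io.StringIO("bob;102\nalice;101\n")

# User name -> remaining columns of the first file.
lookup = {row[0]: row[1:] for row in csv.reader(file1, delimiter=";")}

out = io.StringIO()  # a real script would use open(path, "w", newline="")
writer = csv.writer(out, delimiter=";", lineterminator="\n")
for uname, uid in csv.reader(file2, delimiter=";"):
    if uname in lookup:  # skip user names missing from the first file
        writer.writerow([uname, uid] + lookup[uname])

print(out.getvalue())
```

The dictionary lookup replaces the nested loop entirely, so each file is read exactly once.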


It could be an easy fix: try moving the creation of Info_Client_source2 into the outer loop, so that a fresh StringIO object is built on every pass:

Info_Client_source1=StringIO.StringIO(data)


for line in csv.reader(Info_Client_source1):
    Info_Client_source2=StringIO.StringIO(data2)
    name= line[1]
    i=i+1
    print "i= ",i
    for line2 in csv.reader(Info_Client_source2):
        print "j = :",j
        j=j+1
        if line[1] == line2[2]:
            continue