
Why isn't the empty string being removed from the list?

I'm trying to format a tab-delimited txt file that has rows and columns. I simply want to skip any row that contains an empty value when I write to the output file. My approach is to check len(list): if the length of the list equals the number of columns, the line gets written to the output file. But when I check the length of the lines, they are all the same, even though I removed the empty strings! Very frustrating...

Here's my code:

    import sys, os

    inputFileName = sys.argv[1]
    outputFileName = os.path.splitext(inputFileName)[0]+"_edited.txt"

    try:
        infile = open(inputFileName,'r')
        outfile = open(outputFileName, 'w')
        line = infile.readline()
        outfile.write(line)
        for line in infile:
            lineList = line.split('\t')
            #print lineList
            if '' in lineList:
                lineList.remove('')
            #if len(lineList) < 9:
                #print len(lineList)

                #outfile.write(line)
        infile.close()
        #outfile.close()
    except IOError:
        print inputFileName, "does not exist."

Thanks for any help. When I create an experimental list in the interactive window and use if '' in list: then the empty string gets removed. When I run the script, the empty string is still there!


I don't know any Python, but you don't seem to be checking for whitespace characters. What about \r and \n on top of the \t's? Why not try trimming the line and checking whether it equals ''?
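
To illustrate the whitespace point, here is a small Python 3 sketch with a made-up sample row. Splitting a raw line on tabs leaves the newline attached to the last field, so an empty last column shows up as '\n' rather than '', and a test for '' never matches it.

```python
# Hypothetical sample row: three columns, the last one empty.
raw_line = "a\tb\t\n"

# Splitting the raw line keeps the newline inside the last field.
fields_raw = raw_line.split("\t")

# Strip only the line terminator first (not tabs, which would
# drop trailing empty columns); the empty field then appears as ''.
fields_clean = raw_line.rstrip("\r\n").split("\t")

print(fields_raw)    # ['a', 'b', '\n']
print(fields_clean)  # ['a', 'b', '']
```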


I think that one of your problems is that list.remove only removes the first occurrence of the element. There could still be more empty strings in your list. From the documentation:

Remove the first item from the list whose value is x. It is an error if there is no such item.

To remove all the empty strings from your list you could use a list comprehension instead.

    lineList = [x for x in lineList if x]

or filter with the identity function (by passing None as the first argument):

    lineList = filter(None, lineList)
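
A short sketch of the difference, using a made-up list and written for Python 3. Note that in Python 3 filter returns an iterator rather than a list, so it is wrapped in list() here.

```python
# Made-up row with two empty fields.
line_list = ["a", "", "b", "", "c"]

# list.remove drops only the first matching item.
once = list(line_list)
once.remove("")

# A list comprehension drops every empty string.
comprehension = [x for x in line_list if x]

# filter(None, ...) keeps only truthy items; list() is needed in Python 3.
filtered = list(filter(None, line_list))

print(once)           # ['a', 'b', '', 'c']
print(comprehension)  # ['a', 'b', 'c']
print(filtered)       # ['a', 'b', 'c']
```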


The following does what you're asking in fewer lines of code and, thanks to the strip() call, also drops lines that contain nothing but whitespace of any kind.

    #!/usr/bin/env python

    import sys, os

    inputFileName = sys.argv[1]
    outputFileName = os.path.splitext(inputFileName)[0]+"_edited.txt"

    try:
        infile = open(inputFileName,'r')
        outfile = open(outputFileName, 'w')

        for line in infile.readlines():
            if line.strip():
                outfile.write(line)

        infile.close()
        outfile.close()
    except IOError:
        print inputFileName, "does not exist."

EDIT: For clarity, this reads each line of the input file then strips the line of leading and trailing whitespace (tabs, spaces, etc.) and writes the non-empty lines to the output file.
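
Note that line.strip() only catches lines that are blank as a whole. If the goal is the one stated in the question, skipping every row that has at least one empty field, a sketch along these lines may be closer. It is written for Python 3; has_empty_field and copy_complete_rows are hypothetical helper names, the sample rows are made up, and treating whitespace-only fields as empty is an assumption.

```python
def has_empty_field(line):
    """Return True if any tab-separated field is empty or whitespace-only."""
    # Remove only the line terminator so trailing empty columns survive.
    fields = line.rstrip("\r\n").split("\t")
    return any(not field.strip() for field in fields)


def copy_complete_rows(lines):
    """Keep the first line (header) and every later row with no empty field."""
    kept = []
    for index, line in enumerate(lines):
        if index == 0 or not has_empty_field(line):
            kept.append(line)
    return kept


# Made-up sample data: a header, one complete row, two rows with a
# missing value, and one blank line.
rows = [
    "id\tname\tscore\n",
    "1\tann\t9\n",
    "2\t\t7\n",
    "3\tbob\t\n",
    "\n",
]
result = copy_complete_rows(rows)
print(result)  # ['id\tname\tscore\n', '1\tann\t9\n']
```

To apply this to files, the open input file can be passed straight to copy_complete_rows and the returned list handed to outfile.writelines.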
