Python: Rearranging csv lines with conditions

2023-03-06 19:01

If I have a few csv files as follows:

a,1,2,3
type, max, min, avg
b,4,5,6
<empty line>
c,6,7,8
xxx,4,3,2
d,5,6,7

After reading the above csv file with csv.reader, how do I dump the data to a new xls file using the xlwt module, arranged so that it:

- writes the header as the 1st line (this line always has 'type' as its 1st item)
- ignores empty lines
- ignores any line whose 1st item is 'xxx'

I tried the following code, but the empty line doesn't disappear. :(

for filename in glob.glob(p):
     (f_path, f_name) = os.path.split(filename)
     (f_short_name, f_extension) = os.path.splitext(f_name)
     ws = wb.add_sheet(str(f_short_name))
     spamReader = csv.reader(open(filename, 'rb'))

     for row in spamReader:
         pass_count = 0
         if 'type' in row[0]:
             for col in range(len(row)):
                 ws.write(0,col,convert(row[col]))
         else:
             if (row[0] == 'xxx') or (row[0] == ' '):
                 pass_count += 1
                 pass
             else:
                 for col in range(len(row)):
                     ws.write(row_count,col,convert(row[col]))

         row_count = row_count+1-pass_count


wb.save(q)

EDIT: Guys, apologies for the misleading initial csv data. My csv data file doesn't contain any empty lines. It is the end product, i.e. the final xls file, that contains the empty line. The empty line appears exactly at the position of the input row whose row[0] is 'type'. Also, I have added code that prevents line 1 from being overwritten.

e.g. input:

a,1,2,3
type, max, min, avg
b,4,5,6
c,6,7,8
xxx,4,3,2
d,5,6,7

output xls:

type, max, min, avg
<empty line>
a,1,2,3
b,4,5,6
c,6,7,8
d,5,6,7

Here, the header is written to the 1st row and the line with 'xxx' is ignored, but an empty line appears at row 2, which is where the line with row[0] == 'type' sits in the input file. This is why I brought in pass_count to skip this line, but it does not seem to hit the spot.

Your "empty" line is NOT empty.

If it were empty, row would be [] i.e. an empty list, and the first of all those references to row[0] would cause an exception to be raised. As that hasn't happened, and as you say it fails to match one space, the conclusion is that it must contain some other bunch of whitespace.
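A quick way to check this is to feed csv.reader one truly empty line and one whitespace-only line. A small sketch in Python 3 (the question's code is Python 2; the sample text here is made up for illustration):

```python
import csv
import io

# Hypothetical sample: line 2 is truly empty, line 3 holds only spaces.
sample = "a,1,2,3\n\n   \nb,4,5,6\n"

rows = list(csv.reader(io.StringIO(sample)))

for row in rows:
    # A truly empty line comes back as [], so row[0] would raise IndexError.
    # A whitespace-only line comes back as a one-item list holding the spaces.
    print(repr(row))
```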

For robustness in the case of a truly empty line, you should test for empty row first:

if not row: continue
row = [x.strip() for x in row] # remove leading and trailing whitespace from all fields
if not row[0] or row[0] == "xxx": continue
if row[0] == "type":
    # code to write headings goes here
else:
    # code to write data goes here
row_count += 1 # pass_count is pointless
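The filtering part of that skeleton could be pulled into a small generator so it can be checked on its own. This is only a sketch in Python 3; the names clean_rows and skip_marker are made up here, and the sample data is the question's input with an empty line and a spaces-only line added:

```python
import csv
import io


def clean_rows(lines, skip_marker="xxx"):
    """Yield stripped rows, dropping empty rows, rows whose first
    field is blank, and rows whose first field equals skip_marker."""
    for row in csv.reader(lines):
        if not row:  # truly empty line: csv.reader gives []
            continue
        row = [field.strip() for field in row]
        if not row[0] or row[0] == skip_marker:
            continue
        yield row


# Hypothetical sample based on the question's data.
sample = "a,1,2,3\ntype, max, min, avg\nb,4,5,6\n\n   \nxxx,4,3,2\nd,5,6,7\n"
cleaned = list(clean_rows(io.StringIO(sample)))
```

With this in place, the loop that writes to the sheet only ever sees the header row and real data rows.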

By the way, your example data has "type" in the second input line. This would overwrite the first row in the output file, and the 2nd output file row would be empty!

Update in response to revised information

Even if you don't have an empty line in this current file, it is very good practice to guard against completely empty lines, lines with only whitespace, and fields which consist only of whitespace, using code similar to what I suggested. A blank or empty line at the end of a csv file is not uncommon.

I should have mentioned that you seem to have an antique version of xlwt; later versions will raise an exception e.g. Attempt to overwrite cell: sheetname=u'x' rowx=0 colx=0. This was introduced to trap unintentional overwriting, and can be turned off on a sheet-by-sheet basis: workbook.add_sheet(u'Some Sheet Name', cell_overwrite_ok=True)

Your use of str() in ws = wb.add_sheet(str(f_short_name)) is definitely unnecessary and may cause an exception if the file name is already a unicode object.

If a line were empty, I don't think this would be true:

row[0] == ' '  # A space?

I would expect row to be an empty list, but perhaps there's something in your data that I don't know about.

Also, instead of sorting out the logic to keep track of row_count and then subtracting pass_count, why not keep things simple and just increment row_count whenever you write another Excel row? That would make pass_count unnecessary.
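A minimal sketch of that idea in Python 3, assuming the rows have already been filtered and that row 0 of the sheet is reserved for the header. The names write_rows and FakeSheet are made up for illustration; FakeSheet only stands in for an xlwt worksheet, of which nothing but a write(row, col, value) method is used:

```python
def write_rows(ws, rows, header_key="type"):
    """Write the header to row 0 and data rows from row 1 onward.

    Returns the index of the next free row.
    """
    row_count = 1  # row 0 is reserved for the header
    for row in rows:
        if row[0] == header_key:
            target = 0  # header: row_count is left alone
        else:
            target = row_count
            row_count += 1  # advance only when a data row is written
        for col, value in enumerate(row):
            ws.write(target, col, value)
    return row_count


class FakeSheet:
    """Hypothetical stand-in for a worksheet; records write() calls."""

    def __init__(self):
        self.cells = {}

    def write(self, r, c, value):
        self.cells[(r, c)] = value


sheet = FakeSheet()
next_row = write_rows(sheet, [["a", "1"], ["type", "max"], ["b", "4"]])
```

Because the counter moves only in the data branch, the header can appear anywhere in the input without leaving a gap in the output.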
