Is there a way to keep lines from being skipped when using csv.DictWriter.writerow(somerow)?
I am processing some files and want to create a log of what I am processing. I created the log by using a dictionary to hold the keys and values for each observation, and then appending each dictionary to a list (a list of dictionaries).
To save the log I am using Python's csv module to write out the list of dictionaries. Initially I was using writerows, but I ran into a problem: very infrequently, some of the values I am storing contain non-ASCII characters.
For example:
Investee\xe2\x80\x99s Share of Profits
My solution was to iterate through my list of dictionaries, using try/except statements to skip over the problem dictionaries:
for docnumb, item in enumerate(x[1]):
    try:
        dict_writer.writerow(item)
    except UnicodeEncodeError:
        # remember which rows could not be written
        missed.append(docnumb)
However, this leads to an extra blank row being inserted after each line of the output csv file:
value1 value2 value3 etc . . .
#blank row
value1 value2 value3 etc
I can't see how to suppress this behavior.
Here is a little more code, to make it clearer how I got here:
import csv

keyset = set([])
for item in x[1]:
    keyset |= set(item.keys())
keys = list(keyset)
logref = open(r'c:\December_2010_File_list.csv', 'w')
dict_writer = csv.DictWriter(logref, keys)
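A note on the blank rows themselves: a likely cause, separate from the encoding problem, is that the log file is opened in text mode ('w') on Windows. The csv writer already ends each record with "\r\n", and a Windows text-mode file then expands the "\n" a second time. The sketch below only illustrates that effect; it is written for Python 3, the sample rows are hypothetical, and the translation step is simulated with a string replace rather than a real Windows file.

```python
import csv
import io

# Hypothetical rows standing in for the log dictionaries in the question.
rows = [
    {"value1": "a", "value2": "b"},
    {"value1": "c", "value2": "d"},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["value1", "value2"])
for row in rows:
    writer.writerow(row)

# The default "excel" dialect terminates every record with "\r\n" on its own.
raw = buffer.getvalue()
print(repr(raw))

# Simulation (an assumption, not a real file): a Windows text-mode stream
# turns each "\n" into "\r\n" on write, so every record ends in "\r\r\n".
translated = raw.replace("\n", "\r\n")
print(repr(translated))
```

Spreadsheet programs tend to display the doubled carriage return as an empty row. If that is what is happening here, opening the file in binary mode ('wb') on Python 2, or with newline='' on Python 3, should avoid it.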
See the documentation at http://docs.python.org/library/csv.html#csv-examples
They give a UnicodeWriter class as follows:
import csv, codecs, cStringIO

class UnicodeWriter:
    """
    A CSV writer which will write rows to CSV file "f",
    which is encoded in the given encoding.
    """

    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        # Redirect output to a queue
        self.queue = cStringIO.StringIO()
        self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
        self.stream = f
        self.encoder = codecs.getincrementalencoder(encoding)()

    def writerow(self, row):
        self.writer.writerow([s.encode("utf-8") for s in row])
        # Fetch UTF-8 output from the queue ...
        data = self.queue.getvalue()
        data = data.decode("utf-8")
        # ... and reencode it into the target encoding
        data = self.encoder.encode(data)
        # write to the target stream
        self.stream.write(data)
        # empty queue
        self.queue.truncate(0)

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)
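The UnicodeWriter recipe above is Python 2 code (it relies on cStringIO and on encoding byte strings by hand). On Python 3 the csv module works with text directly, so a rough equivalent for a list of dictionaries might look like the sketch below. The function name write_log, the sample rows and the temporary file path are all hypothetical, chosen only for illustration.

```python
import csv
import os
import tempfile


def write_log(path, rows):
    """Write a list of dicts to a UTF-8 CSV file and return the column names."""
    # Union of all keys, sorted so the column order is stable between runs.
    fieldnames = sorted(set().union(*(row.keys() for row in rows)))
    # encoding="utf-8" lets non-ASCII values through; newline="" leaves the
    # line endings entirely to the csv module.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return fieldnames


# Hypothetical log entries; the first contains a right single quotation mark.
rows = [
    {"file": "a.htm", "caption": "Investee\u2019s Share of Profits"},
    {"file": "b.htm", "pages": 12},
]
path = os.path.join(tempfile.mkdtemp(), "log.csv")
columns = write_log(path, rows)

# Read the raw bytes back to inspect the encoding and the line endings.
with open(path, "rb") as handle:
    data = handle.read()
```

With the file opened this way there is no need for a try/except around writerow to skip non-ASCII rows, and dictionaries that lack some keys get an empty cell through restval.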