Problem due to double quote while parsing csv.
I have a CSV file in the following format:
"1";"A";"A:"61 B &amp; BA";"C"
This is my code to read the CSV file:
import csv

# Python 2: the csv module expects the file opened in binary mode
with open(path, 'rb') as f:
    reader = csv.reader(f, delimiter=';', quotechar='"')
    for row in reader:
        print row
The problem is that it breaks the row into 5 fields:
['1', 'A', 'A:61 B &amp', ' BA', 'C']
Whereas I was expecting my output to be,
['1', 'A', 'A:61 B &amp; BA', 'C']
When I remove the double quote before 61 B in the CSV file, I get this output:
['1', 'A', 'A:61 B &amp; BA', 'C']
which is perfectly fine. But why does a double quote in the middle of the field cause a problem even though the delimiter and quotechar have been defined?
Your CSV file is invalid. If a quote occurs inside a (quoted) string, it must be escaped by doubling it.
"1";"A";"A:""61 B &amp; BA";"C"
would result in
['1', 'A', 'A:"61 B &amp; BA', 'C']
How should the CSV module guess the difference between quotes that delimit an item and quotes within the item?
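To see the difference concretely, here is a minimal Python 3 sketch (the question's code is Python 2). It assumes the raw line contains the entity `&amp;`, which is what makes a semicolon appear inside the third field; `parse` is a hypothetical helper, not part of the csv module. It also shows that passing `strict=True` makes the reader raise `csv.Error` on the malformed line instead of silently guessing.

```python
import csv

def parse(line, **kwargs):
    """Hypothetical helper: parse one semicolon-delimited line into a list of fields."""
    return next(csv.reader([line], delimiter=';', quotechar='"', **kwargs))

# Assumed raw content of the file: the stray quote before 61 is not escaped.
malformed = '"1";"A";"A:"61 B &amp; BA";"C"'
# Same line with the inner quote escaped by doubling it.
escaped = '"1";"A";"A:""61 B &amp; BA";"C"'

malformed_row = parse(malformed)
escaped_row = parse(escaped)

print(len(malformed_row))  # the stray quote ends the quoted field early
print(escaped_row)

# In strict mode the reader refuses the malformed line instead of guessing.
try:
    parse(malformed, strict=True)
    strict_error = None
except csv.Error as exc:
    strict_error = exc
print(strict_error)
```

In the default (non-strict) mode, the unescaped quote closes the quoted section, so the semicolon inside `&amp;` is then treated as a field delimiter.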
I suspect the double-quote should be replaced by &quot;.
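A rough Python 3 sketch of that idea, assuming the field values are meant to be HTML-encoded before they are written: `html.escape` turns the inner quote into `&quot;`, so no bare quote is left inside the field. The sample values in `raw_fields` are taken from the question; the variable names are only illustrative.

```python
import csv
import html
import io

# Field values as they should look after decoding (assumed from the question).
raw_fields = ['1', 'A', 'A:"61 B & BA', 'C']

# Encode quotes and ampersands as HTML entities before writing.
encoded = [html.escape(field, quote=True) for field in raw_fields]

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=';', quotechar='"',
                    quoting=csv.QUOTE_ALL, lineterminator='\n')
writer.writerow(encoded)
line = buffer.getvalue()
print(line)

# Reading the line back gives four fields; unescape to recover the originals.
parsed = next(csv.reader(io.StringIO(line), delimiter=';', quotechar='"'))
decoded = [html.unescape(field) for field in parsed]
print(decoded)
```

The entities contain semicolons, but since every field is written inside quotes the reader does not treat them as delimiters.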
You defined a delimiter that is also used in your text: the ampersand entity (&amp;) contains a semicolon. I'd recommend changing your delimiter to something that will not show up in the text, such as a pipe character.
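One way to check what a different delimiter changes is a small Python 3 sketch like the one below. The pipe-delimited line is an assumption about how the file would look after the switch, and `parse` is a hypothetical helper. For comparison, it also parses the semicolon version without the stray quote, as tried in the question.

```python
import csv

def parse(line, delimiter):
    """Hypothetical helper: parse one line with the given delimiter."""
    return next(csv.reader([line], delimiter=delimiter, quotechar='"'))

# Assumed pipe-delimited version of the original line, stray quote included.
piped = '"1"|"A"|"A:"61 B &amp; BA"|"C"'
piped_row = parse(piped, '|')
print(piped_row)

# Semicolon-delimited line with the stray quote removed, as in the question.
quoted_ok = '"1";"A";"A:61 B &amp; BA";"C"'
quoted_row = parse(quoted_ok, ';')
print(quoted_row)
```

With the pipe delimiter the row comes back as four fields, but the quotes in the third field still end up misplaced, so the unescaped quote needs fixing either way. A semicolon inside a properly quoted field is read as ordinary text.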