Weird behaviour while storing CSV files

I get this weird issue while storing results in a CSV file.

# Collect every text node under the "summary" element
summary = site.select('.//*[contains(@class, "summary")]/p/text()').extract()

# Join the fragments and encode once (Python 2: the result is a byte str)
description = u"".join(summary).encode('utf-8')

item['Description'] = description

My concern is the Description column. I export these results to CSV. When I open the file in Excel, the results look fine. But when I open it in WordPad, some of the description values have double quotes at the start and end of the string, while others have none.

Any idea what causes this strange behaviour?


That is NOT weird behaviour; it is quite expected. WordPad is a word processor and will not try to interpret your .CSV file any more than it would a .TXT file: each line in the file is treated as just a line of text. Excel, however, interprets commas (or semicolons, depending on the locale) as field delimiters. It also interprets "double quotes" as a mechanism that prevents a field containing delimiters from being chopped up. A CSV writer therefore typically quotes only the fields that need it (those containing a delimiter, a quote character, or a line break), which is why some of your descriptions are quoted and others are not. Example:

File:

Tom, Dick, and Harry,en
Zhang san, Li si,zh
"Tom, Dick, and Harry",en
"Zhang san, Li si",zh

As loaded by Excel:

A                      B      C           D
Tom                    Dick   and Harry   en
Zhang san              Li si  zh
Tom, Dick, and Harry   en
Zhang san, Li si       zh
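
The same effect can be reproduced with Python's standard csv module. This is only a minimal sketch in Python 3, not the asker's Scrapy export code: the sample rows and the names rows, buffer, raw_text and parsed are made up for illustration. It shows that the default quoting mode wraps only the field containing a comma in double quotes, and that a CSV-aware reader recovers the original values.

```python
import csv
import io

# Hypothetical sample rows: only the first field of the first row contains a comma
rows = [
    ["Tom, Dick, and Harry", "en"],
    ["Zhang san", "zh"],
]

# Write to an in-memory buffer; the default quoting mode is csv.QUOTE_MINIMAL
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerows(rows)

# This is the raw text a plain-text viewer such as WordPad would display
raw_text = buffer.getvalue()
print(raw_text)

# A CSV-aware reader (like Excel) strips the quotes and keeps the field whole
parsed = list(csv.reader(io.StringIO(raw_text)))
print(parsed)
```

If every field should be quoted consistently, passing quoting=csv.QUOTE_ALL to csv.writer is one option; the parsed values stay the same either way.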