Problem with Python CSV putting each letter in new field

2023-03-16 16:38 Q&A author:

I'm trying to write a list of URLs, scraped from a webpage using urllib2 and BeautifulSoup, to a csv file. I have tried writing the links to the csv file as unicode and also converted to utf-8. In both cases, each letter is inserted into a new field.

Here's my code (I've tried it at least these two ways):

f = open('filename','wb')
w = csv.writer(f,delimiter=',')
for link in links:
    w.writerow(link['href'])

And:

f = open('filename','wb')
w = csv.writer(f,delimiter=',')
for link in links:
    w.writerow(link['href'].encode('utf-8'))

links is a list that looks like this:

[<a href="#Flyout1" accesskey="2" class="quicklinks" tabindex="1" title="Skip to content">Quick Links: Skip to main page content</a>, <a href="#search" class="quicklinks" tabindex="1" title="Skip to search">Skip to Search</a>, <a href="#News" class="quicklinks" tabindex="1" title="Skip to Section table of contents">Skip to Section Content Menu</a>, <a href="#footer" class="quicklinks" tabindex="1" title="Skip to site options">Skip to Common Links</a>, <a href="http://www.hhs.gov"><img src="/ucm/groups/fdagov-public/@system/documents/system/img_fdagov_hhs_gov.png" alt="www.hhs.gov link" style="width:112px; height:18px;" border="0" /></a>]

Not all the links have an 'href' key, but I check for that in code not shown here. In both cases, the correct strings are written to the csv file, but each letter is in a new field.

Any thoughts?

From the docs: "A row must be a sequence of strings or numbers ..." You are passing a single string, not a sequence of strings, so it treats each letter as an item. Put your string in a list.

So change w.writerow(link['href']) to w.writerow([link['href']]).
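To see the difference side by side, here is a minimal sketch. It uses Python 3 syntax and writes to an in-memory buffer instead of a file; the helper name row_as_text is made up for illustration and is not part of the csv module.

```python
import csv
import io

def row_as_text(row):
    """Write one row with csv.writer and return the resulting text.

    Hypothetical helper for illustration only.
    """
    buf = io.StringIO()
    csv.writer(buf, delimiter=',').writerow(row)
    return buf.getvalue()

url = 'http://www.hhs.gov'

# A bare string is treated as a sequence of one-character fields.
as_string = row_as_text(url)

# A one-element list is treated as a row with a single field.
as_list = row_as_text([url])

print(repr(as_string))
print(repr(as_list))
```

The first call produces one field per character; the second produces the URL as a single field followed by the writer's default line terminator.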

Note: A csv file with a single column looks exactly like a flat text file. Maybe you don't need csv.
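If a plain text file is enough, something like the following sketch would do. It is Python 3 and writes to an in-memory buffer so it can be run as-is; write_lines and the sample urls list are assumptions for illustration, not code from the question. One caveat: the two formats only match while no value contains a comma, a quote character, or a line break, because csv would quote such a value.

```python
import io

def write_lines(out, urls):
    """Write one URL per line to a file-like object (hypothetical helper)."""
    for url in urls:
        out.write(url + '\n')

# Sample values taken from the hrefs shown in the question.
urls = ['#Flyout1', '#search', 'http://www.hhs.gov']

buf = io.StringIO()  # stand-in for open('filename', 'w')
write_lines(buf, urls)
print(buf.getvalue())
```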

I think by "each letter inserted into a new field" you mean something like this, right?

h,t,t,p,:,/,/,w,w,w,.,g,o,o,g,l,e,.,c,o,m

If so, then writerow() is iterating over the characters in your string, and interpreting those as distinct columns. Try using writerow([link['href']]) instead.

Edit: Looks like @Steven Rumbalski beat me to the punch on this!

According to the docs, writerow() takes an iterable object and, iterating over it, writes out the CSV representation of it. Your problem is that a string is itself an iterable object. If I have:

mystring = 'foo'

Python will let me iterate over it like so:

for c in mystring:
    print c

And I'll get:

f
o
o

That's a handy feature, but it's working against you in this case.

You don't want writerow() to iterate over the string; you want it to iterate over a list of strings, so that commas separate the strings and not the characters. In that case you'll want to make a list out of the string like so:

w.writerow([link['href']])
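The same idea extends to the whole loop. As a rough sketch in Python 3, writerows() can take a generator that wraps each href in its own one-element list. The plain dicts below are only stand-ins for the BeautifulSoup tags in the question, and the 'href' check is a guess at the filtering the asker says is done elsewhere. With a real file in Python 3 you would open it as open('filename', 'w', newline='') rather than with 'wb'.

```python
import csv
import io

# Plain dicts standing in for BeautifulSoup <a> tags (an assumption
# for illustration; real tags also support the same ['href'] lookup).
links = [
    {'href': '#Flyout1'},
    {'title': 'no href here'},
    {'href': 'http://www.hhs.gov'},
]

buf = io.StringIO()  # stand-in for a file opened with newline=''
writer = csv.writer(buf, delimiter=',')

# Each row is a one-element list, so each URL lands in a single field.
writer.writerows([link['href']] for link in links if 'href' in link)

result = buf.getvalue()
print(repr(result))
```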
