The best way to search for a string in BeautifulSoup.findAll('a')

Please help me with the following problem. I need to find links that contain some key (a string), and I used this code:

import urllib2, re
from BeautifulSoup import BeautifulSoup

url = 'http://5pd.ru'
page = urllib2.urlopen(url)
soup = BeautifulSoup(page)
print soup.findAll('a')
for link in soup.findAll('a'):
    if '5' in link:
        print link

The loop doesn't print anything.

But this example:

site_list = ['http://extra1.ru/', 'http://5pd.ru/', 'http://google.ru/', 'http://fun.ru/']
for i in site_list:
    if '5' in i: 
        print i

prints the correct link.

I just want to understand the most correct way to check that a link contains my string. Maybe I should do something with the result of soup.findAll('a')?


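link is not a string; it is a Tag object, so '5' in link is not a substring test on the URL. Use link['href'] instead of link inside the for loop, or force conversion to a string with str(link). Note that str(link) covers the whole tag markup, so it also matches a '5' that appears in the link text.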
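A minimal sketch of the href check described above. It assumes Python 3 and the newer bs4 package (where findAll is spelled find_all) rather than the BeautifulSoup 3 import used in the question. The inline HTML and the helper name links_containing are hypothetical, used so the example runs without a network call. It reads the attribute with link.get('href') instead of link['href'] so that <a> tags without an href are skipped rather than raising KeyError.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML standing in for the downloaded page.
html = """
<a href="http://extra1.ru/">extra</a>
<a href="http://5pd.ru/">5pd site</a>
<a name="anchor-without-href">no href here</a>
<a href="http://google.ru/">google</a>
"""

soup = BeautifulSoup(html, "html.parser")


def links_containing(soup, key):
    """Return the href values of <a> tags whose href contains key."""
    matches = []
    for link in soup.find_all("a"):
        href = link.get("href")  # None when the tag has no href attribute
        if href is not None and key in href:
            matches.append(href)
    return matches


found = links_containing(soup, "5")
print(found)
```

Testing the href value directly keeps the match limited to the URL, whereas str(link) would also match text between the opening and closing tags.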


Use findAll() with a regular expression on the href attribute:

for link in soup.findAll('a', href=re.compile('5')):
    print link['href']
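The same filter can be sketched for Python 3 with the bs4 package, which is an assumption here since the thread uses BeautifulSoup 3. The inline HTML and the variable names are hypothetical. re.escape is added so that a key containing regex metacharacters, such as the dot in a domain name, is matched literally.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical inline HTML standing in for the downloaded page.
html = """
<a href="http://extra1.ru/">extra</a>
<a href="http://5pd.ru/">5pd site</a>
<a name="anchor-without-href">no href here</a>
<a href="http://5pdXru/">lookalike</a>
"""

soup = BeautifulSoup(html, "html.parser")

# Escape the key so that "." means a literal dot, not "any character".
key = "5pd.ru"
pattern = re.compile(re.escape(key))

# Passing a compiled pattern as href keeps only tags whose href matches it;
# tags without an href attribute are not returned.
regex_matches = [a["href"] for a in soup.find_all("a", href=pattern)]
print(regex_matches)
```

Without re.escape, the pattern 5pd.ru would also match the lookalike URL, because an unescaped dot matches any character.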