Beautiful Soup - handling errors
I'd like to know how to handle the situation where the href doesn't exist after the <strong>Text:</strong> label.
Is there a better way to search for the content that exists after <strong>Contact:</strong>?
http://pastebin.com/FYMxTJkf
How about findNext?
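from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3; with bs4 use: from bs4 import BeautifulSoup

html = '''<strong>Text:</strong>
<a href='http://domain.com'>url</a>'''
soup = BeautifulSoup(html)

# Locate the label, then the first <a> that follows it.
label = soup.find("strong", text='Text:')
contact = label.findNext('a') if label is not None else None

# findNext returns None when nothing matches, so check before calling .get().
if contact is not None and contact.get('href') is not None:
    print(contact)
else:
    print("No href")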
If you're looking specifically for an a tag with an href, use:
contact = label.findNext('a', attrs={'href' : True})
With this you won't need to condense whitespace. I imagine you did that because next was returning the whitespace after the label.
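For what it's worth, here is a minimal sketch of the same idea wrapped in a function, assuming Python 3 and the newer bs4 package (where findNext is spelled find_next and text= is spelled string=). The helper name href_after_label and the two sample snippets are made up for illustration; they are not from the original paste.

```python
from bs4 import BeautifulSoup


def href_after_label(html, label_text):
    """Return the href of the first <a href=...> after the label, or None.

    Hypothetical helper for illustration only.
    """
    soup = BeautifulSoup(html, "html.parser")
    label = soup.find("strong", string=label_text)
    if label is None:
        return None  # the label itself is missing
    # href=True matches only <a> tags that actually carry an href attribute.
    link = label.find_next("a", href=True)
    return link["href"] if link is not None else None


# Sample inputs (assumed): one link with an href, one without.
with_href = "<strong>Text:</strong>\n<a href='http://domain.com'>url</a>"
without_href = "<strong>Text:</strong>\n<a name='top'>url</a>"

print(href_after_label(with_href, "Text:"))
print(href_after_label(without_href, "Text:"))
```

One caveat: find_next keeps searching through the rest of the document, so if the page has several labels, this can return a link that belongs to a later label. If that matters, you'd need to stop at the next <strong> yourself.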