Is there any way to suppress or ignore specific types of errors when using BeautifulSoup?
There are many elements that I need on each page I scrape, but many pages don't have all the items I need, so I end up having to wrap each and every lookup in:
try:
    itemNeeded = soup.find(text="yada yada yada").next
except AttributeError:
    pass
This balloons my code by 400%.
Is there any way to abstract this away, or at least reduce the eyesore?
Edit: I'm not only searching for strings, but doing things like this as well:
navLinks = carSoup.find("span", "nav").findAll("a")
carDict['manufacturer'] = navLinks[1].next
carDict['model'] = navLinks[2].next
Build a list and iterate over it, with a small lookup table as a template for the items that need more than a plain text search. You just need to figure out how to cover the whole page in a smaller, simpler fashion:
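text_list = ['items', 'to', 'search', 'for']
# Special cases: (find() arguments, child tag to collect, dict keys to fill in order)
pre_find = {'items': (('span', 'nav'), 'a', ('manufacturer', 'model'))}
carDict = {}
for text in text_list:
    try:
        if text in pre_find:  # dict.has_key() was removed in Python 3
            find_args, child_tag, keys = pre_find[text]
            # Unpack the tuple so find() receives ("span", "nav") as two arguments
            navLinks = carSoup.find(*find_args).findAll(child_tag)
            # The links of interest start at index 1, as in the question
            for x, item in enumerate(keys, 1):
                carDict[item] = navLinks[x].next
        else:
            carDict[text] = carSoup.find(text=text).next
    except (AttributeError, IndexError):
        # Element missing, or fewer links than expected: skip this item
        pass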
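If the lookups vary too much to fit one template, the same iterate-over-a-list idea can be taken a step further by storing each lookup as a small function and running them all through one helper. This is only a sketch: grab, extractors, the 'price' key, and the MissingEverything stand-in are hypothetical names made up for illustration (the stand-in replaces a real parsed page so the snippet runs without BeautifulSoup installed).

```python
def grab(getter, default=None):
    """Run getter() and return default if the element is missing."""
    try:
        return getter()
    except (AttributeError, IndexError):
        return default


class MissingEverything:
    """Stand-in for a parsed page on which no lookup succeeds."""
    def find(self, *args, **kwargs):
        return None  # BeautifulSoup's find() also returns None on no match


carSoup = MissingEverything()

# Hypothetical extractor table: result key -> zero-argument lookup
extractors = {
    'manufacturer': lambda: carSoup.find("span", "nav").findAll("a")[1].next,
    'price': lambda: carSoup.find(text="yada yada yada").next,
}

carDict = {key: grab(getter) for key, getter in extractors.items()}
print(carDict)  # {'manufacturer': None, 'price': None}
```

Each lookup then costs one line, and the try/except lives in exactly one place. With a real soup object the lambdas stay the same; only the stand-in class goes away.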
Have you considered writing a more global try/except block, something like:
try:
    itemNeeded = soup.find(text="yada yada yada").next
    nextItem = soup.find(text="blah blah blah").next
except AttributeError:
    pass
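One thing to keep in mind with a single block like this: the first lookup that fails jumps straight to the except clause, so every lookup after it is skipped even if its element is on the page. A minimal sketch of that behavior, where Node and StubSoup are hypothetical stand-ins for a real parsed page so that it runs without BeautifulSoup:

```python
class Node:
    """Stand-in for a matched element with a .next attribute."""
    def __init__(self, nxt):
        self.next = nxt


class StubSoup:
    """Stand-in for a parsed page that only contains "blah blah blah"."""
    def find(self, text=None):
        return Node("found") if text == "blah blah blah" else None


soup = StubSoup()
itemNeeded = nextItem = None
try:
    itemNeeded = soup.find(text="yada yada yada").next  # raises AttributeError
    nextItem = soup.find(text="blah blah blah").next    # never reached
except AttributeError:
    pass

print(itemNeeded, nextItem)  # None None
```

So this works well when the items only make sense together, but if each item is independently optional you probably still want the error handling applied per lookup.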