Any way to suppress/ignore specific types of errors when using BeautifulSoup?

I need many elements from each page I scrape, but many pages are missing some of them, so I end up wrapping every single lookup in:

try:
    itemNeeded = soup.find(text="yada yada yada").next
except AttributeError:
    pass

This balloons my code by 400%.

Is there any way to abstract this away, or at least reduce the eyesore?

Edit: I'm not only searching for strings, but doing things like this as well:

navLinks = carSoup.find("span", "nav").findAll("a")
carDict['manufacturer'] = navLinks[1].next
carDict['model'] = navLinks[2].next


Build a list of the items you want and iterate over it, using a small template for the special cases. You just need a smaller, simpler way to iterate over the whole page.

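text_list = ['items', 'to', 'search', 'for']
# For each special key: (find() arguments, tag to collect, fields to fill in order)
pre_find = {'items': (('span', 'nav'), 'a', ('manufacturer', 'model'))}
carDict = {}
for text in text_list:
    try:
        if text in pre_find:
            find_args, child_tag, fields = pre_find[text]
            navLinks = carSoup.find(*find_args).findAll(child_tag)
            # Fields start at the second link, as in the question.
            for x, item in enumerate(fields, start=1):
                carDict[item] = navLinks[x].next
        else:
            carDict[text] = carSoup.find(text=text).next
    except (AttributeError, IndexError):
        # A missing element or too few links: skip this entry.
        pass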
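
The same idea can also be sketched with a table of callables and one small helper, so that the try/except is written only once. This is a rough illustration, not part of the answer above: the names `attempt`, `lookups`, `missing_tag` and `nav_links` are hypothetical, and the plain `None` and list stand in for what `find()` and `findAll()` would return on a page with missing elements.

```python
def attempt(getter, default=None):
    """Run getter(); return default if the element turns out to be missing."""
    try:
        return getter()
    except (AttributeError, IndexError):
        return default


# Stand-ins (assumptions for this sketch): find() returns None when nothing
# matches, and findAll() returns a list that may be shorter than expected.
missing_tag = None
nav_links = ["Home", "Ford"]

# Hypothetical field table: each value is a zero-argument callable.
lookups = {
    "manufacturer": lambda: nav_links[1],
    "model": lambda: nav_links[2],       # IndexError: only two links
    "price": lambda: missing_tag.next,   # AttributeError: None has no .next
}

# Every lookup runs; a failed one yields the default instead of raising.
car_dict = {field: attempt(getter) for field, getter in lookups.items()}
```

With a real soup object, a getter would look something like `lambda: soup.find(text="yada yada yada").next`. Catching only `AttributeError` and `IndexError` keeps unrelated bugs visible instead of silently swallowing them.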


Have you considered a single, broader try/except block around all the lookups? Something like:

try:
    itemNeeded = soup.find(text="yada yada yada").next
    nextItem = soup.find(text="blah blah blah").next
except AttributeError:
    pass
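
One thing to keep in mind with a single block: the first lookup that raises ends the block, so later lookups are skipped even if their elements exist. A small sketch of the difference follows, using `contextlib.suppress` from the standard library as one possible alternative (it is not something this answer proposes). The names `missing_tag` and `found_text` are hypothetical stand-ins for a `find()` that matched nothing and a lookup that succeeded.

```python
from contextlib import suppress

missing_tag = None          # stand-in for a find() that matched nothing
found_text = "blah value"   # stand-in for a lookup that succeeded

# One broad block: the first failure skips everything after it.
one_block = {}
try:
    one_block["itemNeeded"] = missing_tag.next   # raises AttributeError
    one_block["nextItem"] = found_text           # never reached
except AttributeError:
    pass

# One suppress() per lookup: each lookup fails or succeeds on its own.
per_lookup = {}
with suppress(AttributeError):
    per_lookup["itemNeeded"] = missing_tag.next  # suppressed, nothing stored
with suppress(AttributeError):
    per_lookup["nextItem"] = found_text          # still runs
```

Here `one_block` ends up empty, while `per_lookup` still contains `nextItem`. The broad block is shorter, so it may be fine when the lookups only make sense together.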