
Python CGI Script (using XML & minidom) cannot extract null data

This portion of code parses XML for output to the screen on a webpage.

for counter in range(100):
    try:
        for item in BlekkoSearchResultsXML.getElementsByTagName('item'):
            Blekko_PageTitle = item.getElementsByTagName('title')[counter].firstChild.toxml(encoding="utf-8")
            Blekko_PageDesc = item.getElementsByTagName('description')[counter].firstChild.toxml(encoding="utf-8")
            Blekko_DisplayURL = item.getElementsByTagName('guid')[counter].firstChild.toxml(encoding="utf-8")
            Blekko_URL = item.getElementsByTagName('link')[counter].firstChild.toxml(encoding="utf-8")
            print "<h2>" + Blekko_PageTitle + "</h2>"
            print Blekko_PageDesc + "<br />"
            print Blekko_DisplayURL + "<br />"
            print Blekko_URL + "<br />"
    except IndexError:
        break

However, the script fails if it encounters a set of null XML tags, i.e. if a page had no page title or description, with the error message:

AttributeError: 'NoneType' object has no attribute 'toxml' 
      args = ("'NoneType' object has no attribute 'toxml'",)

Snippet of XML being parsed:

<item>
    <title>SUSHI FANLISTING</title>
    <link>http://sushi.perfectdrug.net/</link>
    <guid>http://sushi.perfectdrug.net/</guid>
    <description>This is the official...</description>
</item>

I have unsuccessfully tried using a try/except statement like this:

try:
    Blekko_PageTitle = item.getElementsByTagName('title')[counter].firstChild.toxml(encoding="utf-8")
except Blekko_PageTitle = None:
    Blekko_PageTitle = "No page title provided..."

Any suggestions appreciated.


You're using except wrong: it takes an exception class and catches exceptions of that type when they are raised; it is not a place for a condition or an assignment. You want

except AttributeError:
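
Applied to your title lookup, that might look something like the sketch below. The sample XML string and the page_title helper are made up for illustration, and I've used toxml() without the encoding argument so the same code runs on both Python 2.7 and Python 3. Note that this only handles an empty tag; a tag that is missing entirely would still raise IndexError.

```python
from xml.dom import minidom

# Hypothetical sample data: the second item has an empty <title></title>.
SAMPLE_XML = """<rss><channel>
<item><title>SUSHI FANLISTING</title><link>http://sushi.perfectdrug.net/</link></item>
<item><title></title><link>http://example.com/</link></item>
</channel></rss>"""


def page_title(item):
    """Return the item's title text, or a placeholder if the tag is empty."""
    try:
        # firstChild is None for an empty element, so calling .toxml()
        # on it raises AttributeError, which we catch below.
        return item.getElementsByTagName('title')[0].firstChild.toxml()
    except AttributeError:
        return "No page title provided..."


doc = minidom.parseString(SAMPLE_XML)
titles = [page_title(item) for item in doc.getElementsByTagName('item')]
for title in titles:
    print("<h2>" + title + "</h2>")
```

The same pattern should work for description, guid and link.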

Alternatively, use a conditional:

title_node = item.getElementsByTagName('title')[counter].firstChild
if title_node is None:
    ...
else:
    ...
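
A fuller sketch of the conditional approach, assuming you want the same fallback for every field. The node_text helper, the sample item, and the "No description provided..." text are my own inventions, not anything from your script. The key point is to test firstChild against None before calling toxml() on it.

```python
from xml.dom import minidom

# Hypothetical sample: an item whose <description/> is empty.
SAMPLE_ITEM = "<item><title>SUSHI FANLISTING</title><description/></item>"


def node_text(item, tag, default):
    """Return the text of the first <tag> inside item, or default if it is empty."""
    first_child = item.getElementsByTagName(tag)[0].firstChild
    if first_child is None:
        # Empty element: there is no text node to serialize.
        return default
    else:
        return first_child.toxml()


item = minidom.parseString(SAMPLE_ITEM).documentElement
title = node_text(item, 'title', "No page title provided...")
desc = node_text(item, 'description', "No description provided...")
print("<h2>" + title + "</h2>")
print(desc + "<br />")
```

Compared with try/except, this will not accidentally swallow an unrelated AttributeError raised somewhere else in the expression.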