
Look for an img tag and its id attribute, store the URL in a variable if both match

I have been playing around with some Python now and am starting to get the hang of it.

I have already come up with a project, but I can't work out some things.

The purpose is to look for a defined tag, such as the img tag or the a tag.

If that tag is found, it also needs to check the id attribute, always the same one.

If the img tag looks like <img src="/overflow.png" id="true"> I want its URL to be stored.

If the img tag looks like <img src="/overflow.png" id="false"> I don't want it stored.

I hope this is fairly easy to achieve; I just haven't found a solution yet. I have looked up the documentation for HTMLParser, but it's more gibberish than sense to me. I hope someone knows how to do this and can help me out. It will be much appreciated!

Cheers,

ninjaboi21.


People generally use BeautifulSoup, http://www.crummy.com/software/BeautifulSoup/, to do this sort of thing.

After installing:

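from BeautifulSoup import BeautifulSoup
# if the file is on your computer use this
file = open('/path/to/the/file')
# and if the file is on the internet use this instead
#import urllib
#file = urllib.urlopen('http://www.the.com/path/to/the/file')
html = file.read()
file.close()
soup = BeautifulSoup(html)
# image.get('id', '') avoids a KeyError on img tags that have no id attribute
trueimages = [image for image in soup.findAll('img') if image.get('id', '').lower() == 'true']
# the question asks for the URLs, which are in the src attribute
trueurls = [image.get('src') for image in trueimages]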

Edit: added how to get the file into the string.
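
Since the question mentions HTMLParser, here is a rough sketch of the same filter using only the standard library, in case installing BeautifulSoup is not an option. It assumes Python 3 (where the module is html.parser), and the names TrueImageCollector and collect_true_image_urls are made up for this example, not part of any library. It stores the src URL of each img tag whose id is "true" and skips everything else.

```python
from html.parser import HTMLParser


class TrueImageCollector(HTMLParser):
    """Collect the src URL of every matching tag whose id equals wanted_id."""

    def __init__(self, tag_name="img", wanted_id="true"):
        super().__init__()
        self.tag_name = tag_name    # hypothetical parameter: which tag to inspect
        self.wanted_id = wanted_id  # hypothetical parameter: id value to accept
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names; attrs is a list
        # of (name, value) pairs, and value is None for a bare attribute.
        if tag != self.tag_name:
            return
        attributes = dict(attrs)
        tag_id = attributes.get("id") or ""
        if tag_id.lower() == self.wanted_id and attributes.get("src"):
            self.urls.append(attributes["src"])


def collect_true_image_urls(html):
    """Return the src URLs of all <img> tags whose id is "true"."""
    parser = TrueImageCollector()
    parser.feed(html)
    parser.close()
    return parser.urls


sample = (
    '<img src="/overflow.png" id="true">'
    '<img src="/other.png" id="false">'
    '<img src="/noid.png">'
)
print(collect_true_image_urls(sample))  # prints ['/overflow.png']
```

To look inside a tags instead, something like TrueImageCollector(tag_name="a") would work for the tag match, but the URL there lives in href rather than src, so the attribute lookup would need adjusting too.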
