How do I get a list of all parent tags in BeautifulSoup?

2023-01-16 17:04

Let's say I have a structure like this:

<folder name="folder1">
     <folder name="folder2">
          <bookmark href="link.html">
     </folder>
</folder>

If I point to bookmark, what would be the command to just extract all of the folder lines? For example,

bookmarks = soup.findAll('bookmark')

then beautifulsoupcommand(bookmarks[0]) would return:

[<folder name="folder1">,<folder name="folder2">]

I'd also like to know when the closing tags are reached. Any ideas?

Thanks in advance!

Here is my stab at it:

>>> from BeautifulSoup import BeautifulSoup
>>> html = """<folder name="folder1">
     <folder name="folder2">
          <bookmark href="link.html">
     </folder>
</folder>
"""
>>> soup = BeautifulSoup(html)
>>> bookmarks = soup.findAll('bookmark')
>>> [p.get('name') for p in bookmarks[0].findAllPrevious(name='folder')]
[u'folder2', u'folder1']

The key difference from @eumiro's answer is that I am using findAllPrevious instead of findParents. When I tested @eumiro's solution, findParents returned only the immediate parent when the parent and grandparent had the same tag name:

>>> [p.get('name') for p in bookmarks[0].findParents('folder')]
[u'folder2']

>>> [p.get('name') for p in bookmarks[0].findParents()]
[u'folder2', None]

The unfiltered call shows that folder1 is not in the parent chain at all; the trailing None comes from the top-level soup object, which has no name attribute. Keep in mind that findAllPrevious matches every earlier folder tag in the document, not only ancestors. Two generations of parents are returned when the parent and grandparent have different tag names:

>>> html = """<folder name="folder1">
     <folder_parent name="folder2">
          <bookmark href="link.html">
     </folder_parent>
</folder>
"""
>>> soup = BeautifulSoup(html)
>>> bookmarks = soup.findAll('bookmark')
>>> [p.get('name') for p in bookmarks[0].findParents()]
[u'folder2', u'folder1', None]
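
None of the calls above reports when a closing tag is reached, which the question also asks about. One possible approach, sketched here with Python 3's standard-library html.parser rather than BeautifulSoup, is to track a stack of open folders while parsing. The FolderTracker class and its attribute names are hypothetical and only for illustration:

```python
from html.parser import HTMLParser

html = """<folder name="folder1">
     <folder name="folder2">
          <bookmark href="link.html">
     </folder>
</folder>
"""


class FolderTracker(HTMLParser):
    """Hypothetical helper: logs folder open/close events and bookmark ancestors."""

    def __init__(self):
        super().__init__()
        self.open_folders = []      # names of the folders currently open
        self.events = []            # ("open" or "close", folder name), in document order
        self.bookmark_parents = {}  # bookmark href -> enclosing folder names, outermost first

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "folder":
            self.open_folders.append(attrs.get("name"))
            self.events.append(("open", attrs.get("name")))
        elif tag == "bookmark":
            # Copy the stack so later pops do not change the recorded ancestors.
            self.bookmark_parents[attrs.get("href")] = list(self.open_folders)

    def handle_endtag(self, tag):
        if tag == "folder" and self.open_folders:
            # The closing tag belongs to the most recently opened folder.
            self.events.append(("close", self.open_folders.pop()))


tracker = FolderTracker()
tracker.feed(html)
tracker.close()

print(tracker.bookmark_parents)
print(tracker.events)
```

Because the stack is maintained by hand, this sketch does not depend on how any particular BeautifulSoup version decides to nest same-named tags.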

bookmarks[0].findParents('folder') returns a list of all enclosing folder tags. You can then iterate over them and read each one's name attribute, for example with p.get('name').
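
As a minimal sketch of that iteration, assuming the newer bs4 package (where the method is spelled find_parents) and its built-in "html.parser" backend. The explicitly closed bookmark tag and the variable names are illustrative choices, not part of the original post. With this parser both folders come back; older BeautifulSoup 3 releases may nest same-named tags differently, as the answer above observed.

```python
from bs4 import BeautifulSoup

html = """<folder name="folder1">
     <folder name="folder2">
          <bookmark href="link.html"></bookmark>
     </folder>
</folder>
"""

soup = BeautifulSoup(html, "html.parser")
bookmark = soup.find("bookmark")

# find_parents walks outwards from the tag, so the innermost folder comes first.
folders = bookmark.find_parents("folder")
names = [folder.get("name") for folder in folders]
print(names)

# Reverse the list to get the outermost folder first, as in the question.
outermost_first = [folder.get("name") for folder in reversed(folders)]
print(outermost_first)
```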
