Capture custom divs and div details using BeautifulSoup [closed]
I am trying to capture links inside a div. In the attached screenshot, I want to capture all the divs inside the div with class "page-size mainclips".
When I search with soup.findall("div", class="page-size mainclips") I am not able to find anything.
What should I search for to get the list of data-cliphref values highlighted in the screenshot?
How should I find the divs inside a particular div?
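For reference, here is a minimal sketch of the kind of lookup I think I need, assuming the container div really has the classes "page-size mainclips" and the inner divs carry a data-cliphref attribute (both names are read off the screenshot, so they may be wrong):

from bs4 import BeautifulSoup

# sample markup mirroring what the screenshot appears to show (hypothetical values)
sample_html = """
<div class="page-size mainclips">
  <div class="clip-box clippageview" data-cliphref="https://example.com/clip/1"></div>
  <div class="clip-box clippageview" data-cliphref="https://example.com/clip/2"></div>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# the method is find_all (not findall) and the keyword is class_,
# because class is a reserved word in Python
container = soup.find("div", class_="page-size mainclips")

if container is not None:
    # search only inside that container for divs that have a data-cliphref attribute
    for div in container.find_all("div", attrs={"data-cliphref": True}):
        print(div["data-cliphref"])

Running this on the sample markup prints the two clip URLs, which is the shape of output I am after.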
import re

import requests
from bs4 import BeautifulSoup


def getHTMLdocument(url):
    # helper that fetches a page and returns its raw HTML
    return requests.get(url).text


# create document
url_to_scrape = "https://epaper.dishadaily.com"
html_document = getHTMLdocument(url_to_scrape)

# create soup object
soup = BeautifulSoup(html_document, 'html.parser')

linklist = []
# find all the anchor tags with an "href" attribute starting with "https://"
for link in soup.find_all('a', attrs={'href': re.compile("^https://")}):
    # collect the actual urls
    linklist.append(link.get('href'))

print('----------')
print(len(linklist))

substring = "latest?s="
for url_to_scrape1 in linklist:
    # only follow links to the "latest?s=" clipping pages
    if substring in url_to_scrape1:
        print(url_to_scrape1)
        html_document1 = getHTMLdocument(url_to_scrape1)
        soup = BeautifulSoup(html_document1, 'html.parser')
        for each_div in soup.find_all("div", attrs={"class": 'clip-box clippageview'}):
            print(each_div)
I am using BeautifulSoup and Python.
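If it matters, this is how I would expect the nested lookup to slot into the second loop above, using a CSS selector to scope the search to the container; the selector strings and the extract_cliphrefs helper name are my own guesses from the screenshot, not something confirmed by the page:

from bs4 import BeautifulSoup

def extract_cliphrefs(html_document):
    # return every data-cliphref value found under div.page-size.mainclips
    soup = BeautifulSoup(html_document, "html.parser")
    # the descendant selector keeps the search inside the container div
    return [div["data-cliphref"]
            for div in soup.select("div.page-size.mainclips div[data-cliphref]")]

Inside the loop I would then call extract_cliphrefs(html_document1) and extend a results list with whatever it returns, instead of printing each_div.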