Using BeautifulSoup's findAll to search an HTML element's innerText and get the same result as searching attributes?
For instance, if I search by an element's attribute such as id:
soup.findAll('span',{'id':re.compile("^score_")})
I get back a list of the whole span elements that match (which I like).
But if I try to search by the innerText of the HTML element like this:
soup.findAll('a',text = re.compile("discuss|comment"))
I get back only the innerText of the matching element, instead of the whole element with tags and attributes like I would above.
Is it possible to do this without finding the match and then getting its parent?
Thanks.
You don't get back plain text. You get a NavigableString containing the text.
That object can navigate the tree, for example to its parent via .parent:
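from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 import (Python 2)
import re
soup = BeautifulSoup('<html><p>foo</p></html>')
# With text=, findAll returns the matching NavigableString objects, not tags
r = soup.findAll('p', text=re.compile('foo'))
# .parent walks up from the string to the enclosing <p> element
print r[0].parent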
prints
<p>foo</p>
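Applied to the links from the question, the same idea might look like the sketch below. It assumes the newer bs4 package (Beautiful Soup 4.4 or later, where the text argument is spelled string) running on Python 3, rather than the BeautifulSoup 3 import used above. The sample markup and the names html, pattern and links are made up for illustration.

```python
import re
from bs4 import BeautifulSoup  # assumption: bs4 >= 4.4 is installed

# Hypothetical markup standing in for the page from the question
html = (
    '<html><body>'
    '<a href="/item?id=1">discuss</a>'
    '<a href="/item?id=2">3 comments</a>'
    '<a href="/news">home</a>'
    '<p>comment here</p>'
    '</body></html>'
)
soup = BeautifulSoup(html, 'html.parser')
pattern = re.compile('discuss|comment')

# Search the text nodes, then step up to each parent tag,
# keeping only the parents that are <a> elements
links = [s.parent for s in soup.find_all(string=pattern)
         if s.parent.name == 'a']

for link in links:
    print(link)
```

Each item in links is a full tag object, so its attributes (for example link['href']) are available just as they are when searching by id.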