Python BeautifulSoup adding extra end tags
I'm using BeautifulSoup to parse a website:
import urllib2
import BeautifulSoup
request = urllib2.Request(url)
response = urllib2.urlopen(request)
soup = BeautifulSoup.BeautifulSoup(response)
I am using it to traverse a table. The problem is that BeautifulSoup inserts an extra closing tag for the table that doesn't exist in the original HTML, which I verified with print soup.prettify(). As a result, one of the td tags ends up outside the table and I can't select it.
How about searching directly for each tag instead of trying to traverse into the table?
for td in soup.findAll("td"):
    ...
It's not unusual to find a tbody tag nested inside a table automatically even when it isn't in the source. You can either account for it in your code or jump straight to the tr or td tags.
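To make the "jump straight to the td" idea concrete, here is a rough sketch that collects every cell no matter where the parser thinks the table ends. Note the assumptions: it uses Python 3's standard-library html.parser instead of BeautifulSoup (a swapped-in technique, not BeautifulSoup's API), and the CellCollector class and the sample markup are made up for illustration. It also treats every new td as ending the previous one, so it won't handle nested tables properly.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td>, wherever it sits in the document."""

    def __init__(self):
        super().__init__()
        self.cells = []       # finished cell texts, in document order
        self._buffer = None   # text parts of the cell being read, or None

    def _close_cell(self):
        # Store the cell currently being read, if there is one.
        if self._buffer is not None:
            self.cells.append("".join(self._buffer).strip())
            self._buffer = None

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._close_cell()  # a new <td> implicitly ends an unclosed one
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in ("td", "tr", "table"):
            self._close_cell()

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)


# Hypothetical markup: the table is closed too early, leaving a stray <td>.
html = "<table><tr><td>a</td><td>b</td></tr></table><td>c</td>"

collector = CellCollector()
collector.feed(html)
collector.close()
print(collector.cells)
```

Because the collector never checks whether a td is inside a table, tr, or tbody, the stray cell is picked up along with the others, which is the same effect as searching for td tags directly in BeautifulSoup.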