Python BeautifulSoup related problem

2023-01-19 09:07

I am having trouble extracting some data from an HTML source.

The following is a snippet of my HTML source code, and I want to extract the string value in every <td> of the following:

<td class="gamedate">10/12 00:59</b></td>

<td class="gametype">오버언더</b></td>

<td class="legue"><nobr style="width:100%;overflow:hidden;letter-spacing:-1;font-size:11px;"><nobr style='display:block; overflow:hidden;'><img src='../data/banner/25' border='0' width='20' height='13' alt='' align='absmiddle'></a> 그리스 D2</nobr>

<td class="bet" id="team1_27771" class="homeTeam1">Pas Giannina (↑오버)</td>

<td class="bet" id="bet1_27771" class="homeTeam2" align="right">1.65</td>

<td class="pointer muSelect" id="chk_27771_3" num='27771' bet='2.5' sp='오버언더'  bgcolor="f0f0f0"  class="handy handy1" ><span id="bet3_27771">2.5</span></td>

<td class="bet" id="bet2_27771" class="awayTeam2" align="left">1.95</td>

<td class="bet" id="team2_27771" class="awayTeam1">Pierikos (↓언더)</td>

So the final values I want to extract are:

10/12 00:59

오버언더

그리스 D2

Pas Giannina (↑오버)

1.65

2.5

1.95

Pierikos (↓언더)

My full HTML source is linked below.

Please help me! Thanks in advance!

Because the HTML source is rather big, I uploaded it to pastebin.com:

http://pastebin.com/Gdun0jhf

Why not just do a replace on the string

html.replace("AAAAAA", "Put what you want for AAAAAA here")

and do this for all of the things you want to replace?

Ignore this, I misread the question completely; my brain must not be on today.

You may use HTMLParser
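
As a rough illustration of the HTMLParser suggestion, here is a minimal sketch using the standard library's html.parser module in Python 3. The class name CellTextParser, its cells list, and the SAMPLE string (a shortened version of a few cells from the question) are made up for this example. It assumes the text of each <td> is all that is needed, and it treats the start of a new <td> as the end of an unclosed one, since the question's markup leaves some cells open.

```python
from html.parser import HTMLParser


class CellTextParser(HTMLParser):
    """Collect the visible text of every <td> cell, in document order."""

    def __init__(self):
        super().__init__()
        self.cells = []      # finished cell texts
        self._parts = None   # text pieces of the cell being read, or None

    def _finish_cell(self):
        # Store the current cell's text (if any) and leave "inside a cell" mode.
        if self._parts is not None:
            text = "".join(self._parts).strip()
            if text:
                self.cells.append(text)
            self._parts = None

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            # A new <td> also ends a previous cell that was never closed.
            self._finish_cell()
            self._parts = []

    def handle_endtag(self, tag):
        # Stray end tags such as </b> or </a> are simply ignored.
        if tag == "td":
            self._finish_cell()

    def handle_data(self, data):
        if self._parts is not None:
            self._parts.append(data)

    def close(self):
        super().close()
        self._finish_cell()


# Shortened sample based on the question's snippet (hypothetical excerpt).
SAMPLE = """
<td class="gamedate">10/12 00:59</b></td>
<td class="legue"><nobr><img src='../data/banner/25' alt=''></a> 그리스 D2</nobr>
<td class="bet" id="team1_27771" class="homeTeam1">Pas Giannina (↑오버)</td>
<td class="pointer muSelect" id="chk_27771_3"><span id="bet3_27771">2.5</span></td>
"""

parser = CellTextParser()
parser.feed(SAMPLE)
parser.close()
for text in parser.cells:
    print(text)
```

This collects every cell on the page, so on the full source it would probably need an extra filter on the class or id attributes passed to handle_starttag.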

Something like this works on a basic table:

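from bs4 import BeautifulSoup

soup = BeautifulSoup(YOUR_HTML, 'html.parser')
# find() takes a tag name; look the table up by its id attribute
table = soup.find('table', id='TABLE_ID')
for td in table.find_all('td'):
    # get_text() also works when the cell contains nested tags
    print(td.get_text(strip=True))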

But it looks like the HTML you are dealing with is a bit messier, so maybe it would be best to go after each of the TDs by class name, e.g.:

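from bs4 import BeautifulSoup

soup = BeautifulSoup(YOUR_HTML, 'html.parser')

# game date ('class' must be quoted: it is a Python keyword)
game_dates = soup.find_all('td', {'class': 'gamedate'})
for game_date in game_dates:
    print(game_date.get_text(strip=True))

# bets (these cells repeat the class attribute, so this match may depend on the parser)
bets = soup.find_all('td', {'class': 'bet'})
for bet in bets:
    print(bet.get_text(strip=True))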
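
To show how the class-name idea could be carried through to the eight values the question asks for, here is a hedged sketch. It assumes Python 3 with a reasonably recent beautifulsoup4 package (imported as bs4) and the built-in html.parser backend. The names HTML, first_text and values are invented for the example. Because the team and odds cells repeat the class attribute, and parsers can differ on which value they keep, the sketch matches those cells on the id prefixes seen in the snippet (team1_, bet1_, bet3_, ...) rather than on class="bet". It also takes only the first non-blank string of each cell, since the "legue" cell is never closed and may swallow the cells after it.

```python
import re

from bs4 import BeautifulSoup

# The snippet from the question, unchanged (stray </b>, </a> and unclosed tags included).
HTML = """
<td class="gamedate">10/12 00:59</b></td>
<td class="gametype">오버언더</b></td>
<td class="legue"><nobr style="width:100%;overflow:hidden;letter-spacing:-1;font-size:11px;"><nobr style='display:block; overflow:hidden;'><img src='../data/banner/25' border='0' width='20' height='13' alt='' align='absmiddle'></a> 그리스 D2</nobr>
<td class="bet" id="team1_27771" class="homeTeam1">Pas Giannina (↑오버)</td>
<td class="bet" id="bet1_27771" class="homeTeam2" align="right">1.65</td>
<td class="pointer muSelect" id="chk_27771_3" num='27771' bet='2.5' sp='오버언더'  bgcolor="f0f0f0"  class="handy handy1" ><span id="bet3_27771">2.5</span></td>
<td class="bet" id="bet2_27771" class="awayTeam2" align="left">1.95</td>
<td class="bet" id="team2_27771" class="awayTeam1">Pierikos (↓언더)</td>
"""

soup = BeautifulSoup(HTML, "html.parser")


def first_text(tag):
    """Return the first non-blank string inside tag, or "" if there is none."""
    return next(tag.stripped_strings, "")


values = []

# Cells with a single class attribute can be looked up by class name.
for css_class in ("gamedate", "gametype", "legue"):
    for td in soup.find_all("td", class_=css_class):
        values.append(first_text(td))

# Team names and odds: match on the id prefix instead of the repeated class.
for tag in soup.find_all(id=re.compile(r"^(?:team|bet)\d_")):
    values.append(first_text(tag))

for value in values:
    print(value)
```

On the full page this would group the output by lookup rather than by game, so a per-row loop (or grouping by the numeric suffix of the ids) would likely be needed to keep each game's values together.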
