How to use lxml to get a message from a website?

The site exam.com displays a weather message:

Tokyo: 25°C

I want to use Django 1.1 and lxml to get information from the website. I only want to get the value "25".

The HTML structure of exam.com is as follows:

<p id="resultWeather">
    <b>Weather</b>
    Tokyo:
    <b>25</b>°C
</p>

I'm a student doing a small project with my friends. Please explain in a way that is easy to understand. Thank you very much!


BeautifulSoup is more suitable for HTML parsing than lxml.

Something like this can be helpful:

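import urllib
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 (Python 2)

def get_weather():
    # Download the page source.
    data = urllib.urlopen('http://exam.com/').read()
    # Parse it, then take the last <b> inside <p id="resultWeather">.
    soup = BeautifulSoup(data)
    return soup.find('p', {'id': 'resultWeather'}).findAll('b')[-1].string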

This gets the page contents with urllib, parses them with BeautifulSoup, finds the <p> with id="resultWeather", then finds the last <b> inside that <p> and returns its content.
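
Since the question asks about lxml specifically, here is a minimal sketch of the same lookup using lxml.html and an XPath query. It is written for Python 3 and assumes lxml is installed. The names extract_temperature and SAMPLE are made up for illustration, and SAMPLE is just the markup copied from the question, so the real page may need a different query.

```python
from lxml import html

# Markup copied from the question; stands in for the downloaded page.
SAMPLE = """
<p id="resultWeather">
    <b>Weather</b>
    Tokyo:
    <b>25</b>°C
</p>
"""

def extract_temperature(page_source):
    """Return the text of the last <b> inside <p id="resultWeather">, or None."""
    tree = html.fromstring(page_source)
    # XPath: all <b> children of the paragraph with id="resultWeather".
    bold_tags = tree.xpath('//p[@id="resultWeather"]/b')
    if not bold_tags:
        return None
    return bold_tags[-1].text.strip()

print(extract_temperature(SAMPLE))
```

For a live page, the bytes returned by urllib.request.urlopen(...).read() could probably be passed to extract_temperature in place of SAMPLE, and the function could then be called from a Django view.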
