Wikipedia API: get parsed introduction only
Using PHP, is there a nice way to get only the (parsed) introduction from a Wikipedia page?
I have two current methods:
- The first is to call the API for the page, pull the introduction out of the returned text, then call the wiki parser on that introduction (two requests, and extracting the intro from the text isn't pretty either).
- The second is to call the parser on the entire page and use XPath to retrieve every <p> tag before the contents table.
With both methods I then have to re-parse the HTML to make sure the links inside the introduction point to Wikipedia.
Neither is really ideal; surely there is a better way?
- http://www.mediawiki.org/wiki/API:Parsing_wikitext
- http://en.wikipedia.org/w/api.php
The action=parse API module accepts a section number parameter (section). The lead is section number 0.
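As a rough sketch of that request in PHP: the function names buildLeadUrl and fetchLeadHtml, the User-Agent string, and the use of https for the endpoint are my own placeholders, not anything from the question. The parameters action, page, section, prop and format are those of the parse module. It asks only for the rendered text of section 0 and returns the HTML string.

```php
<?php
// Build the action=parse request URL for the lead (section 0) of a page.
// buildLeadUrl is a hypothetical helper name used for illustration.
function buildLeadUrl(string $title, string $endpoint = 'https://en.wikipedia.org/w/api.php'): string
{
    $query = http_build_query([
        'action'  => 'parse',
        'page'    => $title,
        'section' => 0,       // the lead is section number 0
        'prop'    => 'text',  // request only the rendered HTML
        'format'  => 'json',
    ]);
    return $endpoint . '?' . $query;
}

// Fetch the lead HTML for a page title; returns null on any failure.
// fetchLeadHtml is likewise a hypothetical name.
function fetchLeadHtml(string $title): ?string
{
    // Placeholder User-Agent; replace with something identifying your app.
    $context = stream_context_create([
        'http' => ['header' => "User-Agent: ExampleLeadFetcher/0.1\r\n"],
    ]);
    $json = @file_get_contents(buildLeadUrl($title), false, $context);
    if ($json === false) {
        return null;
    }
    $data = json_decode($json, true);
    $text = $data['parse']['text'] ?? null;
    // The default JSON format nests the HTML under the '*' key.
    if (is_array($text)) {
        $text = $text['*'] ?? null;
    }
    return is_string($text) ? $text : null;
}
```

Calling echo fetchLeadHtml('PHP'); should then print the introduction as HTML in a single request. Links in that HTML are likely to be relative (for example /wiki/...), so the link rewriting step from the question may still be needed.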