PHP connecting to MediaWiki API and retrieve data

2022-12-14 09:31 问答作者：

I noticed there was a question somewhat similar to mine, only with c#:link text. Let me explain: I'm very new to the whole web-services implementation and so I'm experiencing some difficulty understanding (especially due to the vague MediaWiki API manual).

I want to retrieve the entire page as a string in PHP (XML file) and then process it in PHP (I'm pretty sure there a开发者_开发技巧re other more sophisticated ways to parse XML files but whatever): Main Page wikipedia.

I tried doing $fp = fopen($url,'r');. It outputs: HTTP request failed! HTTP/1.0 400 Bad Request. The API does not require a key to connect to it.

Can you describe in detail how to connect to the API and get the page as a string?

EDIT: The URL is $url='http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xml&redirects&titles=Main Page';. I simply want to read the entire content of the file into a string to use it.

Connecting to that API is as simple as retrieving the file,

fopen

$url = 'http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xml&redirects&titles=Main%20Page';
$fp = fopen($url, 'r');
while (!feof($fp)) {
    $c .= fread($fp, 8192);
}
echo $c;

file_get_contents

$url = 'http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xml&redirects&titles=Main%20Page';
$c = file_get_contents($url);
echo $c;

The above two can only be used if your server has the fopen wrappers enabled.

Otherwise if your server has cURL installed you can use that,

$url = 'http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xml&redirects&titles=Main%20Page';
$ch = curl_init($url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
$c = curl_exec($ch);
echo $c;

You probably need to urlencode the parameters that you are passing in the query string ; here, at least the "Main Page" requires encoding -- without this encoding, I get a 400 error too.

If you try this, it should work better (note the space is replaced by %20) :

$url='http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xml&redirects&titles=Main%20Page';
$str = file_get_contents($url);
var_dump($str);

With this, I'm getting the content of the page.

A solution is to use urlencode, so you don't have to encode yourself :

$url='http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xml&redirects&titles=' . urlencode('Main Page');
$str = file_get_contents($url);
var_dump($str);

According to the MediaWiki API docs, if you don't specify a User-Agent in your PHP request, WikiMedia will refuse the connection with a 4xx HTTP response code:

https://www.mediawiki.org/wiki/API:Main_page#Identifying_your_client

You might try updating your code to add that request header, or change the default setting in php.ini if you have edit access to that.

继续阅读：mediawiki mediawiki-api php service

PHP connecting to MediaWiki API and retrieve data

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？