php/curl not returning correct data
Here's a small sample of some test code that simply goes to
http://www.un.org/apps/news/story.asp?NewsID=37180&Cr=Haiti&Cr1=
and pulls in the specified web page.
<?php
$url = "http://www.un.org/apps/news/story.asp?NewsID=37180&Cr=Haiti&Cr1=";
$curl = curl_init(); // initialize curl handle
curl_setopt($curl, CURLOPT_URL, $url); // set URL to fetch
curl_setopt($curl, CURLOPT_FAILONERROR, 1);
curl_setopt($curl, CURLOPT_COOKIESESSION, TRUE); // since we reuse now
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);// allow redirects
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); // return into a variable
curl_setopt($curl, CURLOPT_TIMEOUT, 20); // times out after 20 seconds
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 5.1; U; en; rv:1.8.0) Gecko/20060728 Firefox/1.5.0" );
$result = curl_exec($curl); // run the whole process
print $result;
When I look at the result, however, it's not quite what I'm wanting. If you look in the results for the string
"United Nations humanitarian officials are calling for ?massive mobilization activities? in Haiti"
you can see the two question marks surrounding the text "massive mobilization activities".
If you go to the actual website, the question marks are rendered as a pair of left- and right- quotation marks, and this is reflected when you view the source code from the site ...
"United Nations humanitarian officials are calling for “massive mobilization activities” in Haiti"
I'd like to know how I can grab the double quotes rather than the question marks that I'm seeing.
All suggestions gratefully accepted.
And happy new year to y'all
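Nothing to do with PHP, nothing to do with curl, not even an error. Those "question marks" you mention are the bytes 0x93 and 0x94. They are not ASCII characters: in the Windows-1252 encoding they are the open double quotes and the close double quotes, and whatever is displaying your output presumably isn't decoding them as Windows-1252, so it shows a ? instead. I'm not a PHP guy but if you want regular double quotes
$result = str_replace(array(chr(0x93), chr(0x94)), '"', $result);
should fix you right up.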
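If you would rather keep the curly quotes than swap them for straight ones, one option is to convert the whole response to UTF-8. The following is only a sketch: it assumes the page really is served as Windows-1252 (worth confirming from the response headers or the page's meta tag), the helper name cp1252_to_utf8 is made up for illustration, and the sample string stands in for the real curl result.

```php
<?php
// Hypothetical helper: assumes $bytes is Windows-1252 encoded text.
function cp1252_to_utf8(string $bytes): string
{
    // iconv() returns false if the input contains a byte that is
    // not defined in the source encoding.
    $utf8 = iconv('Windows-1252', 'UTF-8', $bytes);
    if ($utf8 === false) {
        throw new RuntimeException('Could not convert input from Windows-1252');
    }
    return $utf8;
}

// Stand-in for the curl result; 0x93 and 0x94 are the curly quotes.
$result = "calling for \x93massive mobilization activities\x94 in Haiti";
echo cp1252_to_utf8($result), "\n";
```

For the quotes to show up correctly, whatever displays the converted text also has to treat it as UTF-8, for example by sending a Content-Type header with charset=utf-8 when printing it to a browser.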
It looks like the quote character used in the example above is a special character rather than a normal ". View the page source and copy and paste it into Notepad; if it shows you ? instead of ", it is a special character, and you need to figure out the exact code for that character.
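One way to figure out the exact code is to list the non-ASCII bytes in the fetched string as hex values. This is a rough sketch: the function name non_ascii_bytes is hypothetical, and the sample string stands in for the real curl result.

```php
<?php
// Hypothetical helper: returns each distinct non-ASCII byte as a hex code.
function non_ascii_bytes(string $bytes): array
{
    // Match single bytes in the range 0x80-0xFF (no /u flag, so byte-wise).
    preg_match_all('/[\x80-\xFF]/', $bytes, $matches);
    $codes = array();
    foreach (array_unique($matches[0]) as $byte) {
        $codes[] = '0x' . strtoupper(bin2hex($byte));
    }
    return $codes;
}

// Stand-in for the curl result.
$sample = "calling for \x93massive mobilization activities\x94 in Haiti";
print_r(non_ascii_bytes($sample));
```

The hex codes can then be looked up in the code page the site uses to see which characters they stand for.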