
Not able to Manage Session with CURL

Hope all is well.

I need a little help, please.

I am trying to scrape a page with cURL (http://wap.ebay.com/Pages/ViewItem.aspx?aid=160585148382). That page contains another link (anchor text: Description), and I want to scrape the page it points to as well.

When you go directly to the Description page (http://wap.ebay.com/Pages/ViewItemDesc.aspx?aid=280655395879&emvcc=0) by entering the URL in your browser, it shows an error like "Session Expired or No auction details found". I think that scraping this page requires an existing session or something similar.

So, first I want to scrape http://wap.ebay.com/Pages/ViewItem.aspx?aid=280655395879, then extract the URL from the Description button and prefix it with http://wap.ebay.com/Pages so that it becomes a full URL, and finally scrape the content of that URL.

But it looks like I am not able to keep the session alive.

My code is:

<?
require_once('simple_html_dom.php');

$url = 'http://wap.ebay.com/Pages/ViewItem.aspx?aid=160585148382';
$ch = curl_init($url);
curl_setopt ($ch, CURLOPT_COOKIEFILE, $cookie);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$curl_scraped_page = curl_exec($ch);
curl_close($ch);

//echo $curl_scraped_page;

$html = str_get_html($curl_scraped_page);

 // Find the img tag in the Teaser_Item div
 $a = $html->find('div[id=Teaser_Item] img', 0);

 // Display the src
 $e_image = 'http://wap.ebay.com/Pages/'.str_replace("width=57", "width=200", ($a->attr['src']));
 echo '<img src="'.$e_image.'" /> <br /><br />';
 
 
$wow = $html->find('a#ButtonMenuItem3', 0);
 
 $descurl = 'http://wap.ebay.com'.$wow->attr['href'];
 echo $descurl;
 

 exit;
 
 $html->clear();
 unset($html);


$html = file_get_html($descurl);
 
 echo $html;

 
 

$html->clear();
unset($html);
  
 
?>

Cheers, Natasha


You aren't setting $cookie to a value, so CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR are both NULL and no cookies are saved.
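To make that concrete, here is a minimal sketch of a shared cookie jar, assuming the PHP cURL extension is available. The helper names make_session_handle() and fetch_in_session() are hypothetical, and the temp-file location is just one reasonable choice. Setting CURLOPT_COOKIEFILE turns on cURL's cookie engine, so any request made on the same handle sends back the cookies received earlier. Note also that, as far as I can tell, the posted script stops at exit; before the second request, and file_get_html($descurl) makes its own request without the cURL cookies, so the description page would need to be fetched through the same handle instead.

```php
<?php
// Hypothetical helper: create a cURL handle whose cookies live in $cookieFile.
function make_session_handle($cookieFile)
{
    $ch = curl_init();
    // Read any existing cookies from this file and enable the cookie engine.
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);
    // Write the cookies back to the same file when the handle is released.
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);
    // Return the response body from curl_exec() instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    return $ch;
}

// Hypothetical helper: fetch $url on an existing handle, reusing its cookies.
function fetch_in_session($ch, $url)
{
    curl_setopt($ch, CURLOPT_URL, $url);
    return curl_exec($ch); // response body as a string, or false on failure
}
```

Usage would look roughly like this: set $cookie = tempnam(sys_get_temp_dir(), 'cookie'); and $ch = make_session_handle($cookie);, call fetch_in_session($ch, $url) for the item page, parse it with str_get_html() to build $descurl, then call fetch_in_session($ch, $descurl) on the same handle before closing it.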


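  // Forward the current PHP session cookie with the request
  $strCookie = 'PHPSESSID=' . $_COOKIE['PHPSESSID'] . '; path=/';
  // Write and release the local session before making the request
  session_write_close();
  $ch = curl_init($url);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
  curl_setopt($ch, CURLOPT_COOKIE, $strCookie);
  $response = curl_exec($ch);
  curl_close($ch);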
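CURLOPT_COOKIE takes the contents of the Cookie request header, that is, name=value pairs separated by "; ". A small sketch of building that string from an array follows; build_cookie_header() is a hypothetical helper, the cookie names and values are made up, and the values are assumed to be already safe to send as-is.

```php
<?php
// Hypothetical helper: turn ['name' => 'value', ...] into a Cookie header value.
function build_cookie_header(array $cookies)
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        // Each cookie becomes one name=value pair.
        $pairs[] = $name . '=' . $value;
    }
    // Pairs are joined with a semicolon followed by a space.
    return implode('; ', $pairs);
}

// Example values only; a real session ID would come from an earlier response.
$strCookie = build_cookie_header(['PHPSESSID' => 'abc123', 'lang' => 'en']);
echo $strCookie, "\n"; // prints PHPSESSID=abc123; lang=en
```

The resulting string can be passed to curl_setopt($ch, CURLOPT_COOKIE, $strCookie). This only helps when the session ID is already known; for a remote site that issues its own session cookie, a cookie jar is usually the simpler route.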