Access a password-protected website programmatically

The site's login form submits to /login.php?action=process using POST. How would I begin to write something, preferably in PHP, that logs in with my username and password? After that I will crawl the site and get the info that I need.

This is to monitor and update info from a supplier's e-commerce store so that the inventory and pricing on my site stay up to date.


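$loginUrl = 'http://www.remote_site.com/login.php?action=process';
// Keys must match the "name" attributes of the login form's input fields.
$loginFields = array('username' => 'username', 'password' => 'password');

// Log in; the session cookie is saved to the cookie jar.
getUrl($loginUrl, 'post', $loginFields);

// Later requests send the saved cookie, so protected pages are reachable.
$remote_page_content = getUrl('http://www.remote_site.com/some_page.php');


function getUrl($url, $method = '', $vars = array()) {
    $cookieFile = __DIR__ . '/cookies.txt'; // must be writable by PHP
    $ch = curl_init();
    if ($method == 'post') {
        curl_setopt($ch, CURLOPT_POST, 1);
        // Send the fields URL-encoded, as a browser form submission would.
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($vars));
    }
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);  // cookies are saved here when the handle is released
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); // cookies are read from here
    $buffer = curl_exec($ch); // returns false on failure
    curl_close($ch);
    return $buffer;
}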
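
Once getUrl() has returned a page, the product data still has to be pulled out of the HTML. Below is a minimal sketch using PHP's DOM extension. It assumes a hypothetical markup in which each product is a div with class "product" containing span elements with classes "name" and "price". The supplier's real pages will almost certainly differ, so treat the XPath expressions and the extractProducts() helper as placeholders.

```php
<?php
// Hypothetical helper: extract product name/price pairs from an HTML string.
// The class names ("product", "name", "price") are assumptions for illustration.
function extractProducts($html) {
    $doc = new DOMDocument();
    // Collect parser warnings internally; real-world HTML is rarely valid.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $products = array();
    foreach ($xpath->query('//div[@class="product"]') as $node) {
        // Look up the name and price relative to the current product node.
        $name  = $xpath->query('.//span[@class="name"]', $node)->item(0);
        $price = $xpath->query('.//span[@class="price"]', $node)->item(0);
        if ($name && $price) {
            $products[] = array(
                'name'  => trim($name->textContent),
                'price' => (float) trim($price->textContent),
            );
        }
    }
    return $products;
}

// Sample input standing in for $remote_page_content.
$sampleHtml = '<html><body>'
    . '<div class="product"><span class="name">Widget</span><span class="price">9.99</span></div>'
    . '<div class="product"><span class="name">Gadget</span><span class="price">24.50</span></div>'
    . '</body></html>';

$products = extractProducts($sampleHtml);
```

In practice you would pass $remote_page_content instead of the sample string, after checking that getUrl() did not return false.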

Judging from the login page, I assume the shop system is some variant of xt:Commerce. It has a function to export product information as CSV, so, as vaidas said in the comments, you should first try to have that CSV emailed to you before trying to crawl the site.
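
If the supplier can provide such an export, keeping inventory in sync becomes a parsing job rather than a crawl. Here is a rough sketch, assuming a hypothetical semicolon-delimited file whose header row contains sku, price, and stock columns. The real column names and delimiter of the export are not known here, and parseProductCsv() is an illustrative helper, not part of any shop system.

```php
<?php
// Hypothetical helper: parse CSV text into a map of sku => price/stock.
// Column names and the ';' delimiter are assumptions, not the real export format.
// Note: splitting on newlines does not handle quoted fields that span lines.
function parseProductCsv($csvText, $delimiter = ';') {
    $lines = preg_split('/\r\n|\r|\n/', trim($csvText));
    // The first line is the header row; use it to key each data row.
    $header = str_getcsv(array_shift($lines), $delimiter, '"', '\\');
    $products = array();
    foreach ($lines as $line) {
        if ($line === '') {
            continue; // skip blank lines
        }
        $fields = str_getcsv($line, $delimiter, '"', '\\');
        if (count($fields) !== count($header)) {
            continue; // skip malformed rows
        }
        $row = array_combine($header, $fields);
        $products[$row['sku']] = array(
            'price' => (float) $row['price'],
            'stock' => (int) $row['stock'],
        );
    }
    return $products;
}

// Sample data standing in for the emailed export.
$sampleCsv = "sku;price;stock\nA-100;9.99;12\nB-200;24.50;0\n";
$inventory = parseProductCsv($sampleCsv);
```

The resulting array is keyed by SKU, so each entry can be compared against your own catalogue and written back only where price or stock has changed.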
