cURL login problem

I'm using cURL to scrape a secure (i.e. login-protected) page, and I'm at my wits' end. I managed to scrape two other sites with little or no trouble, but I just can't log into this one. cURL fetches every page I ask for, but they all come back in the logged-out state, which doesn't help. Maybe someone can spot a mistake I've missed?

The code is:

$url_to = 'http://fastorder.newrock.es/store2009/index.php/customer/account/loginPost/';
$url_from = 'http://fastorder.newrock.es/store2009/index.php/customer/account/login/';
$url_get = 'http://fastorder.newrock.es/store2009/index.php/';
$name_pass = 'login%5Busername%5D=*****&login%5Bpassword%5D=*****&send=';

function login($link,$user,$from) {
    $fp = fopen("cookie.txt", "w");
    fclose($fp);
    $log = curl_init();
    curl_setopt($log, CURLOPT_REFERER, $from);
    curl_setopt($log, CURLOPT_URL, $link);
    curl_setopt($log, CURLOPT_COOKIEJAR, "cookie.txt");
    curl_setopt($log, CURLOPT_COOKIEFILE, "cookie.txt");
    curl_setopt($log, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6");
    curl_setopt($log, CURLOPT_TIMEOUT, 40);
    curl_setopt($log, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($log, CURLOPT_HEADER, TRUE);
    curl_setopt($log, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($log, CURLOPT_POST, TRUE);      
    curl_setopt($log, CURLOPT_POSTFIELDS, $user);
    $data = curl_exec($log);
    curl_close($log);
}

login($url_to,$name_pass,$url_from);

function get($url) {
    $get = curl_init();
    curl_setopt($get, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($get, CURLOPT_COOKIEFILE, "cookie.txt");
    curl_setopt($get, CURLOPT_URL, $url);
    return curl_exec ($get);
    curl_close ($get);
}

$html = get($url_get);
echo $html;

This is (more or less) the same script that worked on the other two sites, where it logged in fine. What threw me off at the start were the codes in $name_pass. It turns out the site names its username and password input fields login[username] and login[password]. I have no idea why, but I've tried sending them both percent-encoded and with literal brackets, and neither helped.

Live HTTP Headers is giving me the following for the page:

http://fastorder.newrock.es/store2009/index.php/customer/account/loginPost/

POST /store2009/index.php/customer/account/loginPost/ HTTP/1.1
Host: fastorder.newrock.es
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Referer: http://fastorder.newrock.es/store2009/index.php/customer/account/login/
Cookie: frontend=6tjul97q4mvn0046ier0k79li8
Content-Type: application/x-www-form-urlencoded
Content-Length: 81

login%5Busername%5D=*****&login%5Bpassword%5D=*****&send=

HTTP/1.1 302 Found
Date: Fri, 26 Feb 2010 12:29:19 GMT
Server: Apache/2.0.63 (CentOS)
X-Powered-By: PHP/5.2.10
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Location: http://fastorder.newrock.es/store2009/index.php/customer/account/
Content-Length: 0
Connection: close
Content-Type: text/html; charset=UTF-8

I've tried to copy everything I could into the cURL script, thinking there's some obscure mechanism blocking the script from logging in. But right now I'm totally stuck and have no idea what to do next. I've dug through a lot of tutorials, and they all give advice that worked like a charm for the first two sites.

Halp?


It may be this:

login%5Busername%5D=*****&login%5Bpassword%5D=*****&send=

I'm no curl guru, but your script seems to be OK, so maybe you should not escape the characters.

I would run local tests with curl against this kind of login form. Maybe you can debug what's wrong from there. If I'm right, the fields will arrive empty on the server.
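One way to run that local test without a server is to check how PHP itself parses the two forms of the body. This is only a sketch: the credentials are placeholders, and it assumes the store parses the POST body the way stock PHP does (the response headers in the question do show PHP). It builds the body with http_build_query(), which percent-encodes the brackets, and then compares what parse_str() makes of the encoded and the literal-bracket versions.

```php
<?php
// Placeholder credentials; the field names come from the question.
$fields = [
    'login' => ['username' => 'alice', 'password' => 'secret'],
    'send'  => '',
];

// http_build_query() percent-encodes "[" and "]" as %5B and %5D.
$body = http_build_query($fields, '', '&');
echo $body, "\n";

// Parse the encoded body and a hand-written body with literal brackets.
parse_str($body, $fromEncoded);
parse_str('login[username]=alice&login[password]=secret&send=', $fromRaw);

// Compare the two parsed arrays.
var_dump($fromEncoded === $fromRaw);
```

If both forms parse to the same nested array, the escaping alone is unlikely to be what leaves the fields empty, and the next thing to compare is the rest of the request.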


Suggestion: Use Fiddler (www.fiddler2.com) to diff the request traffic from cURL against the traffic from your browser.
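If you'd rather not set up a proxy, a rough version of the same diff can be done in PHP. The sketch below compares header names from two raw request dumps. header_names() and missing_headers() are made-up helpers, the browser dump is trimmed from the Live HTTP Headers capture in the question, and the cURL dump is an invented sample, not real output from the script. To get the real one, I believe you can set CURLINFO_HEADER_OUT to TRUE before curl_exec() and read it back with curl_getinfo($log, CURLINFO_HEADER_OUT).

```php
<?php
// Hypothetical helper: lower-cased header names from a raw HTTP request dump.
function header_names(string $raw): array {
    $names = [];
    foreach (preg_split('/\r?\n/', trim($raw)) as $i => $line) {
        if ($i === 0) {
            continue;               // skip the request line ("POST ... HTTP/1.1")
        }
        if ($line === '') {
            break;                  // a blank line ends the header block
        }
        if (strpos($line, ':') !== false) {
            $names[] = strtolower(trim(strstr($line, ':', true)));
        }
    }
    return $names;
}

// Hypothetical helper: headers the browser sent that cURL did not.
function missing_headers(string $browser, string $curl): array {
    return array_values(array_diff(header_names($browser), header_names($curl)));
}

// Trimmed from the browser capture in the question.
$browser = "POST /store2009/index.php/customer/account/loginPost/ HTTP/1.1\r\n"
    . "Host: fastorder.newrock.es\r\n"
    . "Referer: http://fastorder.newrock.es/store2009/index.php/customer/account/login/\r\n"
    . "Cookie: frontend=6tjul97q4mvn0046ier0k79li8\r\n"
    . "Content-Type: application/x-www-form-urlencoded\r\n";

// Invented sample standing in for what CURLINFO_HEADER_OUT would return.
$curl = "POST /store2009/index.php/customer/account/loginPost/ HTTP/1.1\r\n"
    . "Host: fastorder.newrock.es\r\n"
    . "Referer: http://fastorder.newrock.es/store2009/index.php/customer/account/login/\r\n"
    . "Content-Type: application/x-www-form-urlencoded\r\n";

print_r(missing_headers($browser, $curl));
```

With these sample inputs the only header reported is cookie. If the real dump from your script shows the same gap, it might be worth fetching the login page with the same cookie file before posting, so that cURL has a session cookie to send, though I can't confirm that is what this site needs.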


There is something broken with that store's registration/login. The activation email said to just log in to activate the account. I've tried logging in multiple times, but I get the error "This account is not activated." every time.

Below is a quick change that prints the returned login page.

$url_to = 'http://fastorder.newrock.es/store2009/index.php/customer/account/loginPost/';
$url_from = 'http://fastorder.newrock.es/store2009/index.php/customer/account/login/';
$url_get = 'http://fastorder.newrock.es/store2009/index.php/';
$name_pass = 'login%5Busername%5D=*****&login%5Bpassword%5D=*****&send=';

function login($link,$user,$from) {
    $fp = fopen("cookie.txt", "w");
    fclose($fp);
    $log = curl_init();
    curl_setopt($log, CURLOPT_REFERER, $from);
    curl_setopt($log, CURLOPT_URL, $link);
    curl_setopt($log, CURLOPT_COOKIEJAR, "cookie.txt");
    curl_setopt($log, CURLOPT_COOKIEFILE, "cookie.txt");
    curl_setopt($log, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6");
    curl_setopt($log, CURLOPT_TIMEOUT, 40);
    curl_setopt($log, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($log, CURLOPT_HEADER, TRUE);
    curl_setopt($log, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($log, CURLOPT_POST, TRUE);
    curl_setopt($log, CURLOPT_POSTFIELDS, $user);
    $data = curl_exec($log);
    curl_close($log);
    return $data;
}

echo login($url_to,$name_pass,$url_from);

function get($url) {
    $get = curl_init();
    curl_setopt($get, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($get, CURLOPT_COOKIEFILE, "cookie.txt");
    curl_setopt($get, CURLOPT_URL, $url);
    $html = curl_exec($get);
    curl_close($get); // close the handle before returning
    return $html;
}

$html = get($url_get);
echo $html;

Edit:
Is the cookie data being written to the cookie file (cookie.txt)? If not:

  1. Check the file permissions and make sure it's writable.

  2. A bug in earlier versions of PHP 5 caused the cookie file option to be ignored.

Details on the bug are here: http://bugs.php.net/bug.php?id=33475
Solution: Add unset($log) after curl_close($log);
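To check both points quickly, something like the sketch below can be run right after login() returns. cookie_names() is a made-up helper that reads the tab-separated Netscape format libcurl uses for its cookie file. The demo writes its own sample file with an invented value so it runs without touching the site; in the real script you would pass the path to cookie.txt instead. Note that a relative "cookie.txt" resolves against the current working directory, so an absolute path such as __DIR__ . '/cookie.txt' may be safer.

```php
<?php
// Hypothetical helper: cookie names stored in a Netscape-format cookie file.
function cookie_names(string $file): array {
    if (!is_readable($file)) {
        return [];
    }
    $names = [];
    foreach (file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
        // libcurl prefixes HttpOnly cookies with "#HttpOnly_"; other "#" lines are comments.
        if (strpos($line, '#HttpOnly_') === 0) {
            $line = substr($line, strlen('#HttpOnly_'));
        } elseif ($line[0] === '#') {
            continue;
        }
        // Fields: domain, subdomain flag, path, secure, expiry, name, value.
        $parts = explode("\t", $line);
        if (count($parts) >= 7) {
            $names[] = $parts[5];
        }
    }
    return $names;
}

// Demo with a sample file; the cookie value here is invented.
$jar = tempnam(sys_get_temp_dir(), 'jar');
file_put_contents(
    $jar,
    "# Netscape HTTP Cookie File\n"
    . "fastorder.newrock.es\tFALSE\t/\tFALSE\t0\tfrontend\tabc123\n"
);

var_dump(is_writable($jar));   // the permissions check from point 1
print_r(cookie_names($jar));   // which cookies actually reached the file
```

If the real file is writable but comes back with no frontend entry, the cookie is not being saved, and the unset($log) workaround above would be the next thing to try.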

It's hard to debug this script without being able to test it.
