为什么不是';t CURL登录外部网站


Why isn't CURL logging into external website?

我一直在反复尝试,但到目前为止,我无法使用CURL登录Pinterest,我无法理解为什么。。

function pinLogin()
{   
    $login_post     = array(
        'source_url' => '/login/',
        'data' => '{
            "options":{
                "username_or_email":"email",
                "password":"password"
                },
            "context":{}}',
        'module_path' => 'App()>LoginPage()>Login()>Button(text=Log In, size=large, class_name=primary, type=submit)',
    );
    $httpheaders    = array(
       'Connection: keep-alive',
       'Pragma: no-cache',
       'Cache-Control: no-cache',
       'Content-Type: application/x-www-form-urlencoded; charset=UTF-8',
       'User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:35.0) Gecko/20100101 Firefox/35.0',
       'Accept: application/json, text/javascript, */*; q=0.01',
       'Accept-Language: en-US,en;q=0.5',
       'Accept-Encoding: gzip, deflate',
    );
    $login_header   = array(
        'X-Pinterest-AppState: active',
        'X-NEW-APP: 1',
        'X-APP-VERSION: 71854ca',
        'X-Requested-With: XMLHttpRequest',
        'Accept: application/json, text/javascript, */*; q=0.01'
    );
    // request home page to establish cookies and a session, set curl options
        $ch = curl_init('http://www.pinterest.com/');
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
        curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
        curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookie.txt');
        curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
        curl_setopt($ch, CURLOPT_VERBOSE, 1);
        curl_setopt($ch, CURLOPT_STDERR, fopen('/tmp/debug.txt', 'w+'));
        curl_setopt($ch, CURLOPT_HEADER, 1);
        curl_setopt($ch, CURLOPT_HTTPHEADER, $httpheaders);
        $data = curl_exec($ch);
    // ----------------------------------------------------------------------------
    // parse the csrf token out of the cookies to set later when logging in
        list($headers, $body) = explode("'r'n'r'n", $data, 2);
        preg_match('/csrftoken=(.*?)['b;'s]/i', $headers, $csrf_token);
        // next request the login page
        curl_setopt($ch, CURLOPT_URL, 'http://www.pinterest.com/login/');
        $data = curl_exec($ch);
    // ----------------------------------------------------------------------------
    // perform login post    
        $login_header[] = 'X-CSRFToken: ' . $csrf_token[1];
        curl_setopt($ch, CURLOPT_URL, 'http://www.pinterest.com/resource/UserSessionResource/create/');
        curl_setopt($ch, CURLOPT_POST, 1);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $login_post);
        curl_setopt($ch, CURLOPT_HTTPHEADER, array_merge($httpheaders, $login_header));
        curl_setopt($ch, CURLOPT_REFERER, 'http://www.pinterest.com/login/');
        curl_setopt($ch, CURLOPT_HEADER, 0);
        $data = curl_exec($ch);
    // ----------------------------------------------------------------------------

    if (curl_getinfo($ch, CURLINFO_HTTP_CODE) != 200)
    {
        echo "Error logging in.<br />";
        var_dump(curl_getinfo($ch));
    } else {
        $response = json_decode($data, true);
        if ($response === null)
        {
            echo "Failed to decode JSON response.<br /><br />";
            var_dump($response);
        } else if ($response['resource_response']['error'] === null) {
            echo "Logged in..";
        }
        print_r($response);
    }
}

我试图模仿发送到pinterest的相同标头,但由于某种原因,我仍然无法登录。。

https://www.pinterest.com/resource/UserSessionResource/create/
POST /resource/UserSessionResource/create/ HTTP/1.1
Host: www.pinterest.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:35.0) Gecko/20100101 Firefox/35.0
Accept: application/json, text/javascript, */*; q=0.01
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
X-Pinterest-AppState: active
X-CSRFToken: suv5Dm0MHGc3tWY4GTPHzgBjYSXo94xt
X-NEW-APP: 1
X-APP-VERSION: 71854ca
X-Requested-With: XMLHttpRequest
Referer: https://www.pinterest.com/login/?next=https%3A%2F%2Fwww.pinterest.com%2F%3Fusername%3DUSER&prev=https%3A%2F%2Fwww.pinterest.com%2F%3Fusername%3DUSER
Content-Length: 456
Cookie: __utma=229774877.1495817695.1423754956.1424404967.1424434787.45; __utmz=229774877.1424125793.30.5.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided); csrftoken=suv5Dm0MHGc3tWY4GTPHzgBjYSXo94xt; _pinterest_sess=TWc9PSZmWTFLSWM5cGx5aEhiM0ZTdHR2R21xS2JMVlVPejZYV1lMZWZadXBtak9icVlaRjdKZGozMU5vY3k4ZXRVUjZCQS90aFI0NndIeTNWWnR5RkVHY0VtSlM1UHRIZm01UFNGY093OHk0US9GRGY5Qk1FT0JsVEZjdTVSMDA5ODdPZUhhd2tvcWJVc3hqYmlNdG9PLytMQXc9PSZ5RXRjOUdvZFI0L1hoWTVFMnlsb2lNKzRSTW89; _b="AQ1q3LoHG1dIHash9bxk4SiJLwh9Pie2j1AhDB2OYuDFJcwxnUdVLzs9hLcTSKS53mU="; _pinterest_pfob=disabled; c_dpr=1; __utmb=229774877.28.4.1424435987021; __utmc=229774877; __utmt=1; logged_out=True; fba=True; GCSCE_5B243246522C4B23F685F2EB9D5F3C78DF8A0272_S3=C=694505692171-31closf3bcmlt59aeulg2j81ej68j6hk.apps.googleusercontent.com:S=c313ffc1a154b200119a21be80be878b703de85b.BK7j4ooMbUBBATCa.2d62:I=1424435991:X=1424522391
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
source_url=/login/
&data=
{
    "options":
    {
        "username_or_email":"EMAIL@EMAIL.COM",
        "password":"PASSWORD1GOES2HERE"
    },
    "context":{}
}
&module_path=App()>LoginPage()>Login()>Button(text=Log In, size=large, class_name=primary, type=submit)

我不确定为什么你的代码不起作用,但我很确定array_merge会弄乱数字键(如果有的话)。。并且您没有正确处理X-CSRFToken标头(它在几个地方发生了变化,您只检查了一次)。。无论如何,在没有api的情况下完成这项工作并不像看上去那么容易,但是这在2015年2月22日起生效,但要小心用户名/密码,因为我可能没有正确地转义它(应该以某种方式用json_encode()转义它)

编辑:更新代码,以便您在最后一次请求时获得登录的HTML。(毫无疑问,这证明你实际上已经登录了;)我检查它的方式是将输出base64_encode(),然后在我的浏览器中运行这个javascript:document.body.uterHTML=atob("base64"),然后我看到了相同的"您已登录"屏幕)

<?php
error_reporting(E_ALL);
set_error_handler("exception_error_handler");
function exception_error_handler($errno, $errstr, $errfile, $errline ) {
    if (!(error_reporting() & $errno)) {
        // This error code is not included in error_reporting
        return;
    }
    throw new ErrorException($errstr, 0, $errno, $errfile, $errline);
}

$curlh=hhb_curl_init(array(
CURLOPT_USERAGENT=>"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.115 Safari/537.36"
,CURLOPT_HEADER=>true
)
);
$username="f327410@trbvm.com";
$password="f327410@trbvm.compassword";
$matches=array();
$info=hhb_curl_exec($curlh,'https://www.pinterest.com/login/?next=https%3A%2F%2Fwww.pinterest.com%2F&prev=https%3A%2F%2Fwww.pinterest.com%2F');//get session cookie and stuff (should be handled by curl automatically)
preg_match("/csrftoken'=([^';]*)/",$info,$matches);
$CSRFToken=$matches[1];
curl_setopt_array($curlh,array(
CURLOPT_URL=>'https://www.pinterest.com/resource/UserSessionResource/create/'
,CURLOPT_POST=>true
,CURLOPT_ENCODING=>"gzip, deflate"
,CURLOPT_HTTPHEADER=>array(
    'Accept:application/json, text/javascript, */*; q=0.01',
    'Accept-Language:nb-NO,nb;q=0.8,no;q=0.6,nn;q=0.4,en-US;q=0.2,en;q=0.2',
    'Connection:keep-alive',
    //TODO: Content-Length:414
    'Content-Type:application/x-www-form-urlencoded; charset=UTF-8',
//Cookie:csrftoken=wu1TXmJFeCD1q5scixeeK8QFkHSIIXg1; _pinterest_sess=TWc9PSZIbitpRE1Ka2tuRmNXTGNHY3NXQS9reXVvNENxdytpM3BkMCswNldrOUk5WDRucEk5UldYWEIwUERlWG84YXFOT1VrdlRiVHVIMUxTMkthM3hrYTZLTkM0NWJHQzFiQzVvdUQ5Ynp1Q255OUFBOEFVOWFpSzh4NHo2SC9RcTJ5M3NiNEt3YmliTmR2YTRyb0RPMlN3elE9PSZxUWtoVkZ3c0xXYkhMNEtYQVZBWXY5ak1Ec2s9; c_dpr=1; __utmt=1; __utma=229774877.1252202543.1424620619.1424620619.1424620619.1; __utmb=229774877.5.7.1424620619; __utmc=229774877; __utmz=229774877.1424620619.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)
    'Host:www.pinterest.com',
    'Origin:https://www.pinterest.com',
    'Referer:https://www.pinterest.com/',
    'X-APP-VERSION:7c24931',
    'X-CSRFToken:'.$CSRFToken,
    'X-NEW-APP:1',
    'X-Pinterest-AppState:active',
    'X-Requested-With:XMLHttpRequest',
    )
,CURLOPT_POSTFIELDS=>
'source_url='.rawurlencode('/login/?next=https%3A%2F%2Fwww.pinterest.com%2F&prev=https%3A%2F%2Fwww.pinterest.com%2F').
'&data='.rawurlencode('{"options":{"username_or_email":"'.$username.'","password":"'.$password.'"},"context":{}}').
//not sure if username/password is escaped correctly.
'&module_path='.rawurlencode('App()>LoginPage()>Login()>Button(text=Logg inn, size=large, class_name=primary, type=submit)')
));
$info=hhb_curl_exec($curlh,'https://www.pinterest.com/resource/UserSessionResource/create/');;
$matches=array();
preg_match("/csrftoken'=([^';]*)/",$info,$matches);
$CSRFToken=$matches[1];
//var_dump(__LINE__,$matches,$info);die();
//^this is interesting..
curl_setopt_array($curlh,array(
CURLOPT_URL=>"https://www.pinterest.com/resource/UserRegisterTrackActionResource/update/"
,CURLOPT_POST=>true
,CURLOPT_ENCODING=>"gzip, deflate"
,CURLOPT_HTTPHEADER=>array(
    "Origin:https://www.pinterest.com",
    "Accept-Language:nb-NO,nb;q=0.8,no;q=0.6,nn;q=0.4,en-US;q=0.2,en;q=0.2",
    "Accept:application/json, text/javascript, * /*; q=0.01",
    "X-Requested-With:XMLHttpRequest",
    "X-NEW-APP:1",
    "X-APP-VERSION:7c24931",
    "X-Pinterest-AppState:active",
    "Referer:https://www.pinterest.com/",
    "Connection:keep-alive",
    //TODO: Content-Length:358
    "Content-Type:application/x-www-form-urlencoded; charset=UTF-8",
    "Host:www.pinterest.com",
    "X-CSRFToken:".$CSRFToken//TODO: verify that the token has not changed.
    )
,CURLOPT_POSTFIELDS=>
'source_url='.rawurlencode('/login/?next=https%3A%2F%2Fwww.pinterest.com%2F&prev=https%3A%2F%2Fwww.pinterest.com%2F').
'&data='.rawurlencode('{"options":{"action":"setting_new_window_location"},"context":{}}').
//not sure if username/password is escaped correctly.
'&module_path='.rawurlencode('App()>LoginPage()>Login()>Button(text=Logg inn, size=large, class_name=primary, type=submit)')
));
$info=hhb_curl_exec($curlh,'https://www.pinterest.com/resource/UserRegisterTrackActionResource/update/');
//var_dump(__LINE__,$info);die();
//now we should be logged in! :D
curl_setopt_array($curlh,array(
CURLOPT_URL=>"https://www.pinterest.com/resource/UserRegisterTrackActionResource/update/"
,CURLOPT_POST=>false
,CURLOPT_ENCODING=>"gzip, deflate"
,CURLOPT_HTTPHEADER=>array(
    "Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    "Accept-Language:nb-NO,nb;q=0.8,no;q=0.6,nn;q=0.4,en-US;q=0.2,en;q=0.2",
    "Connection:keep-alive",
    "Host:www.pinterest.com",
    "Referer:https://www.pinterest.com/"
    )
));
/*
//fuckthis Accept-Encoding:gzip, deflate, sdch
//Cookie:c_dpr=1; __utmt=1; __utma=229774877.1252202543.1424620619.1424620619.1424620619.1; __utmb=229774877.5.7.1424620619; __utmc=229774877; __utmz=229774877.1424620619.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); _b="AQ3m6m5qQAVDaIkyqRoJYJ9ecazmK4aobP3PczTxb/BtXObCwlC/5kusK9/Ymj2luo8="; csrftoken=EitE4BCiLq3sz0hf5lHtCx6uNvyIaalo; _pinterest_sess="TWc9PSZLclVramZrWGRUMVYzZW1ZbmxXTXFXeWpHU2ZOVFBFNmFUOXU2ZlNJWFJ0TkdzTy9TZ1RQdmxtNmxZa0JNWnliR2VRS2t6UTRZVGtSVnNySlJKRVBEUjh4K3FWR2gxYi8yS3AxTmhqWW9COWZWaGJiK1Q2Y29ydjhLSGRDb2srdTdHaVh6RU12SEZnVmxlM09UNEloQU9JKzQxRDNqOFlISHRHZ0hIVW9kTUttWlhEd1BOaTJnbHZYTDZ5enBRSGtubDJaSnNKSjlzaG9SaWsrMFZaenhLeWpVaElxbTdZOG1sa3ZGeWQ3MWNFQy81YmtHQkxsZDlBQVNEK1FTUUJEYWZqV2tUMzVDVVM4R1VXL0lCOHZ3MHhPcC81YVZjOWRnSkZoTXFVQXRLU21OK05PZmtFczNvY2ZGdVRMS2pWdXR0WG8wakJTeTdYNlRqV3NZVmtHQzBsM0VyVnhVeXIzVkRWdXlqT3Q1eFVqWUJwVkxuR1ZwY3M5YXJBU2xKQ0lua3U1UkRxYWk0c0lVR1lJcHpMOUZNQXo0ZWlRSDRlaGVSa3NUaEFnREl2Q2lvN0xQc05DNjk5emNESDdaM3YxVmFwNU9KVFhLUGJBVStLcVZjVk1pMlREa3JzcW1FWEdSMGF0cXdvTlpGaz0mYmpVenZYQk1UY0xsN0Y1ZXRzTGhLV2FyRThRPQ=="
*/
$info=hhb_curl_exec($curlh,'https://www.pinterest.com');
var_dump(__LINE__,$info);die();
/*    
//Cookie:c_dpr=1; __utmt=1; __utma=229774877.1252202543.1424620619.1424620619.1424620619.1; __utmb=229774877.5.7.1424620619; __utmc=229774877; __utmz=229774877.1424620619.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); _b="AQ3m6m5qQAVDaIkyqRoJYJ9ecazmK4aobP3PczTxb/BtXObCwlC/5kusK9/Ymj2luo8="; csrftoken=EitE4BCiLq3sz0hf5lHtCx6uNvyIaalo; _pinterest_sess="TWc9PSZLclVramZrWGRUMVYzZW1ZbmxXTXFXeWpHU2ZOVFBFNmFUOXU2ZlNJWFJ0TkdzTy9TZ1RQdmxtNmxZa0JNWnliR2VRS2t6UTRZVGtSVnNySlJKRVBEUjh4K3FWR2gxYi8yS3AxTmhqWW9COWZWaGJiK1Q2Y29ydjhLSGRDb2srdTdHaVh6RU12SEZnVmxlM09UNEloQU9JKzQxRDNqOFlISHRHZ0hIVW9kTUttWlhEd1BOaTJnbHZYTDZ5enBRSGtubDJaSnNKSjlzaG9SaWsrMFZaenhLeWpVaElxbTdZOG1sa3ZGeWQ3MWNFQy81YmtHQkxsZDlBQVNEK1FTUUJEYWZqV2tUMzVDVVM4R1VXL0lCOHZ3MHhPcC81YVZjOWRnSkZoTXFVQXRLU21OK05PZmtFczNvY2ZGdVRMS2pWdXR0WG8wakJTeTdYNlRqV3NZVmtHQzBsM0VyVnhVeXIzVkRWdXlqT3Q1eFVqWUJwVkxuR1ZwY3M5YXJBU2xKQ0lua3U1UkRxYWk0c0lVR1lJcHpMOUZNQXo0ZWlRSDRlaGVSa3NUaEFnREl2Q2lvN0xQc05DNjk5emNESDdaM3YxVmFwNU9KVFhLUGJBVStLcVZjVk1pMlREa3JzcW1FWEdSMGF0cXdvTlpGaz0mYmpVenZYQk1UY0xsN0Y1ZXRzTGhLV2FyRThRPQ=="

Response Headersview source
Accept-Ranges:bytes
Cache-Control:no-cache, no-store, must-revalidate, max-age=0
Connection:keep-alive
Content-Encoding:gzip
Content-Length:348
Content-Type:application/json; charset=utf-8
Date:Sun, 22 Feb 2015 15:57:42 GMT
Expires:Thu, 01 Jan 1970 00:00:00 GMT
Pinterest-Breed:CORGI
Pinterest-Generated-By:ngapp2-1af98e48
Pinterest-Version:7c24931
Pragma:no-cache
Server:nginx
Set-Cookie:_pinterest_pfob=disabled; Domain=.pinterest.com; expires=Wed, 21-Feb-2018 15:57:42 GMT; Max-Age=94607999; Path=/
Vary:User-Agent, Accept-Encoding



 */



function hhb_curl_init($custom_options_array = array()) {
    if(empty($custom_options_array)){
        $custom_options_array=array();
//i feel kinda bad about this.. argv[1] of curl_init wants a string(url), or NULL
//at least i want to allow NULL aswell :/
    }
    if (!is_array($custom_options_array)) {
        throw new InvalidArgumentException('$custom_options_array must be an array!');
    };
    $options_array = array(
        CURLOPT_AUTOREFERER => true,
        CURLOPT_BINARYTRANSFER => true,
        CURLOPT_COOKIESESSION => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_FORBID_REUSE => false,
        CURLOPT_HTTPGET => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_SSL_VERIFYPEER => false,
        CURLOPT_CONNECTTIMEOUT => 10,
        CURLOPT_TIMEOUT => 11,
        CURLOPT_ENCODING=>"",
        CURLOPT_REFERER=>'example.org',
        CURLOPT_USERAGENT=>'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.146 Safari/537.36'
    );
    if (!array_key_exists(CURLOPT_COOKIEFILE, $custom_options_array)) {
        //do this only conditionally because tmpfile() call..
         static $curl_cookiefiles_arr=array();//workaround for https://bugs.php.net/bug.php?id=66014
     $curl_cookiefiles_arr[]=$options_array[CURLOPT_COOKIEFILE] = tmpfile();
     $options_array[CURLOPT_COOKIEFILE] =stream_get_meta_data($options_array[CURLOPT_COOKIEFILE]);
     $options_array[CURLOPT_COOKIEFILE]=$options_array[CURLOPT_COOKIEFILE]['uri']; 
    }
    //we can't use array_merge() because of how it handles integer-keys, it would/could cause corruption
    foreach($custom_options_array as $key => $val) {
        $options_array[$key] = $val;
    }
    unset($key, $val, $custom_options_array);
    $curl = curl_init();
    curl_setopt_array($curl, $options_array);
    return $curl;
}
$hhb_curl_domainCache = "";
function hhb_curl_exec($ch, $url) {
    global $hhb_curl_domainCache; //
    //$hhb_curl_domainCache=&$this->hhb_curl_domainCache;
    //$ch=&$this->curlh;
        if(!is_resource($ch) || get_resource_type($ch)!=='curl')
    {
        throw new InvalidArgumentException('$ch must be a curl handle!');
    }
    if(!is_string($url))
    {
        throw new InvalidArgumentException('$url must be a string!');
    }
    $tmpvar = "";
    if (parse_url($url, PHP_URL_HOST) === null) {
        if (substr($url, 0, 1) !== '/') {
            $url = $hhb_curl_domainCache.'/'.$url;
        } else {
            $url = $hhb_curl_domainCache.$url;
        }
    };
    curl_setopt($ch, CURLOPT_URL, $url);
    $html = curl_exec($ch);
    if (curl_errno($ch)) {
        throw new Exception('Curl error (curl_errno='.curl_errno($ch).') on url '.var_export($url, true).': '.curl_error($ch));
        // echo 'Curl error: ' . curl_error($ch);
    }
    if ($html === '' && 203 != ($tmpvar = curl_getinfo($ch, CURLINFO_HTTP_CODE)) /*203 is "success, but no output"..*/ ) {
        throw new Exception('Curl returned nothing for '.var_export($url, true).' but HTTP_RESPONSE_CODE was '.var_export($tmpvar, true));
    };
    //remember that curl (usually) auto-follows the "Location: " http redirects..
    $hhb_curl_domainCache = parse_url(curl_getinfo($ch, CURLINFO_EFFECTIVE_URL), PHP_URL_HOST);
    return $html;
}

您可以在此处看到代码的实时运行:http://codepad.viper-7.com/D8qk6q(几天内,直到服务器删除代码。或者直到某个互联网白痴更改密码。显然,这是一个一次性帐户)

我敢肯定,如果不获得所需的request_identifier,这是行不通的。为了解释,当你加载页面时,你会得到一个"会话"的唯一编号,当你要登录时会进行比较。这是为了避免CSRF(跨站点请求伪造)。如果您检查实际的POST,您会注意到不仅发布了用户名或密码,还发布了一些项目。

我认为应该使用https而不是http

$ch = curl_init('https://www.pinterest.com/'); // <-- HERE

并评论这一行:

// $login_header[] = 'X-CSRFToken: ' . $csrf_token[1];