Using PHP & cURL to log in to my website's form

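I am trying to log in to my user account on a site I use to download files, so that I can fetch the files automatically without visiting the site.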

This is the form:

 <form method='post' action='/news.php'>
 <div>
             Username: <input class='tbox' type='text'     name='username' size='15' value='' maxlength='20' />&nbsp;&nbsp;
             Password: <input class='tbox' type='password' name='userpass' size='15' value='' maxlength='20' />&nbsp;&nbsp;
             <input type='hidden' name='autologin' value='1' />
             <input class='button' type='submit' name='userlogin' value='Login' />
 </div>
 </form>

This is the PHP I have so far.

<?php
$username="my_user"; 
$password="my_passs"; 
$url="the_url"; 
$cookie="cookie.txt"; 
$postdata = "username=".$username."&userpass=".$password; 
$ch = curl_init(); 
curl_setopt ($ch, CURLOPT_URL, $url); 
curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, FALSE); 
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6"); 
curl_setopt ($ch, CURLOPT_TIMEOUT, 60); 
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 0); 
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt ($ch, CURLOPT_COOKIEJAR, $cookie); 
curl_setopt ($ch, CURLOPT_REFERER, $url); 
curl_setopt ($ch, CURLOPT_POSTFIELDS, $postdata); 
curl_setopt ($ch, CURLOPT_POST, 1); 
$result = curl_exec ($ch); 
echo $result;  
curl_close($ch);
?>

Am I doing something wrong? It currently just displays the site but does not log me in. I have never used cURL before.

Thanks

You probably need to set the COOKIESESSION and COOKIEJAR options to preserve the session, and then make another request:

//initial request with login data
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com/login.php');
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/32.0.1700.107 Chrome/32.0.1700.107 Safari/537.36');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, "username=XXXXX&password=XXXXX");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie-name');  //could be empty, but cause problems on some hosts
curl_setopt($ch, CURLOPT_COOKIEFILE, '/var/www/ip4.x/file/tmp');  //could be empty, but cause problems on some hosts
$answer = curl_exec($ch);
if (curl_error($ch)) {
    echo curl_error($ch);
}
//another request preserving the session
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com/profile');
curl_setopt($ch, CURLOPT_POST, false);
curl_setopt($ch, CURLOPT_POSTFIELDS, "");
$answer = curl_exec($ch);
if (curl_error($ch)) {
    echo curl_error($ch);
}

You should send via POST all the data that the original form sends. Your $postdata is missing autologin=1&userlogin=Login.

$postdata = "username=$username&userpass=$password&autologin=1&userlogin=Login";
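
Concatenating the values by hand breaks as soon as the password contains a character such as & or a space. A minimal sketch of building the same body with http_build_query(), which URL-encodes each value; the function name build_login_postdata is made up for this example, and the field names are the ones from the form in the question:

```php
<?php
// Hypothetical helper (not from the original answer): builds the POST body
// for the login form shown in the question, URL-encoding every value.
function build_login_postdata(string $username, string $password): string
{
    $fields = array(
        'username'  => $username,
        'userpass'  => $password,
        'autologin' => '1',      // hidden field in the form
        'userlogin' => 'Login',  // name/value of the submit button
    );
    // Pass the separator explicitly so php.ini's arg_separator.output cannot change it
    return http_build_query($fields, '', '&');
}

$postdata = build_login_postdata('my_user', 'my_passs');
```

The resulting string can be passed to CURLOPT_POSTFIELDS exactly like the hand-built one.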

If your line looks like this, with a stray double quote after $password:

$postdata = "username=".$username."&userpass=".$password"; 

change it to:

$postdata = "username=".$username."&userpass=".$password;

Also, do you have something like this?

$url="http://www.yourdomain.com/news.php";

Also add this: curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1);

This might also help:

$headers  = array();
$headers[] = 'Accept: application/xhtml+voice+xml;version=1.2, application/x-xhtml+voice+xml;version=1.2, text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1';
$headers[] = 'Connection: Keep-Alive';
$headers[] = 'Content-type: application/x-www-form-urlencoded;charset=UTF-8';
curl_setopt ($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt ($ch, CURLOPT_HEADER, 1);

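Before validating the user's credentials, the page may check whether userlogin (the submit button) has been set.

It may be worth trying the following: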

$postdata = "username=".$username."&userpass=".$password . "&userlogin=Login"; 

When you request a page from a site you have previously logged in to, you need to use

curl_setopt ($ch, CURLOPT_COOKIEFILE, $Cookie); 

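Then you should check the output to determine whether you are currently logged in (every site is different, but usually you are logged in if the login form is not present or a logout button is present).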
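
As a rough illustration of such a check, here is a sketch that treats the page as "logged in" when the login form's password field is no longer present. The heuristic and the field name userpass are assumptions based on the form in the question; a real site may need a different test, such as looking for a logout link.

```php
<?php
// Sketch only: assumes the login form (with its "userpass" field) disappears
// from the page once you are logged in. Adjust the pattern for other sites.
function CheckLoggedIn(string $Source): bool
{
    if ($Source === '') {
        // An empty response (for example a failed request) is not a logged-in page
        return false;
    }
    // Matches name='userpass', name="userpass" or name=userpass
    return preg_match('/name\s*=\s*[\'"]?userpass\b/i', $Source) !== 1;
}
```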

If you are not logged in, then instead of CURLOPT_COOKIEFILE, include the following line:

curl_setopt ($ch, CURLOPT_COOKIEJAR, $Cookie);

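I created 2 different but similar functions, CurlPage() and CurlLogin(). The only difference is that CurlPage() has the COOKIEFILE option, while CurlLogin() has the COOKIEJAR option plus the following 2 lines: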

curl_setopt ($ch, CURLOPT_POSTFIELDS, $PostData);
curl_setopt ($ch, CURLOPT_POST, 1); 

Then I call the functions like this:

$Source = CurlPage($Url, $Cookie);
if (!CheckLoggedIn($Source))
{
    CurlLogin($LoginUrl, $Cookie, $PostDataArray);
    $Source = CurlPage($Url, $Cookie);
}
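
The bodies of CurlPage() and CurlLogin() are not shown above, so the following is only a guess at how they might look. The helpers CurlOptions() and CurlRun(), the timeout and the FOLLOWLOCATION setting are assumptions added for this sketch; only the COOKIEFILE / COOKIEJAR / POST split comes from the description.

```php
<?php
// Sketch only: one possible shape for the two functions described above.
// CurlOptions() and CurlRun() are hypothetical helpers, not part of the answer.
function CurlOptions(string $Url, string $Cookie, ?array $PostData = null): array
{
    $Options = array(
        CURLOPT_URL            => $Url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 60,
    );
    if ($PostData === null) {
        // Page request: send the cookies saved by an earlier login
        $Options[CURLOPT_COOKIEFILE] = $Cookie;
    } else {
        // Login request: store received cookies in the file and POST the form fields
        $Options[CURLOPT_COOKIEJAR]  = $Cookie;
        $Options[CURLOPT_POST]       = true;
        $Options[CURLOPT_POSTFIELDS] = http_build_query($PostData, '', '&');
    }
    return $Options;
}

function CurlRun(array $Options): string
{
    $ch = curl_init();
    curl_setopt_array($ch, $Options);
    $Result = curl_exec($ch);
    // On PHP 7 this closes the handle and writes the cookie jar; on PHP 8+
    // the handle is released when $ch goes out of scope at the end of the function.
    curl_close($ch);
    return $Result === false ? '' : $Result;
}

function CurlPage(string $Url, string $Cookie): string
{
    return CurlRun(CurlOptions($Url, $Cookie));
}

function CurlLogin(string $Url, string $Cookie, array $PostData): string
{
    return CurlRun(CurlOptions($Url, $Cookie, $PostData));
}
```

Keeping the option list in its own function makes it easy to inspect what will be sent without making a request.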

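Keep in mind that some sites require more than one page to log in. First you submit a form, then you have to enter a captcha, click a button, or do something else. If that is the case, your login function may have to read the page source and perform additional steps before you are logged in and the cookie you need has been created and stored in cookie.txt.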
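
One common form of "reading the source first" is collecting the hidden fields of the login form (tokens, flags such as autologin) so they can be posted back along with the credentials. A minimal sketch using PHP's DOM extension; the function name ExtractHiddenFields is made up for this example, and it assumes the fields you need are plain hidden inputs in the fetched HTML:

```php
<?php
// Sketch only: returns name => value for every <input type="hidden"> in the HTML.
function ExtractHiddenFields(string $Html): array
{
    $Fields = array();
    if ($Html === '') {
        return $Fields; // loadHTML() rejects an empty string on PHP 8
    }
    $Doc = new DOMDocument();
    // Real-world HTML is rarely well-formed; collect parser warnings silently
    $Previous = libxml_use_internal_errors(true);
    $Doc->loadHTML($Html);
    libxml_clear_errors();
    libxml_use_internal_errors($Previous);

    foreach ($Doc->getElementsByTagName('input') as $Input) {
        $Name = $Input->getAttribute('name');
        if ($Name !== '' && strtolower($Input->getAttribute('type')) === 'hidden') {
            $Fields[$Name] = $Input->getAttribute('value');
        }
    }
    return $Fields;
}
```

The returned array can be merged with the username and password before it is passed to the login request.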

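Use a headless browser - a truly scalable solution. (Let me know if it works for a Google account :)

What I did (useful for myself as well :)

  1. Install Composer https://getcomposer.org (if it is not installed). Make sure it is installed by typing on the command line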

         composer -V
    
  2. Create a folder, say TryGoutte, somewhere in your web server directory

  3. Create a file composer.json (just to test Composer):

     {
       "require": {
           "monolog/monolog": "1.0.*"
        }
     } 
    
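  4. Type "composer install". It should install monolog.

  5. Type "composer require fabpot/goutte". It should install all the packages for "goutte" https://github.com/FriendsOfPHP/Goutte (pronounced gu:t, like "boots")

  6. Then create a file, say try-goutte.php, in TryGoutte.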

    <?php
    use Goutte\Client;
    use GuzzleHttp\Client as GuzzleClient;
    require 'vendor/autoload.php';
    $client = new \Goutte\Client();
    // Create and use a Guzzle client instance that will time out after 90 seconds
    $guzzleClient = new \GuzzleHttp\Client(array(
        'timeout' => 90,
        // To overcome cURL SSL error 60
        // https://github.com/FriendsOfPHP/Goutte/issues/214
        // (this turns off certificate verification, so use it for testing only)
        'verify' => false,
    ));
    $client->setClient($guzzleClient);
    $crawler = $client->request('GET', 'https://github.com/');
    $crawler = $client->click($crawler->selectLink('Sign in')->link());
    $form = $crawler->selectButton('Sign in')->form();
    $crawler = $client->submit($form, array('login' => 'trygoutte', 'password' => 'trygoutte1'));
    print_r($crawler->text());
    ?>
    

Enjoy and keep coding!

Update: implemented here http://lycenok.com/site-login/programmatic-site-login.php so you can check whether the solution works for you