cURL works with a hard-coded URL but fails with a URL extracted by preg_match

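The website I'm working with stores two cookies when you visit it (ASP.NET_SessionId and __RequestVerificationToken_XXXXXXXXX).

The page consists of a div containing a link to the PDF and an iframe whose source is a "PDF viewer".

I'm trying to use cURL to retrieve both cookies and then download the PDF. I found that I had to set several options in cURL. However, I still can't download the PDF.

My current setup is:

1. Request the main page and (a) save the ASP.NET_SessionId cookie, (b) find the "PDF viewer" URL in the iframe, (c) find the PDF download URL
2. Request the "PDF viewer" URL and save the __RequestVerificationToken_XXXXXXXXX cookie
3. Build a Cookie header from steps 1 and 2
4. Download the file with cURL, using the PDF download URL and sending the Cookie header

However, the file I end up with is just a login page.

The first cURL request: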

$agent= 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:36.0) Gecko/20100101 Firefox/36.0';
$report_url = "[my_main_url_here]";
$ch1 = curl_init($report_url);
curl_setopt($ch1, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch1, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch1, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch1, CURLOPT_HEADER, true);
curl_setopt($ch1, CURLOPT_SSLVERSION, 4);
curl_setopt($ch1, CURLOPT_USERAGENT, $agent);
curl_setopt($ch1, CURLOPT_SSL_CIPHER_LIST, 'AES128-SHA:RC2-CBC-MD5');
curl_setopt($ch1, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch1, CURLOPT_VERBOSE, true);
curl_setopt($ch1, CURLOPT_NOBODY, false);
$output1 = curl_exec($ch1);
curl_close($ch1);

I use preg_match to find the PDF download link:

preg_match("/'/ReportID=.{30}/", $output1, $pdf_link);
$pdf_link_full = "https://gate.aon.com" . $pdf_link[0];

Then I request the PDF viewer URL to get the second cookie:

$ch2 = curl_init($viewer_url_full);
curl_setopt($ch2, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch2, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch2, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch2, CURLOPT_HEADER, true);
curl_setopt($ch2, CURLOPT_SSLVERSION, 4);
curl_setopt($ch2, CURLOPT_USERAGENT, $agent);
curl_setopt($ch2, CURLOPT_SSL_CIPHER_LIST, 'AES128-SHA:RC2-CBC-MD5');
curl_setopt($ch2, CURLOPT_VERBOSE, true);
curl_setopt($ch2, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch2, CURLOPT_NOBODY, false);
$output2 = curl_exec($ch2);
curl_close($ch2);

Then I extract the cookies from the headers of the two responses:

preg_match("/ASP.NET_SessionId=......................../", $output1, $cookie1);
preg_match("/__RequestVerificationToken_.{145}/", $output2, $cookie2);
$cookies = 'Cookie: ' . $cookie1[0] . '; ' . $cookie2[0];
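
The fixed-length patterns above depend on each cookie value always having the same number of characters. A minimal sketch of a less brittle approach is to read the name and value from every Set-Cookie header line. The helper names `extract_cookies` and `build_cookie_header` are hypothetical, and the sketch assumes the raw response was fetched with CURLOPT_HEADER enabled, as in the requests above.

```php
<?php
// Hypothetical helper: collect cookie name/value pairs from a raw response
// that includes headers (i.e. fetched with CURLOPT_HEADER enabled).
function extract_cookies(string $rawResponse): array
{
    $cookies = [];
    // Match every "Set-Cookie: name=value" header line; the value ends at
    // the first ";" or at the end of the line.
    $pattern = '/^Set-Cookie:\s*([^=;\s]+)=([^;\r\n]*)/mi';
    if (preg_match_all($pattern, $rawResponse, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $cookies[$m[1]] = $m[2]; // a later header overrides an earlier one
        }
    }
    return $cookies;
}

// Hypothetical helper: build a "Cookie: a=1; b=2" request header line.
function build_cookie_header(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return 'Cookie: ' . implode('; ', $pairs);
}
```

With these helpers, the header could be built as `build_cookie_header(extract_cookies($output1) + extract_cookies($output2))`. Note that the sketch scans the whole response string, so a body line that happens to start with "Set-Cookie:" would also match.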

Then I try to download the file:

$headers = array ($cookies);
$file = fopen ('Report.pdf', 'w+');
$ch3 = curl_init($pdf_link_full);
curl_setopt($ch3, CURLOPT_SSL_CIPHER_LIST, 'AES128-SHA:RC2-CBC-MD5');
curl_setopt($ch3, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch3, CURLOPT_FILE, $file);
curl_setopt($ch3, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch3, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch3, CURLOPT_SSLVERSION, 4);
curl_setopt($ch3, CURLOPT_USERAGENT, $agent);
curl_setopt($ch3, CURLOPT_COOKIEFILE, "cookie.txt");
$output3 = curl_exec($ch3);
curl_close($ch3);
fclose($file); // flush and close Report.pdf

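Edit: If I set $pdf_link_full manually, it works. But if I obtain it with preg_match (as above), it fails.

However, if I print $pdf_link_full and $pdf_link_full_2, they look exactly the same. Am I missing an encoding issue or something else? Thanks.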

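The problem was in my preg_match. It returned a URL containing &amp; (the HTML-encoded form found in the page source), whereas the URL I set manually used a plain ampersand (&). If the strings are echoed to a browser, &amp; is rendered as &, which would explain why the two looked identical.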
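
One way to catch this kind of mismatch is to compare the strings programmatically rather than by eye. The sketch below uses two made-up URLs (they are placeholders, not the real report URLs) to stand in for the extracted and hand-typed values.

```php
<?php
// Hypothetical values standing in for the scraped and the hand-typed URL.
$extracted = '/Report?ReportID=1&amp;Format=pdf';
$manual    = '/Report?ReportID=1&Format=pdf';

// A strict comparison shows whether the strings really are identical.
var_dump($extracted === $manual);

// Differing lengths reveal extra characters that a browser may hide.
var_dump(strlen($extracted), strlen($manual));

// Escaping before output makes a browser display the raw characters
// instead of interpreting the entity.
echo htmlspecialchars($extracted), "\n";
```

Viewing the page source instead of the rendered page, or running the script from the command line, would expose the difference as well.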

Replacing &amp; with & solved the problem.
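
A minimal sketch of that fix, assuming the URL was scraped from HTML source. The helper name `decode_scraped_url` and the sample URL are hypothetical. `html_entity_decode` is used here as a more general alternative to a plain string replacement, since it decodes other entities too.

```php
<?php
// Hypothetical helper: turn a URL scraped from HTML source into one that
// can be passed to cURL.
function decode_scraped_url(string $url): string
{
    // Converts &amp; (and other HTML entities) back to literal characters.
    return html_entity_decode($url, ENT_QUOTES, 'UTF-8');
}

// Made-up example URL, not the real report URL.
$scraped = '/Report?ReportID=1&amp;Format=pdf';
$decoded = decode_scraped_url($scraped);
echo $decoded, "\n";
```

The narrower equivalent of what is described above would be `str_replace('&amp;', '&', $url)`, which leaves every other entity untouched.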