PHP cURL HTTP requests through a proxy fail 95% of the time. Why?

Using cURL in PHP, I want to make a simple request to "https://www.instagram.com".

$curl->get("https://www.instagram.com");

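When I use curl without a proxy, the request usually completes (sometimes curl fails the first time, then retries and succeeds on the second attempt).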

< Content-Type: text/html
< Date: Fri, 12 Jun 2015 11:35:56 GMT
< Location: https://instagram.com/
* Server nginx is not blacklisted
< Server: nginx
< Content-Length: 178
< Connection: keep-alive
<
* Ignoring the response-body
* Connection #1 to host www.instagram.com left intact
* Issue another request to this URL: 'https://instagram.com/'
* Hostname was NOT found in DNS cache
*   Trying 54.88.218.232...
* Connected to instagram.com (54.88.218.232) port 443 (#2)
* found 173 certificates in /etc/ssl/certs/ca-certificates.crt
*        server certificate verification SKIPPED
*        common name: *.instagram.com (matched)
*        server certificate expiration date OK
*        server certificate activation date OK
*        certificate public key: RSA
*        certificate version: #3
*        subject: C=US,ST=CA,L=Menlo Park,O=Instagram LLC,CN=*.instagram.com
*        start date: Tue, 14 Apr 2015 00:00:00 GMT
*        expire date: Thu, 15 Oct 2015 12:00:00 GMT
*        issuer: C=US,O=DigiCert Inc,OU=www.digicert.com,CN=DigiCert High Assurance CA-3
*        compression: NULL
*        cipher: AES-128-CBC
*        MAC: SHA1
> GET / HTTP/1.1
Host: instagram.com
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip,deflate
Connection: Keep-Alive
Cache-Control:no-cache
Content-type: application/x-www-form-urlencoded;charset=UTF-8
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0
< HTTP/1.1 200 OK
< Cache-Control: private, no-cache, no-store, must-revalidate
< Content-Encoding: gzip
< Content-Language: en
< Content-Type: text/html
< Date: Fri, 12 Jun 2015 11:35:59 GMT
< Expires: Sat, 01 Jan 2000 00:00:00 GMT
< Pragma: no-cache
* Added cookie csrftoken="c44fef8af558485be2ff78da940bdfd6" for domain instagram.com, path /, expire 1465558381
< Set-Cookie: csrftoken=c44fef8af558485be2ff78da940bdfd6; expires=Fri, 10-Jun-2016 11:35:59 GMT; Max-Age=31449600; Path=/
* Added cookie mid="VXrEHwAEAAE0R2ZgV8aoF27i7VD7" for domain instagram.com, path /, expire 2064828781
< Set-Cookie: mid=VXrEHwAEAAE0R2ZgV8aoF27i7VD7; expires=Thu, 07-Jun-2035 11:35:59 GMT; Max-Age=630720000; Path=/
< Vary: Cookie, Accept-Language, Accept-Encoding
< X-Frame-Options: SAMEORIGIN
< Content-Length: 3049
< Connection: keep-alive
<
* Connection #2 to host instagram.com left intact
<!DOCTYPE html>
<!--[if lt IE 7]>      <html lang="en" class="no-js lt-ie9 lt-ie8 lt-ie7 not-logged-in "> <![endif]-->
<!--[if IE 7]>         <html lang="en" class="no-js lt-ie9 lt-ie8 not-logged-in "> <![endif]-->
<!--[if IE 8]>         <html lang="en" class="no-js lt-ie9 not-logged-in "> <![endif]-->
<!--[if gt IE 8]><!--> <html lang="en" class="no-js not-logged-in "> <!--<![endif]-->

The problem is that when I use a proxy with curl, 95% of the time it hangs at "found 173 certificates in /etc/ssl/certs/ca-certificates.crt" and then times out.

* Rebuilt URL to: https://www.instagram.com/
* Hostname was found in DNS cache
*   Trying 104.144.1.1...
* Connected to 104.144.1.1 (104.144.1.1) port 21269 (#1)
* Establish HTTP proxy tunnel to www.instagram.com:443
* Proxy auth using Basic with user 'proxyusername'
> CONNECT www.instagram.com:443 HTTP/1.1
Host: www.instagram.com:443
Proxy-Authorization: Basic bW9oYW1tYWdoYToyazMzODczMw==
Proxy-Connection: Keep-Alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip,deflate
Connection: Keep-Alive
Cache-Control:no-cache
Content-type: application/x-www-form-urlencoded;charset=UTF-8
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0
< HTTP/1.0 200 Connection established
<
* Proxy replied OK to CONNECT request
* found 173 certificates in /etc/ssl/certs/ca-certificates.crt
* Operation timed out after 0 milliseconds with 0 out of 0 bytes received
* Closing connection 1

Why does using a proxy make curl behave this way on Ubuntu? On Windows, the same code runs very quickly and fetches the instagram.com page, but on Ubuntu it fails 95% of the time.

Here are the curl options:

curl_setopt ( $this->process, CURLOPT_HTTPHEADER, $this->headers );
curl_setopt ( $this->process, CURLOPT_HEADER, 0 );
curl_setopt ( $this->process, CURLOPT_USERAGENT, $this->user_agent );
curl_setopt ( $this->process, CURLOPT_RETURNTRANSFER, 1 );
curl_setopt ( $this->process, CURLOPT_FOLLOWLOCATION, 1 );
curl_setopt ( $this->process, CURLOPT_POST, 0 );
curl_setopt ( $this->process, CURLOPT_ENCODING, $this->compression );
curl_setopt ( $this->process, CURLOPT_TIMEOUT, 40 );
curl_setopt ( $this->process, CURLOPT_SSL_VERIFYHOST, 0 );
curl_setopt ( $this->process, CURLOPT_SSL_VERIFYPEER, 0 );
curl_setopt ( $this->process, CURLOPT_COOKIEFILE, $this->cookie_file );
curl_setopt ( $this->process, CURLOPT_COOKIEJAR, $this->cookie_file );
curl_setopt ( $this->process, CURLOPT_VERBOSE, 1 );
// Proxy settings
curl_setopt ( $this->process, CURLOPT_PROXYTYPE, CURLPROXY_HTTP ); // integer constant rather than the string 'HTTP'
curl_setopt ( $this->process, CURLOPT_PROXY, $url);
curl_setopt ( $this->process, CURLOPT_PROXYPORT, $port);
curl_setopt ( $this->process, CURLOPT_PROXYUSERPWD, $userpass);
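
The proxy options above can also be collected into one array and applied with curl_setopt_array, which makes it easier to compare the Windows and Ubuntu configurations side by side. This is only a minimal sketch, not the asker's class: the function name buildProxyOptions, its parameters, and the placeholder password are hypothetical, and the 15-second CURLOPT_CONNECTTIMEOUT is an assumption added so that a stalled connection phase is reported sooner than the 40-second overall timeout.

```php
<?php
// Hypothetical helper: returns the proxy-related cURL options as one array.
function buildProxyOptions(string $proxyHost, int $proxyPort, string $userPass): array
{
    return [
        CURLOPT_PROXYTYPE      => CURLPROXY_HTTP, // integer constant, not the string 'HTTP'
        CURLOPT_PROXY          => $proxyHost,
        CURLOPT_PROXYPORT      => $proxyPort,
        CURLOPT_PROXYUSERPWD   => $userPass,      // "user:password"
        CURLOPT_CONNECTTIMEOUT => 15,             // assumption: not in the original options
        CURLOPT_TIMEOUT        => 40,             // same overall timeout as the question
    ];
}

// Host and port are taken from the verbose log; the password is a placeholder.
$options = buildProxyOptions('104.144.1.1', 21269, 'proxyusername:password')
    + [CURLOPT_RETURNTRANSFER => true, CURLOPT_VERBOSE => true];

$ch = curl_init('https://www.instagram.com/');
$applied = curl_setopt_array($ch, $options); // true if every option was accepted
```

Calling curl_exec($ch) afterwards, followed by curl_errno($ch) and curl_error($ch) on failure, would show which error code the timeout is reported under.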

Here is the phpinfo() output for curl:

cURL support => enabled
cURL Information => 7.35.0
Age => 3
Features
AsynchDNS => Yes
CharConv => No
Debug => No
GSS-Negotiate => Yes
IDN => Yes
IPv6 => Yes
krb4 => No
Largefile => Yes
libz => Yes
NTLM => Yes
NTLMWB => Yes
SPNEGO => No
SSL => Yes
SSPI => No
TLS-SRP => Yes
Protocols => dict, file, ftp, ftps, gopher, http, https, imap, imaps, ldap, ldaps, pop3, pop3s, rtmp, rtsp, smtp, smtps, telnet, tftp
Host => x86_64-pc-linux-gnu
SSL Version => GnuTLS/2.12.23
ZLib Version => 1.2.8
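
Since the same code behaves differently on Windows and Ubuntu, it may help to print the libcurl build details on both machines and compare them. The sketch below reads the same information as the phpinfo() listing via curl_version(); the helper name describeCurlBuild is hypothetical.

```php
<?php
// Hypothetical helper: summarize the curl_version() fields relevant to this comparison.
function describeCurlBuild(array $info): string
{
    return sprintf(
        'libcurl %s, TLS backend %s, host %s',
        $info['version'] ?? 'unknown',
        $info['ssl_version'] ?? 'unknown',
        $info['host'] ?? 'unknown'
    );
}

// Print the summary for the machine this script runs on, if the curl extension is loaded.
if (function_exists('curl_version')) {
    echo describeCurlBuild(curl_version()), PHP_EOL;
}
```

Running this on both systems gives one line per machine, so a difference in the TLS backend or libcurl version is easy to spot.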
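Your certificates seem to be having some kind of problem. Try the following; it has helped me with a similar (but different) problem in the past.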

update-ca-certificates -f
apt-get install --reinstall ca-certificates
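
After regenerating the bundle, a quick sanity check is to confirm that PHP can read it and that it still contains certificates. This is a rough sketch: the helper name countPemCertificates is hypothetical, the path is the one shown in the verbose log, and the number it prints is a plain count of PEM blocks that may not match exactly what the TLS library reports.

```php
<?php
// Hypothetical helper: count PEM certificate blocks in a bundle's contents.
function countPemCertificates(string $pem): int
{
    return substr_count($pem, '-----BEGIN CERTIFICATE-----');
}

$bundle = '/etc/ssl/certs/ca-certificates.crt'; // path taken from the verbose log

if (is_readable($bundle)) {
    $count = countPemCertificates((string) file_get_contents($bundle));
    printf("%d certificates in %s\n", $count, $bundle);
} else {
    echo "Bundle not readable: $bundle\n";
}
```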

For your reference, here is the man page for update-ca-certificates:

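update-ca-certificates is a program that updates the directory /etc/ssl/certs to hold SSL certificates and generates ca-certificates.crt, a concatenated single-file list of certificates.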

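It reads the file /etc/ca-certificates.conf. Each line gives the pathname of a CA certificate under /usr/share/ca-certificates that should be trusted. Lines that begin with "#" are comment lines and are therefore ignored. Lines that begin with "!" are deselected, which deactivates the CA certificate in question. Certificates must have a .crt extension in order to be included by update-ca-certificates.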

Furthermore, all certificates with a .crt extension found below /usr/local/share/ca-certificates are also included as implicitly trusted.
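
The selection rules quoted above can be sketched in a few lines of PHP to see which entries of a configuration file would end up trusted. This mirrors only the wording of the man page excerpt, not the actual update-ca-certificates script, and the function name selectedCertificates is hypothetical.

```php
<?php
// Hypothetical helper: apply the rules from the man page excerpt to the
// contents of a ca-certificates.conf style file.
function selectedCertificates(string $confContents): array
{
    $selected = [];
    foreach (preg_split('/\R/', $confContents) as $line) {
        $line = trim($line);
        if ($line === '' || $line[0] === '#') {
            continue; // blank lines and comments are ignored
        }
        if ($line[0] === '!') {
            continue; // deselected: this CA certificate is deactivated
        }
        if (substr($line, -4) !== '.crt') {
            continue; // only .crt files are included
        }
        $selected[] = $line;
    }
    return $selected;
}

$conf = '/etc/ca-certificates.conf';
if (is_readable($conf)) {
    printf("%d certificates selected in %s\n",
        count(selectedCertificates((string) file_get_contents($conf))), $conf);
}
```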