Curl: transfer closed with outstanding read data remaining

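I'm running into a problem with a large curl call.

I get

  • nread <= 0, server closed connection, bailing
  • transfer closed with outstanding read data remaining

and the content is only partially delivered.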

    GET /stats/?stats_breakdown=track__track&campaign=&search_criteria=2&period=0&date_month=11&date_day=03&date_year=2015&start_date_month=11&start_date_day=03&start_date_year=2015&end_date_month=12&end_date_day=31&end_date_year=2014 HTTP/1.1
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13
Host: domain.com
Accept: */*
Cookie: sessionid=xxg4gglsm7o3b224wihqz8od19wl31h1; csrftoken=JBpLxNtgAVvDEw2wNqvBnRmzDJIjxL6C
Cache-Control: no-cache
Connection: Keep-Alive
Keep-Alive: 600
Accept-Language: en-us
X-CSRFToken: SeN9bHryRK8FWLTLJIs5c6u9AZ47a8pR
Content-Type: application/x-www-form-urlencoded
Origin: https://domain.com
Referer: https://domain.com
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* HTTP 1.1 or later with persistent connection, pipelining supported
< HTTP/1.1 200 OK
< Server: nginx/1.8.0
< Date: Wed, 04 Nov 2015 12:54:05 GMT
< Content-Type: text/html; charset=utf-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< Vary: Accept-Encoding
< Vary: Cookie, Accept-Language
< P3P: CP="ALL IND DSP COR ADM CONo CUR CUSo IVAo IVDo PSA PSD TAI TELo OUR SAMo CNT COM INT NAV ONL PHY PRE PUR UNI"
< Content-Language: en
* Replaced cookie csrftoken="JBpLxNtgAVvDEw2wNqvBnRmzDJIjxL6C" for domain domain.com, path /, expire 1478091245
< Set-Cookie: csrftoken=JBpLxNtgAVvDEw2wNqvBnRmzDJIjxL6C; expires=Wed, 02-Nov-2016 12:54:05 GMT; Max-Age=31449600; Path=/; secure
<
* nread <= 0, server closed connection, bailing
* transfer closed with outstanding read data remaining
* Closing connection #0

This is the PHP configuration I'm using:

function getHeaders()
{
    $headers = array();
    $headers[] = 'Cache-Control: no-cache';
    $headers[] = 'Connection: Keep-Alive';
    $headers[] = 'Keep-Alive: 600';
    $headers[] = 'Accept-Language: en-us';
    $headers[] = 'X-CSRFToken: SeN9bHryRK8FWLTLJIs5c6u9AZ47a8pR';
    $headers[] = 'Content-Type: application/x-www-form-urlencoded';
    $headers[] = 'Origin: https://domain.com';
    $headers[] = 'Referer: https://domain.com';
    return $headers;
}
curl_setopt($connection, CURLOPT_URL, $url);
curl_setopt($connection, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($connection, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($connection, CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($connection, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($connection, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($connection, CURLOPT_COOKIEFILE, 'cookie.txt');
curl_setopt($connection, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($connection, CURLOPT_CONNECTTIMEOUT ,550000000);
curl_setopt($connection, CURLOPT_TIMEOUT, 5500000000); //timeout in seconds
curl_setopt($connection, CURLOPT_HTTPHEADER, getHeaders());
curl_setopt($connection, CURLOPT_VERBOSE, 1);

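I ran into a similar problem: my server sits behind nginx. When curl connected to the server directly, it received the response, but when curl connected to the server through nginx, it threw this error:

* transfer closed with outstanding read data remaining
* Closing connection 0
curl: (18) transfer closed with outstanding read data remaining

When I opened the same nginx URL in a browser, the response displayed fine. Oddly, when I tried the same nginx URL with curl, it threw the error above.

After comparing the headers sent by the browser and by curl, I found that the browser was able to receive the response because of the following header, which curl was not sending: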

'Accept-Encoding: gzip'

Sending the header above with curl made it work. What the header does is have the response compressed with gzip, which reduces the response size.
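For anyone doing this from PHP, here is a minimal sketch of the same idea, assuming the curl extension is built with zlib support. Setting CURLOPT_ENCODING makes libcurl send the Accept-Encoding header and decompress the body itself. The helper name compressedOptions() is made up for illustration.

```php
<?php
// Sketch: curl options that request a gzip-compressed response.
// compressedOptions() is a hypothetical helper, not part of PHP.
function compressedOptions(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        // Sends "Accept-Encoding: gzip" and transparently decodes the body.
        CURLOPT_ENCODING       => 'gzip',
    ];
}
```

It would be applied with curl_setopt_array($connection, compressedOptions($url)) before calling curl_exec($connection).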

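After digging further, it turned out that nginx could not send any payload larger than 80 KB. After wasting a lot of time, I found that the problem was nginx buffering. After adding the following proxy_buffering setting to nginx.conf, nginx worked like a charm: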

location / {
    proxy_buffering off;
}

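The accepted answer did not solve my problem. I'm writing this answer so that anyone facing the same issue doesn't have to waste time on it.

OK, after some searching and IRC chat I found a solution, though I'm not 100% sure what the cause is. It looks like the keep-alives weren't sending enough to keep the connection open. I'll post the solution here in the hope that it helps someone else.

What helped me was adding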

--keepalive-time 2

Explanation from the curl options:

--keepalive-time <seconds>

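This option sets the time a connection needs to remain idle before sending keepalive probes and the time between individual keepalive probes. It is currently effective on operating systems offering the TCP_KEEPIDLE and TCP_KEEPINTVL socket options (meaning Linux, recent AIX, HP-UX and more). This option has no effect if --no-keepalive is used. (Added in 7.18.0)

If this option is used multiple times, the last one will be used. If unspecified, the option defaults to 60 seconds.

The default seems to be too high to keep my connection open.

This is the full command I used for the call: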

curl URL -H 'Accept-Encoding: gzip, deflate, sdch' -H 'Accept-Language: en-US,en;q=0.8,et;q=0.6,nl;q=0.4' -H 'Upgrade-Insecure-Requests: 1' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' -H 'Connection: keep-alive' --compressed -v --keepalive-time 2

I'm running this version of curl on OS X:

curl 7.43.0 (x86_64-apple-darwin15.0) libcurl/7.43.0 SecureTransport zlib/1.2.5
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz UnixSockets

If anyone wants to use this option in PHP curl, the equivalent of --keepalive-time is available as of PHP 5.5. You can use it as follows:

curl_setopt($connection, CURLOPT_TCP_KEEPALIVE, 1);
curl_setopt($connection, CURLOPT_TCP_KEEPIDLE, 2);
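Since the option text quoted above says --keepalive-time covers both the idle time and the interval between probes, a closer PHP equivalent probably sets CURLOPT_TCP_KEEPINTVL as well. The following is only a sketch: applyKeepalive() is a hypothetical helper name, and the defined() guard is there because these constants are missing on PHP older than 5.5.

```php
<?php
// Sketch: mirror `curl --keepalive-time N` on a PHP curl handle.
// applyKeepalive() is a hypothetical helper, not part of PHP.
function applyKeepalive($ch, int $seconds = 2): bool
{
    // The TCP keepalive constants are not available before PHP 5.5.
    if (!defined('CURLOPT_TCP_KEEPALIVE')) {
        return false;
    }
    return curl_setopt($ch, CURLOPT_TCP_KEEPALIVE, 1)          // enable probes
        && curl_setopt($ch, CURLOPT_TCP_KEEPIDLE, $seconds)    // idle time before the first probe
        && curl_setopt($ch, CURLOPT_TCP_KEEPINTVL, $seconds);  // time between probes
}
```

Calling applyKeepalive($connection, 2) before curl_exec($connection) would then correspond to the command-line value used above.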

Hope this helps anyone struggling with the same problem!

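In my case, I had to increase the PHP memory limit setting in php.ini, because the response size was above the memory limit, so the transfer was cut off during memory allocation.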
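If raising the limit is not an option, one alternative (not what I did above, just a sketch) is to let curl write the body straight to a file with CURLOPT_FILE, so the whole response never has to fit into PHP memory as one string. The helper name downloadToFile() and its parameters are assumptions for illustration.

```php
<?php
// Sketch: stream a response body to disk instead of returning it as a string.
// downloadToFile() is a hypothetical helper, not part of PHP.
function downloadToFile(string $url, string $path): bool
{
    $fp = fopen($path, 'wb');
    if ($fp === false) {
        return false; // could not open the target file
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp); // body is written to $fp as it arrives
    $ok = curl_exec($ch);                // returns true on success when CURLOPT_FILE is set
    unset($ch);                          // release the handle before closing the file
    fclose($fp);
    return $ok === true;
}
```

The file can then be processed in pieces with fopen() and fgets() rather than loaded at once.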

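libcurl is only telling you that the server cut off the connection in an unclean way, since it did not deliver the data it promised to. It looks like the chunked encoding never signalled the end of the transfer.

Browsers are notoriously liberal in what they accept, so they ignore all sorts of protocol violations and cope with them to a much larger extent than libcurl does.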
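To make "signalling the end of the transfer" concrete: a chunked body is a series of hex-sized chunks and must finish with a zero-length chunk followed by an empty line. When the connection closes before that last chunk arrives, libcurl reports error 18 (CURLE_PARTIAL_FILE). Below is a rough sketch of that check in plain PHP. It is not how libcurl implements it, and isCompleteChunkedBody() is a made-up name.

```php
<?php
// Sketch: true only if $raw is a complete chunked body, i.e. every chunk is
// fully present and the terminating zero-length chunk was received.
function isCompleteChunkedBody(string $raw): bool
{
    $pos = 0;
    $len = strlen($raw);
    while (true) {
        $eol = strpos($raw, "\r\n", $pos);
        if ($eol === false) {
            return false; // the next chunk-size line never arrived
        }
        // The size is hexadecimal; anything after ';' is a chunk extension.
        $sizeField = trim(explode(';', substr($raw, $pos, $eol - $pos), 2)[0]);
        if ($sizeField === '' || !ctype_xdigit($sizeField)) {
            return false;
        }
        $size = hexdec($sizeField);
        $pos = $eol + 2;
        if ($size === 0) {
            // Last chunk: optional trailer lines, then an empty line.
            return substr($raw, $pos, 2) === "\r\n"
                || strpos($raw, "\r\n\r\n", $pos) !== false;
        }
        if ($pos + $size + 2 > $len || substr($raw, $pos + $size, 2) !== "\r\n") {
            return false; // chunk data or its trailing CRLF is missing
        }
        $pos += $size + 2;
    }
}
```

A body such as "5\r\nhello\r\n" with no final "0\r\n\r\n" is the situation described here: every byte received is valid, but the end marker is missing.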