Can PHP cURL retrieve response headers AND body in a single request?

Is there any way to get both the headers and the body of a cURL request using PHP? I found that this option:

curl_setopt($ch, CURLOPT_HEADER, true);

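returns the body plus the headers, but then I need to parse the result to get the body. Is there any way to get both in a more usable (and more secure) way?

Note that by "single request" I mean avoiding a HEAD request before the GET/POST.

One solution was posted in the PHP documentation comments: http://www.php.net/manual/en/function.curl-exec.php#80442

Code example: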

$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
// ...
$response = curl_exec($ch);
// Then, after your curl_exec call:
$header_size = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
$header = substr($response, 0, $header_size);
$body = substr($response, $header_size);

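Warning: as noted in the comments below, this may not be reliable when used with proxy servers or when handling certain types of redirects. @Geoffrey's answer may handle these cases more reliably.

Many of the other solutions offered in this thread do not do this correctly.

  • Splitting on \r\n\r\n is not reliable when CURLOPT_FOLLOWLOCATION is on or when the server responds with a 100 code (RFC 7231, MDN).
  • Not all servers are standards compliant; some transmit just \n for new lines (and a recipient may discard the \r in the line terminator) (Q&A).
  • Detecting the size of the headers via CURLINFO_HEADER_SIZE is also not always reliable, especially when proxies are used (Curl-1204) or in some of the same redirect scenarios.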
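
To make the first point concrete, here is a minimal sketch using a hand-written response string (an assumption for illustration, not captured from a real server). It shows what a naive split on the first blank line produces when an interim 100 Continue block precedes the real response:

```php
<?php
// Simulated raw response (assumption): an interim "100 Continue" block
// followed by the final response, as curl returns it with CURLOPT_HEADER on.
$raw = "HTTP/1.1 100 Continue\r\n\r\n"
     . "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\n"
     . "hello";

// Naive split on the first blank line.
list($naiveHeaders, $naiveBody) = explode("\r\n\r\n", $raw, 2);

// $naiveHeaders holds only the interim status line;
// the real headers end up at the start of $naiveBody.
echo $naiveHeaders, "\n";
var_dump(strpos($naiveBody, 'HTTP/1.1 200 OK') === 0);
```

The same thing happens with redirects when CURLOPT_FOLLOWLOCATION is on, since each hop contributes its own header block.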

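The most correct method is to use CURLOPT_HEADERFUNCTION.

Here is a very clean way of doing this using a PHP closure. It also converts all header names to lowercase for consistent handling across servers and HTTP versions.

This version retains duplicated headers.

This complies with RFC822 and RFC2616. Please do not use the mb_ (and similar) string functions here; doing so is not only incorrect but even a security issue (RFC-7230).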

$ch = curl_init();
$headers = [];
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// this function is called by curl for each header received
curl_setopt($ch, CURLOPT_HEADERFUNCTION,
  function($curl, $header) use (&$headers)
  {
    $len = strlen($header);
    $header = explode(':', $header, 2);
    if (count($header) < 2) // ignore invalid headers
      return $len;
    $headers[strtolower(trim($header[0]))][] = trim($header[1]);
    
    return $len;
  }
);
$data = curl_exec($ch);
print_r($headers);
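
For reading the result, here is a rough sketch of what the $headers array looks like and how to pull values out of it. The helper name collect_header_line and the sample header lines are my own assumptions; the helper just mirrors the closure body above so it can be run without a network call:

```php
<?php
// Hypothetical helper that mirrors the closure body above.
function collect_header_line(array &$headers, string $line): int
{
    $len = strlen($line);
    $parts = explode(':', $line, 2);
    if (count($parts) < 2) { // status line or the blank separator line
        return $len;
    }
    $headers[strtolower(trim($parts[0]))][] = trim($parts[1]);
    return $len;
}

// Simulated header lines (assumption), in the form curl passes them to the callback.
$lines = [
    "HTTP/1.1 200 OK\r\n",
    "Content-Type: text/html\r\n",
    "Set-Cookie: a=1\r\n",
    "Set-Cookie: b=2\r\n",
    "\r\n",
];

$headers = [];
foreach ($lines as $line) {
    collect_header_line($headers, $line);
}

// Names are lowercased and every value is a list, so duplicates survive.
$contentType = $headers['content-type'][0] ?? null;
$cookies = $headers['set-cookie'] ?? [];
print_r($headers);
```

Because every value is a list, single-valued headers are read with index 0, while repeated headers such as Set-Cookie keep all of their values.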

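Curl has a built-in option for this, called CURLOPT_HEADERFUNCTION. The value of this option must be the name of a callback function. Curl passes the headers (and only the headers!) to this callback line by line, so the function is called once per header line, starting from the top of the header section. Your callback can then do anything with the line, and it must return the number of bytes in the line it was given. Here is tested, working code: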

function HandleHeaderLine( $curl, $header_line ) {
    echo "<br>YEAH: ".$header_line; // or do whatever
    return strlen($header_line);
}

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.google.com");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADERFUNCTION, "HandleHeaderLine");
$body = curl_exec($ch); 

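The above works with everything, including different protocols and proxies, and you don't need to worry about the header size or set lots of different curl options.

P.S.: To handle the header lines with an object method, do this: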

curl_setopt($ch, CURLOPT_HEADERFUNCTION, array($object, 'methodName'));
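
A slightly fuller sketch of the object-method variant, in case it helps. HeaderCollector and onHeader are hypothetical names chosen for this example, and the callback is invoked by hand here instead of by curl so the snippet runs without a network connection:

```php
<?php
// HeaderCollector is a hypothetical class used only for this sketch.
class HeaderCollector
{
    public $lines = [];

    // Same signature curl uses for CURLOPT_HEADERFUNCTION callbacks:
    // the handle and one header line; must return the number of bytes handled.
    public function onHeader($curl, $headerLine)
    {
        $trimmed = trim($headerLine);
        if ($trimmed !== '') {
            $this->lines[] = $trimmed;
        }
        return strlen($headerLine);
    }
}

$collector = new HeaderCollector();
$callback = array($collector, 'onHeader');

// With a real handle you would register it like this:
// curl_setopt($ch, CURLOPT_HEADERFUNCTION, $callback);

// Invoked manually to imitate a single call from curl.
$returned = call_user_func($callback, null, "Content-Type: text/html\r\n");
print_r($collector->lines);
```

The return value has to be the length of the untrimmed line; returning anything else makes curl treat the transfer as failed.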

Is this what you are looking for?

curl_setopt($ch, CURLOPT_HTTPHEADER, array('Expect:'));
$response = curl_exec($ch); 
list($header, $body) = explode("\r\n\r\n", $response, 2);

If you specifically want the Content-Type, there is a special cURL option to retrieve it:

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
$content_type = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);

Just set these options:

  • CURLOPT_HEADER, 0

  • CURLOPT_RETURNTRANSFER, 1

and use curl_getinfo with CURLINFO_HTTP_CODE (or with no opt parameter, in which case you get an associative array containing all the information you want).

More: http://php.net/manual/fr/function.curl-getinfo.php

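curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
$response = curl_exec($ch);
// keep only the last header block, skipping e.g. "HTTP/1.1 100 Continue"
$parts = explode("\r\n\r\nHTTP/", $response);
$parts = (count($parts) > 1 ? 'HTTP/' : '').array_pop($parts);
list($headers, $body) = explode("\r\n\r\n", $parts, 2);

Works with HTTP/1.1 100 Continue before the other headers.

If you need to work with buggy servers that send only LF instead of CRLF as line breaks, you can use preg_split as follows: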

curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
$response = curl_exec($ch);
// no "u" modifier: with it, preg_split() returns false on bodies that are not valid UTF-8
$parts = preg_split("@\r?\n\r?\nHTTP/@", $response);
$parts = (count($parts) > 1 ? 'HTTP/' : '').array_pop($parts);
list($headers, $body) = preg_split("@\r?\n\r?\n@", $parts, 2);
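
As a quick check of the preg_split approach, here is a small sketch that runs the same kind of pattern against a hand-written LF-only response (the response string is an assumption made up for the example):

```php
<?php
// Simulated response from a server that uses bare LF line endings (assumption),
// including an interim "100 Continue" block.
$response = "HTTP/1.1 100 Continue\n\n"
          . "HTTP/1.1 200 OK\nContent-Type: text/plain\n\n"
          . "body text";

// Drop everything before the last header block.
$parts = preg_split("@\r?\n\r?\nHTTP/@", $response);
$last = (count($parts) > 1 ? 'HTTP/' : '') . array_pop($parts);

// Split that block into headers and body at the first blank line.
list($headers, $body) = preg_split("@\r?\n\r?\n@", $last, 2);

echo $headers, "\n---\n", $body, "\n";
```

Note that this still has the weakness mentioned elsewhere in the thread: a body that itself contains a blank line followed by "HTTP/" would be split in the wrong place.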

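My way is:

$response = curl_exec($ch);
$x = explode("\r\n\r\n", $response, 3);
$header = http_parse_headers($x[0]); // http_parse_headers() comes from the pecl_http extension, not core PHP
if ($header['Response Code'] == 100) { // use the other "header"
    $header = http_parse_headers($x[1]);
    $body = $x[2];
} else {
    // re-join in case the body itself contained a blank line
    $body = isset($x[2]) ? $x[1]."\r\n\r\n".$x[2] : $x[1];
}

If needed, apply a for loop and remove the explode limit.

Here is my contribution to the debate... This returns a single array with the data separated and the headers listed. It works on the basis that cURL returns a headers chunk, a blank line, and then the data.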

curl_setopt($ch, CURLOPT_HEADER, 1); // we need this to get headers back
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_VERBOSE, true);
// $output contains the output string
$output = curl_exec($ch);
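$lines = explode("\n",$output);
$out = array();
$headers = true;
foreach ($lines as $l){
    $l = trim($l);
    if ($headers && !empty($l)){
        if (strpos($l,'HTTP') === 0){ // status line
            $p = explode(' ',$l);
            $out['Headers']['Status'] = trim($p[1]);
        } else {
            $p = explode(':',$l,2); // limit 2 so values containing ':' stay intact
            $out['Headers'][$p[0]] = trim($p[1]);
        }
    } elseif (!empty($l)) {
        // append instead of overwrite, so multi-line bodies are kept
        // (lines are trimmed and blank lines are dropped)
        $out['Data'] = (isset($out['Data']) ? $out['Data']."\n" : '').$l;
    }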
    if (empty($l)){
        $headers = false;
    }
}

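The problem with many answers here is that "\r\n\r\n" can legitimately appear in the body of the HTML, so you can't be sure that you are splitting the headers correctly.

It seems that the only way to store the headers separately with a single call to curl_exec is to use a callback, as suggested above in https://stackoverflow.com/a/25118032/3326494

Then, to (reliably) get just the body of the response, you would need to pass the value of the Content-Length header to substr() as a negative start value.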
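
A minimal sketch of that idea, using made-up values in place of a real transfer. Both $response (what curl_exec would return with CURLOPT_HEADER on) and $headers (what a CURLOPT_HEADERFUNCTION callback would have collected) are assumptions constructed for the example:

```php
<?php
// A body that itself contains a blank line, to show why splitting is risky.
$body = "line one\r\n\r\nline two";

// Simulated curl_exec() result with CURLOPT_HEADER enabled (assumption).
$response = "HTTP/1.1 200 OK\r\nContent-Length: " . strlen($body) . "\r\n\r\n" . $body;

// Simulated output of a header callback: lowercased names, list values (assumption).
$headers = ['content-length' => [(string) strlen($body)]];

$length = (int) $headers['content-length'][0];

// substr() with a negative start takes the last $length bytes.
// Guard against 0, because substr($s, -0) returns the whole string.
$extracted = $length > 0 ? substr($response, -$length) : '';

var_dump($extracted === $body);
```

This only holds when the server actually sends Content-Length and the body arrives unmodified. With chunked transfer encoding there is no Content-Length header, and if curl decompresses the response (CURLOPT_ENCODING) the decoded body will not match the advertised length.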

Just in case you can't or don't use CURLOPT_HEADERFUNCTION or the other solutions:

$nextCheck = function($body) {
    return ($body && strpos($body, 'HTTP/') === 0);
};
[$headers, $body] = explode("\r\n\r\n", $result, 2);
if ($nextCheck($body)) {
    do {
        [$headers, $body] = explode("\r\n\r\n", $body, 2);
    } while ($nextCheck($body));
}

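A better way is to use the verbose cURL output, which can be piped to a temporary stream. You can then search that output for the header name. This could probably use a few tweaks, but it works for me: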

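class genericCURL {
    // verbose curl log captured by request(); declared so it is not a dynamic property
    public $curl_log = '';
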
    /**
     * NB this is designed for getting data, or for posting JSON data
     */
    public function request($url, $method = 'GET', $data = array()) {
        $ch = curl_init();
        
        if($method == 'POST') {
            
            curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
            curl_setopt($ch, CURLOPT_POSTFIELDS, $string = json_encode($data));
            
        }
        
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_VERBOSE, true);
        
        //open a temporary stream to output the curl log, which would normally go to STDERR
        $err = fopen("php://temp", "w+");
        curl_setopt($ch, CURLOPT_STDERR, $err);
        
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        $server_output = curl_exec ($ch);
        
        //rewind the temp stream and put it into a string   
        rewind($err);
        $this->curl_log = stream_get_contents($err);
        
        curl_close($ch);
        fclose($err);
    
        return $server_output;
        
    }
    
    /**
     * use the curl log to get a header value
     */
    public function getReturnHeaderValue($header) {
        $log = explode("\n", str_replace("\r\n", "\n", $this->curl_log));
        foreach($log as $line) {
            //is the requested header there
            if(stripos($line, '< ' . $header . ':') !== false) {
                $value = trim(substr($line, strlen($header) + 3));
                return $value;
            }
        }
        //still here implies not found so return false
        return false;
        
    }
}

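An improvement on Geoffrey's answer:

I was not able to get the right header length with $headerSize = curl_getinfo($this->curlHandler, CURLINFO_HEADER_SIZE); so I had to calculate the header size myself.

In addition, there are some changes for better readability.

        $headerSize = 0;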
        $headers['status'] = '';
        curl_setopt_array($this->curlHandler, [
            CURLOPT_URL => $yourURL,
            CURLOPT_POST => 0,
            CURLOPT_HEADER => 1,
            // this function is called by curl for each header received
            // source: https://stackoverflow.com/a/41135574/8398149 and improved
            CURLOPT_HEADERFUNCTION =>
                function ($curl, $header) use (&$headers, &$headerSize) {
                    $lengthCurrentLine = strlen($header);
                    $headerSize += $lengthCurrentLine;
                    $header = explode(':', $header, 2);
                    if (count($header) > 1) { // store only valid headers
                        $headers[strtolower(trim($header[0]))] = trim($header[1]);
                    } elseif (substr($header[0], 0, 8) === 'HTTP/1.1') {
                        // get status code
                        $headers['status'] = intval(substr($header[0], 9, 3));
                    }
                    return $lengthCurrentLine;
                },
        ]);
        $fullResult = curl_exec($this->curlHandler);

Return the response headers through a reference parameter:

<?php
$data=array('device_token'=>'5641c5b10751c49c07ceb4',
            'content'=>'测试测试test'
           );
$rtn=curl_to_host('POST', 'http://test.com/send_by_device_token', array(), $data, $resp_headers);
echo $rtn;
var_export($resp_headers);
function curl_to_host($method, $url, $headers, $data, &$resp_headers)
         {$ch=curl_init($url);
          curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $GLOBALS['POST_TO_HOST.LINE_TIMEOUT']?$GLOBALS['POST_TO_HOST.LINE_TIMEOUT']:5);
          curl_setopt($ch, CURLOPT_TIMEOUT, $GLOBALS['POST_TO_HOST.TOTAL_TIMEOUT']?$GLOBALS['POST_TO_HOST.TOTAL_TIMEOUT']:20);
          curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
          curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
          curl_setopt($ch, CURLOPT_HEADER, 1);
          if ($method=='POST')
             {curl_setopt($ch, CURLOPT_POST, true);
              curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($data));
             }
          foreach ($headers as $k=>$v)
                  {$headers[$k]=str_replace(' ', '-', ucwords(strtolower(str_replace('_', ' ', $k)))).': '.$v;
                  }
          curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
          $rtn=curl_exec($ch);
          curl_close($ch);
          $rtn=explode("\r\n\r\nHTTP/", $rtn, 2);    //to deal with "HTTP/1.1 100 Continue\r\n\r\nHTTP/1.1 200 OK...\r\n\r\n..." header
          $rtn=(count($rtn)>1 ? 'HTTP/' : '').array_pop($rtn);
          list($str_resp_headers, $rtn)=explode("\r\n\r\n", $rtn, 2);
          $str_resp_headers=explode("\r\n", $str_resp_headers);
          array_shift($str_resp_headers);    //get rid of "HTTP/1.1 200 OK"
          $resp_headers=array();
          foreach ($str_resp_headers as $k=>$v)
                  {$v=explode(': ', $v, 2);
                   $resp_headers[$v[0]]=$v[1];
                  }
          return $rtn;
         }
?>

If you are using GET, try this:

$curl = curl_init($url);
curl_setopt_array($curl, array(
    CURLOPT_URL => $url,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_ENCODING => "",
    CURLOPT_MAXREDIRS => 10,
    CURLOPT_TIMEOUT => 30,
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_CUSTOMREQUEST => "GET",
    CURLOPT_HTTPHEADER => array(
        "Cache-Control: no-cache"
    ),
));
$response = curl_exec($curl);
curl_close($curl);

If you don't really need to use curl:

$body = file_get_contents('http://example.com');
var_export($http_response_header);
var_export($body);

Which outputs

array (
  0 => 'HTTP/1.0 200 OK',
  1 => 'Accept-Ranges: bytes',
  2 => 'Cache-Control: max-age=604800',
  3 => 'Content-Type: text/html',
  4 => 'Date: Tue, 24 Feb 2015 20:37:13 GMT',
  5 => 'Etag: "359670651"',
  6 => 'Expires: Tue, 03 Mar 2015 20:37:13 GMT',
  7 => 'Last-Modified: Fri, 09 Aug 2013 23:54:35 GMT',
  8 => 'Server: ECS (cpm/F9D5)',
  9 => 'X-Cache: HIT',
  10 => 'x-ec-custom-error: 1',
  11 => 'Content-Length: 1270',
  12 => 'Connection: close',
)'<!doctype html>
<html>
<head>
    <title>Example Domain</title>...

See http://php.net/manual/en/reserved.variables.httpresponseheader.php
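
Since $http_response_header is just a flat list of raw lines, you will probably want to turn it into something keyed by name. Here is a rough sketch; parse_header_lines is a hypothetical helper of my own, and the sample lines are copied from the output above so the snippet runs without a network call:

```php
<?php
// Hypothetical helper: turns a list of raw header lines into
// [status line, map of lowercased header name => list of values].
function parse_header_lines(array $lines): array
{
    $status = null;
    $headers = [];
    foreach ($lines as $line) {
        if (strpos($line, 'HTTP/') === 0) {
            // A status line starts a new response block (e.g. after a redirect),
            // so only the headers of the last response are kept.
            $status = $line;
            $headers = [];
            continue;
        }
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            $headers[strtolower(trim($parts[0]))][] = trim($parts[1]);
        }
    }
    return [$status, $headers];
}

// Sample lines taken from the output shown above.
$sample = [
    'HTTP/1.0 200 OK',
    'Content-Type: text/html',
    'Content-Length: 1270',
];

list($status, $headers) = parse_header_lines($sample);
echo $status, "\n";
echo $headers['content-type'][0], "\n";
```

In real use you would pass $http_response_header straight after the file_get_contents() call instead of $sample.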