Why can I not download files from some sites like this?

This is my PHP source code:

<?php
    $path = '/images/one.jpg';
    $imm = 'http://www.allamoda.eu/wp-content/uploads/2012/05/calzedonia_290x435.jpg';
    if( $content = file_get_contents($imm) ){   
    file_put_contents($path, $content);   
    echo "Yes";
    }else{
    echo "No";
    }
?>

I get this error:

Warning: file_get_contents(http://www.allamoda.eu/wp-content/uploads/2012/05/calzedonia_290x435.jpg) [function.file-get-contents]: failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden in /opt/lampp/htdocs/test/down.php on line 4
No

Why?

The server requires some request headers (notably Accept and User-Agent). Use the stream context parameter of file_get_contents() to supply them:

<?php
$path = '/images/one.jpg';
$opts = array(
  'http'=>array(
    'method'=>"GET",
    'header'=>"Accept-language: en\r\n" .
              "Accept: image/png,image/*;q=0.8,*/*;q=0.5\r\n" .
              "Host: www.allamoda.eu\r\n" .
              "User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:12.0) Gecko/20100101 Firefox/12.0\r\n"
  )
);
$context = stream_context_create($opts);
$imm = 'http://www.allamoda.eu/wp-content/uploads/2012/05/calzedonia_290x435.jpg';
if ($content = file_get_contents($imm, false, $context)) {
    file_put_contents($path, $content);
    echo "Yes";
} else {
    echo "No";
}
?>
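
Each header line has to end with a real CRLF, which in PHP means "\r\n" inside double quotes. As a minimal sketch, the header string can be assembled from an array so the line endings are added in one place. The helper name buildHeaderBlock is hypothetical, not a PHP function:

```php
<?php
// Hypothetical helper: turn array('Name' => 'value') into the CRLF-terminated
// string expected by the 'header' option of the http stream context.
function buildHeaderBlock(array $headers): string
{
    $block = '';
    foreach ($headers as $name => $value) {
        // Double quotes are required so that \r\n is a real CR LF pair.
        $block .= $name . ': ' . $value . "\r\n";
    }
    return $block;
}

$opts = array(
    'http' => array(
        'method' => 'GET',
        'header' => buildHeaderBlock(array(
            'Accept'     => 'image/png,image/*;q=0.8,*/*;q=0.5',
            'User-Agent' => 'Mozilla/5.0 (Windows NT 5.1; rv:12.0) Gecko/20100101 Firefox/12.0',
        )),
    ),
);
```

The resulting $opts array can be passed to stream_context_create() exactly as in the answer above.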

The server allamoda.eu is telling you (HTTP 403) that you are not allowed to download this file.

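There is nothing wrong with the code. The server simply does not allow the request: either you have sent it too many requests, or it blocks all scripts that try to scrape it.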
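
To see what the server actually answered, one option is to read the status line that the http wrapper stores in $http_response_header after file_get_contents() runs. The sketch below only shows the parsing step. The helper name httpStatusFromHeaders is hypothetical, and the sample header lines are illustrative, with the status line modeled on the warning in the question:

```php
<?php
// Hypothetical helper: return the last HTTP status code found in a list of
// response header lines (redirects produce more than one status line).
function httpStatusFromHeaders(array $headers): ?int
{
    $status = null;
    foreach ($headers as $line) {
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

// Sample header lines for illustration only.
$sample = array('HTTP/1.1 403 Forbidden', 'Content-Type: text/html');
echo httpStatusFromHeaders($sample), "\n"; // prints 403
```

In a real script the array passed in would be $http_response_header, read right after the file_get_contents() call.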

You cannot open the file directly, but you can try fetching its content over a socket:

function getRemoteFile($url)
{
   // get the host name and url path
   $parsedUrl = parse_url($url);
   $host = $parsedUrl['host'];
   if (isset($parsedUrl['path'])) {
      $path = $parsedUrl['path'];
   } else {
      // the url is pointing to the host like http://www.mysite.com
      $path = '/';
   }
   if (isset($parsedUrl['query'])) {
      $path .= '?' . $parsedUrl['query'];
   } 
   if (isset($parsedUrl['port'])) {
      $port = $parsedUrl['port'];
   } else {
      // most sites use port 80
      $port = 80;
   }
   $timeout = 10;
   $response = '';
   // connect to the remote server (use the port parsed from the URL)
   $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
   if( !$fp ) { 
      echo "Cannot retrieve $url";
   } else {
      // send the necessary headers to get the file;
      // "Connection: close" makes the server end the stream after the response,
      // so the read loop below terminates instead of waiting on a kept-alive socket
      fputs($fp, "GET $path HTTP/1.0\r\n" .
                 "Host: $host\r\n" .
                 "User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3\r\n" .
                 "Accept: */*\r\n" .
                 "Accept-Language: en-us,en;q=0.5\r\n" .
                 "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n" .
                 "Connection: close\r\n" .
                 "Referer: http://$host\r\n\r\n");
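      // retrieve the response from the remote server
      // (compare explicitly so a chunk equal to "0" does not end the loop early)
      while (($chunk = fread($fp, 4096)) !== false && $chunk !== '') {
         $response .= $chunk;
      }
      fclose($fp);
      // strip the headers: the body starts after the first blank line
      $pos      = strpos($response, "\r\n\r\n");
      $response = ($pos === false) ? '' : substr($response, $pos + 4);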
   }
   // return the file content 
   return $response;
}

Example:

$content = getRemoteFile('http://www.allamoda.eu/wp-content/uploads/2012/05/calzedonia_290x435.jpg');
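
Note that getRemoteFile() returns whatever follows the headers, so if the server still answers 403 the error page would be saved in place of the image. A minimal sketch of checking the status before keeping the body is shown below. The helper name splitHttpResponse and the sample response are made up for illustration:

```php
<?php
// Hypothetical helper: split a raw HTTP response into status code,
// header block and body.
function splitHttpResponse(string $raw): array
{
    $pos = strpos($raw, "\r\n\r\n");
    if ($pos === false) {
        // No blank line found: treat everything as headers.
        return array('status' => null, 'headers' => $raw, 'body' => '');
    }
    $headers = substr($raw, 0, $pos);
    $body    = (string) substr($raw, $pos + 4);
    $status  = null;
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $headers, $m)) {
        $status = (int) $m[1];
    }
    return array('status' => $status, 'headers' => $headers, 'body' => $body);
}

// Sample raw response, invented for illustration.
$parts = splitHttpResponse("HTTP/1.0 403 Forbidden\r\nContent-Type: text/html\r\n\r\nDenied");
if ($parts['status'] === 200) {
    echo "OK, body can be saved\n";
} else {
    echo "Server answered ", $parts['status'], "\n"; // prints "Server answered 403"
}
```

To use this with the function above, the raw response would have to be passed on before the headers are stripped.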