PHP curl request on remote images taking forever, what can I do to improve my code?

Here is my code. I can't figure out what is causing the delay.

Loading the remote URL on its own takes no more than a second. Should I be passing a user agent?

Forgive me if this is a silly question, I'm new to PHP. Is it worth setting a timeout on the curl requests?

<?php
$url = $_GET['url'];
if(!filter_var($url, FILTER_VALIDATE_URL)) {
?>
{"errors":1,"message":"The URL was not valid"}
<?php
    die();
}

$p=parse_url($url);
$baseurl = $p['scheme'] . '://' . $p['host'];
$path_parts = pathinfo($url);
$current_dir = $path_parts['dirname'];

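function check_img($file) {
    // getimagesize() has to download the remote file to inspect it;
    // @ suppresses the warning when the URL is unreachable or not an image
    $x = @getimagesize($file);
    if (!$x) {
        return false;
    }
    // A case label written as "a" || "b" || "c" evaluates to true,
    // so compare the MIME type against an explicit list instead
    return in_array($x['mime'], array('image/gif', 'image/jpeg', 'image/png'), true);
}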

function ranger($url){
    $headers = array(
    "Range: bytes=0-605768"
    );
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
    $data = curl_exec($curl);
    curl_close($curl);
    return $data;
}

function file_get_contents_curl($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);

    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
$html = file_get_contents_curl($url);
//parsing begins here:
$doc = new DOMDocument();
@$doc->loadHTML($html);
$nodes = $doc->getElementsByTagName('title');
// Get all image tags
    $imageTags = $doc->getElementsByTagName('img');
    $numImages = $doc->getElementsByTagName('img')->length;
//get and display what you need:
$metas = $doc->getElementsByTagName('meta');
// Default to an empty string so $fb_image is defined even when no og:image tag exists
$fb_image = '';
for ($i = 0; $i < $metas->length; $i++)
{
    $meta = $metas->item($i);
    if ($meta->getAttribute('property') == 'og:image' || $meta->getAttribute('name') == 'og:image') {
        $fb_image = $meta->getAttribute('content');
    }
}

?>
{
    "resource_images": {
        "url" : "<?php echo $url?>",
        "baseurl" : "<?php echo $baseurl?>",
        "fb" : "<?php echo $fb_image?>",
        "images" : [<?php
    $i = 0;
$image_results = array();
$numItems = count($imageTags);
if ($fb_image !== '') {
    $image_results[] = $fb_image;   
}
    foreach($imageTags as $tag) {
            if ($i >= 25) {
                break;
            }
        if (substr($tag->getAttribute('src'),0,4) === 'http') {
            $img = $tag->getAttribute('src');
        } elseif (substr($tag->getAttribute('src'),0,1) === '/') {
            $img = $baseurl . $tag->getAttribute('src');
        } else {
            $img = $current_dir . $tag->getAttribute('src');            
        }
$exists = check_img($img);

if ($exists) {
    $raw = ranger($img);
        $im = imagecreatefromstring($raw);
        $width = imagesx($im);
        $height = imagesy($im);
        if ($width > 300) {
            $image_results[] = str_replace('"', "", $img);
        }
// Count this image toward the limit of 25; the separators are printed by the output loop below
$i++;
     }
}

   $i = 0;
foreach($image_results as $img_url) {
?>
{
                        "url" : "<?php echo str_replace('"', "", $img_url);?>",
                        "count" : <?php echo count($image_results)?>
                    }
<?php   
if(++$i < count($image_results) && $i < 15) {
    echo ",";
  }   
}?>

        ]
    }
}

Use this at the start of the script:

set_time_limit(0);

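Yes, definitely set a timeout on the curl requests, because otherwise they can go on forever. What I would do in this case is pinpoint the code that is taking up so much time, like this: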

<?php 
function microtime_float() {
  list($usec, $sec) = explode(" ", microtime());
  return ((float)$usec + (float)$sec);
}
$time_start = microtime_float(); // put this at the top of your file
// process some code
// ...
// show results; this can go anywhere: inside a function, a loop, etc.
$time_end = microtime_float();
$time = $time_end - $time_start;
echo "Did it in $time seconds\n<br>";
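
The start/stop pattern above can also be wrapped in a small helper so each suspect call gets its own measurement. This is only a sketch: `time_step` is a hypothetical name, and the summing closure is a stand-in for a real step such as one of the curl or `getimagesize()` calls.

```php
<?php
// Hypothetical helper (not from the original code): runs $fn,
// prints how long it took, and returns whatever $fn returned.
function time_step($label, callable $fn) {
    $start = microtime(true);            // current time as float seconds
    $result = $fn();
    $elapsed = microtime(true) - $start; // seconds spent inside $fn
    printf("%s took %.3f seconds<br>\n", $label, $elapsed);
    return $result;
}

// Stand-in workload; in the script this could wrap a single fetch or image check.
$sum = time_step('summing', function () {
    return array_sum(range(1, 1000));
});
```

Wrapping the page fetch, each `getimagesize()` call, and each ranged download separately should show which of them the script is actually waiting on.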

I wouldn't time the whole script; instead, check it piece by piece to find where it is waiting.
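
As for the timeout itself, here is a minimal sketch of what a curl request with limits might look like. The function name `fetch_with_timeout`, the 3 and 10 second limits, and the user agent string are all assumptions chosen for illustration, not values from the question; tune them to the sites being fetched.

```php
<?php
// Hypothetical helper (not from the original code): fetches $url with
// explicit time limits and reports what curl measured for the request.
function fetch_with_timeout($url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);       // stop endless redirect chains
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 3);  // assumed: max seconds to connect
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);        // assumed: max seconds for the whole transfer
    curl_setopt($ch, CURLOPT_USERAGENT, 'ImageScraper/1.0'); // placeholder user agent

    $data = curl_exec($ch);                       // false on failure or timeout
    $info = array(
        'ok'         => $data !== false,
        'error'      => curl_error($ch),          // empty string when there was no error
        'http_code'  => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'total_time' => curl_getinfo($ch, CURLINFO_TOTAL_TIME), // seconds, as measured by curl
    );
    curl_close($ch);
    return array($data, $info);
}
```

A call such as `list($html, $info) = fetch_with_timeout($url);` returns the body together with `$info['total_time']`, so slow hosts can be spotted without separate timing code, and a request that exceeds the limits comes back as `false` instead of hanging.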