Extract image URL from HTML page with PHP

How can I extract the post image from this link using PHP?

I have read that I can't use regex for this.

http://www.huffingtonpost.it/2013/07/03/stupri-piazza-tahrir-durante-proteste-anti-morsi_n_3538921.html?utm_hp_ref=italy

Thanks a lot.

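// $url holds the address of the page to fetch.
$content = file_get_contents($url);
// Capture the src of the <img> tag whose class attribute ends in "pinit".
if (preg_match('/<img[^>]*src="([^"]*)"[^>]*class="[^"]*pinit"[^>]*>/', $content, $matches))
{
    echo "Match was found <br />";
    echo $matches[0];
}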

$matches[0] will print the whole image tag. If you only want the URL, use $matches[1] instead :)
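
To make the difference between $matches[0] and $matches[1] concrete, here is a minimal sketch run against a hard-coded fragment rather than the live page. The HTML string and the example.com URL are made up for illustration, and the pattern is a simplified one that only looks at the src attribute.

```php
<?php
// Hypothetical HTML fragment, used only for illustration.
$html = '<p><img src="http://example.com/photo.jpg" class="image pinit" alt=""></p>';

// Group 1 captures the value of the src attribute.
$pattern = '/<img[^>]*src="([^"]*)"[^>]*>/';

if (preg_match($pattern, $html, $matches)) {
    $fullTag  = $matches[0]; // the whole <img ...> tag
    $imageUrl = $matches[1]; // only the URL captured by group 1
    echo $imageUrl, "\n";
}
```

Index 0 is always the full match; each pair of parentheses in the pattern adds one more entry after it.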

You can (and really should) parse the HTML with DOM. Here is an example for your case:

// Download the page, following redirects.
$curlResource = curl_init('http://www.huffingtonpost.it/2013/07/03/stupri-piazza-tahrir-durante-proteste-anti-morsi_n_3538921.html?utm_hp_ref=italy');
curl_setopt($curlResource, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curlResource, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curlResource, CURLOPT_AUTOREFERER, true);
$page = curl_exec($curlResource);
curl_close($curlResource);
if ($page === false) {
    exit('Download failed');
}

// Parse the HTML; keep libxml quiet about invalid markup.
libxml_use_internal_errors(true);
$domDocument = new DOMDocument();
$domDocument->loadHTML($page);
libxml_clear_errors();

// Select the src attribute of the post image by its id.
$xpath = new DOMXPath($domDocument);
$urlXpath = $xpath->query("//img[@id='img_caption_3538921']/@src");
if ($urlXpath->length > 0) {
    $url = $urlXpath->item(0)->nodeValue;
    echo $url;
}

Take your time and learn some DOM and XPath; it is worth it.
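
As a starting point for experimenting with XPath, here is a small sketch that runs on an inline string instead of a downloaded page. The markup and the example.com URLs are invented for illustration; the class name "pinit" is borrowed from the regex answer above, and matching on it (rather than on an id) is an assumption about what the real page offers.

```php
<?php
// Hypothetical markup standing in for a downloaded page.
$html = <<<HTML
<html><body>
  <img id="logo" src="/static/logo.png">
  <img id="img_caption_1" class="pinit" src="http://example.com/a.jpg">
  <img class="thumb pinit" src="http://example.com/b.jpg">
</body></html>
HTML;

// Real-world HTML is rarely valid, so collect libxml warnings silently.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Select the src attribute of every <img> whose class list contains "pinit".
// The concat/normalize-space idiom matches a whole class token, not a substring.
$query = "//img[contains(concat(' ', normalize-space(@class), ' '), ' pinit ')]/@src";

$urls = [];
foreach ($xpath->query($query) as $attr) {
    $urls[] = $attr->nodeValue;
}

print_r($urls);
```

Changing only the $query string is enough to try other selections, for example "//img/@src" for every image on the page.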

Try this...

$content = file_get_contents($url);
// Match the first src attribute, quoted with either single or double quotes.
if (preg_match('/src=["\'][^"\']+["\']/', $content, $matches))
{
    echo "Match was found <br />";
    echo $matches[0];
}