PHP script to download a webpage's source code and search for a specific string

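I need help writing PHP code that does the following:

  1. Go to a website (www.example.com)
  2. Download its source code into a string variable
  3. Search that string for specific content, for example

    <div class="news" title="news alert">Click to get news alert</div>

Basically, I need to search the source code for title="news alert"

Thanks, everyone.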

You can use PHP DOM:

// Download the page source into a string.
$text = file_get_contents('http://example.com/path/to/file.html');
$doc = new DOMDocument('1.0');
// loadHTML() may emit warnings when the markup is not well-formed.
$doc->loadHTML($text);
foreach ($doc->getElementsByTagName('div') as $div) {
    $class = $div->getAttribute('class');
    // Only look at divs whose class attribute contains "news".
    if (strpos($class, 'news') !== false) {
        if ($div->getAttribute('title') == 'news alert') {
            echo 'title found';
        }
        else {
            echo 'title not found';
        }
    }
}
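
The loop above prints a message for every div it inspects, and loadHTML() tends to emit warnings on the sloppy markup real pages contain. Here is a minimal sketch that wraps the same check in a function returning a boolean. The name hasNewsAlertDiv() is hypothetical (not from the answer), and it assumes the HTML has already been downloaded into a string:

```php
<?php
// Hypothetical helper (not part of the original answer): returns true when
// the HTML contains a <div> whose class list includes the token "news" and
// whose title attribute equals "news alert".
function hasNewsAlertDiv(string $html): bool
{
    if ($html === '') {
        return false; // loadHTML() does not accept an empty string
    }

    // Collect libxml parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return false;
    }

    foreach ($doc->getElementsByTagName('div') as $div) {
        // Split the class attribute on whitespace so "newsletter" does not count as "news".
        $classes = preg_split('/\s+/', trim($div->getAttribute('class')));
        if (in_array('news', $classes, true) && $div->getAttribute('title') === 'news alert') {
            return true;
        }
    }

    return false;
}
```

Calling hasNewsAlertDiv($text) with the $text downloaded above would then give a single true/false answer for the whole page.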

Or you could try QueryPath, which emulates jQuery on the server side:

$text = file_get_contents('http://example.com/path/to/file.html');
// qp() is the QueryPath entry point; the library must be installed and loaded first.
if(qp($text)->find('div.news[title="news alert"]')->is('*')) {
    echo('title found');
}
else {
    echo('title not found');
}

You could use DOMXPath to find it:

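// $cntnt is assumed to already hold the page's HTML source.
$dcmnt = new DOMDocument();
$dcmnt->loadHTML( $cntnt );
$xpath = new DOMXPath( $dcmnt );
// Select every div whose title attribute is exactly "news alert".
$match = $xpath->query("//div[@title='news alert']");
echo $match->length ? "Found" : "Not Found" ;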

Demo: http://codepad.org/CLdE8XCQ
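
The XPath query above matches on the title alone. If the class from the question should be checked too, the predicate can be extended. This is a rough sketch under that assumption; the function name countNewsAlertDivs() is hypothetical, and the HTML is assumed to be in a string already:

```php
<?php
// Hypothetical helper: counts <div> elements that carry the class token
// "news" and the exact attribute title="news alert".
function countNewsAlertDivs(string $html): int
{
    if ($html === '') {
        return 0; // nothing to parse
    }

    // Keep libxml quiet about malformed markup.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // concat() + normalize-space() is the common XPath 1.0 idiom for matching
    // a whole class token rather than a substring of the class attribute.
    $query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' news ')]"
           . "[@title='news alert']";
    $nodes = $xpath->query($query);

    return $nodes === false ? 0 : $nodes->length;
}
```

A result greater than zero corresponds to "Found" in the snippet above.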

It's simple:

$html = file_get_contents('http://site.com/page.html');
if (strpos($html,'title="news alert"')!==false)
 echo 'title found';

$page = file_get_contents('http://www.example.com/');
if (strpos($page, 'title="news alert"') !== false) {
    echo 'title found';
}

$url = 'http://www.example.com/';
$page = file_get_contents($url);
if(strpos($page, 'title="news alert"') !== false || strpos($page, "title='news alert'") !== false)
{
    echo 'website with news alert found';
}
else
{
    echo 'website not found';
}
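
The strpos() checks above assume the download succeeded and that the attribute uses one exact quoting style. Below is a sketch that handles a failed request and accepts either quote style. Both helper names, containsNewsAlertTitle() and pageHasNewsAlert(), are hypothetical and only for illustration:

```php
<?php
// Hypothetical helper: true when the source contains title="news alert" or
// title='news alert', allowing optional whitespace around the equals sign.
// The \1 backreference requires the closing quote to match the opening one.
function containsNewsAlertTitle(string $source): bool
{
    return preg_match('/\btitle\s*=\s*(["\'])news alert\1/', $source) === 1;
}

// Hypothetical helper: downloads $url and runs the check. Returns null when
// the page could not be fetched, so callers can tell "absent" from "unreachable".
function pageHasNewsAlert(string $url): ?bool
{
    $source = @file_get_contents($url); // false on failure; @ hides the warning
    if ($source === false) {
        return null;
    }

    return containsNewsAlertTitle($source);
}
```

Unlike a DOM parser, this is still a plain text match, so it will also fire on the same text inside comments or scripts.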