Crawling to get specific content from another website

Keywords: get, website, other | Last updated: 2024-02-28

I have a search box, and I want to enter a specific value into it. I have the URLs of some websites that contain the data I need. For example:

http://www.zafa.com.pk/tablets.html , http://www.zafa.com.pk/injections.html

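When I click the search button, the script should return only the URLs of the pages whose content matches the search value I entered. Please tell me how to do this. I tried the following code, but it does not work for me.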

Note: I am not searching the entire website, only certain pages of it.

<?php 
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_URL, 'http://www.google.com');
  curl_setopt($ch, CURLOPT_HEADER, 0);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
  curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 300);
  curl_setopt($ch, CURLOPT_TIMEOUT, 300);
  $data = curl_exec($ch);
  file_put_contents("text.txt", $data);
  curl_close($ch);
?>

You can do it as follows:

Note: the code below works for a single page. For multiple pages, you can use explode() and foreach.

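<?php
// Fetch the page to search (file_get_contents() returns false on failure).
$search_in = file_get_contents('http://www.zafa.com.pk/tablets.html');
$findme = 'CARDACE';

if ($search_in === false) {
    exit("Could not fetch the page");
}

// strpos() is case-sensitive; use stripos() for a case-insensitive match.
$pos = strpos($search_in, $findme);

// Compare with === because a match at position 0 is also falsy.
if ($pos === false) {
    echo "The string '$findme' was not found in the website";
} else {
    echo "The string '$findme' was found in the website";
    echo " and exists at position $pos";
}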
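
For several pages, the explode() and foreach approach mentioned in the note might look roughly like the sketch below. It is only an illustration under a few assumptions: the function name findMatchingUrls and its optional $fetch parameter are hypothetical (not part of the original answer), the URLs are passed as one comma-separated string, and matching is done with stripos() so that it is case-insensitive, which is a deliberate change from the strpos() call above.

```php
<?php
/**
 * Return the URLs whose page content contains $needle (case-insensitive).
 *
 * $fetch is a hypothetical hook: any callable that takes a URL and returns
 * the page body as a string, or false on failure. It defaults to
 * file_get_contents().
 */
function findMatchingUrls(string $urlList, string $needle, ?callable $fetch = null): array
{
    // An empty search value cannot match anything meaningful.
    if ($needle === '') {
        return [];
    }

    // Default fetcher: plain file_get_contents(), with warnings suppressed.
    $fetch = $fetch ?? static function (string $url) {
        return @file_get_contents($url);
    };

    $matches = [];
    // Split the comma-separated list and trim whitespace around each URL.
    foreach (explode(',', $urlList) as $url) {
        $url = trim($url);
        if ($url === '') {
            continue;
        }

        $body = $fetch($url);
        if ($body === false) {
            continue; // Skip pages that could not be fetched.
        }

        // stripos() returns false when there is no match; 0 is a valid position.
        if (stripos($body, $needle) !== false) {
            $matches[] = $url;
        }
    }

    return $matches;
}
```

With the two pages from the question, a call could be findMatchingUrls('http://www.zafa.com.pk/tablets.html , http://www.zafa.com.pk/injections.html', 'CARDACE'), where the search value would normally come from the submitted form rather than a literal. The result is an array holding only the URLs whose content contains the search value.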