PHP timeout with file_get_html

Keywords: get html file timeout with php | Last updated: 2023-09-27

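I have been trying to fetch some data from a Wikia site using the PHP library simple_html_dom. Basically, I use the Wikia API to get the HTML rendering of a page and extract the data from it. After extraction, I insert the data into a MySQL database. My problem is that I usually need to pull 300 records, but I get stuck at record 93: file_get_html returns null, which causes my find() call to fail. I am not sure why it stops at 93 records, but I have tried various solutions such as: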

   ini_set( 'default_socket_timeout', 120 );
   set_time_limit( 120 );

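Basically, I have to request the Wikia page 300 times to get those 300 records. Most of the time, though, I only manage to get 93 records before file_get_html starts returning null. Any idea how I can fix this?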

I have also tested with cURL and get the same problem.

    function test($url) {
        $ch = curl_init();
        $timeout = 5;
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
        $result = curl_exec($ch);
        curl_close($ch);
        return $result;
    }

    $baseurl = 'http://xxxx.wikia.com/index.php?';
    foreach ($resultset_wiki as $name) {
        // Create DOM from URL or file
        $options = array("action" => "render", "title" => $name['name']);
        $baseurl .= http_build_query($options, '', '&');
        $html = file_get_html($baseurl);
        if ($html === FALSE) {
            echo "issue here";
        }
        // this code for cURL but commented for testing with file_get_html instead
        $a = test($baseurl);
        $html = new simple_html_dom();
        $html->load($a);
        // find div stuff here and mysql data pumping here.
    }

$resultset_wiki is an array containing the list of titles to fetch from Wikia. This result set is itself loaded from the database before the lookups are run.

The actual error I get looks like this:

  Call to a member function find() on a non-object in
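
One detail in the loop above may be relevant: `$baseurl .= http_build_query(...)` appends a new query string to the same variable on every pass, so the requested URL keeps growing with each title. Whether that is what breaks the request at record 93 is not confirmed here. The sketch below only shows how the URL could be built fresh for each title; `build_render_url` and the placeholder titles are hypothetical names introduced for illustration.

```php
<?php
// Hypothetical helper: returns a new URL for one title and never
// modifies the base URL that is shared across loop iterations.
function build_render_url(string $baseUrl, string $title): string
{
    $query = http_build_query(['action' => 'render', 'title' => $title], '', '&');
    return $baseUrl . $query;
}

$baseUrl = 'http://xxxx.wikia.com/index.php?';
$titles  = ['Foo', 'Bar']; // placeholder titles, standing in for $resultset_wiki

$urls = [];
foreach ($titles as $title) {
    // Each URL carries exactly one action/title pair.
    $urls[] = build_render_url($baseUrl, $title);
}
```

Separately, the fatal error itself comes from calling find() on a value that is not an object, so skipping the iteration (for example with `continue`) when `$html` is not a DOM object would keep the script running past a failed fetch.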

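I answered my own question: the problem seems to have been the URL I was using. I have changed to using cURL with POST to send the action and title parameters.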
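
The answer does not include its code, so the following is only a minimal sketch of what sending the action and title parameters by POST with cURL might look like. The function names `build_render_fields` and `fetch_rendered_page`, the 30-second timeout, and the HTTP 200 check are assumptions, not part of the original answer.

```php
<?php
// Hypothetical helper: encodes the two parameters as a POST body.
function build_render_fields(string $title): string
{
    return http_build_query(['action' => 'render', 'title' => $title], '', '&');
}

// Hypothetical helper: POSTs the parameters to the endpoint and returns
// the response body, or null if the transfer failed or the status is not 200.
function fetch_rendered_page(string $endpoint, string $title, int $timeout = 30): ?string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $endpoint);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, build_render_fields($title));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // limit for connecting
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);        // limit for the whole transfer
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    if ($body === false || $status !== 200) {
        return null;
    }
    return $body;
}
```

A caller could then use something like `$a = fetch_rendered_page('http://xxxx.wikia.com/index.php', $name['name']);` and only pass `$a` to `$html->load()` when it is not null.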