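Alright, I've been working on this little script that basically scrapes a page from ted.com. Everything works the way I expect (meaning I can print all the values I'm interested in). The problem is that, for some reason, I get these warnings when I run the scraper, and I'm not sure why the notices/warnings occur when the values are printed correctly: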
PHP Warning: dom_import_simplexml() expects parameter 1 to be object, null given in /var/www/ted/import_ted.php on line 23
PHP Notice: Trying to get property of non-object in /var/www/ted/import_ted.php on line 23
PHP Notice: Undefined offset: 1 in /var/www/ted/import_ted.php on line 25
PHP Notice: Trying to get property of non-object in /var/www/ted/import_ted.php on line 27
Here is my PHP script (I've added comments on the lines that trigger the warnings and notices):
<?php
$mysqli = mysqli_connect("localhost", "user", "password", "database");
if (mysqli_connect_errno())
{
echo "Failed to connect to MySQL: " . mysqli_connect_error();
}
$html = file_get_contents('http://www.ted.com/talks/quick-list?sort=date&order=desc');
$doc = new DOMDocument();
$doc->loadHTML($html);
$sxml = simplexml_import_dom($doc);
$rows = $sxml->xpath('//tr');
$description="not_available";
$ted_link="none";
$i=0;
//$stmt = $mysqli->prepare("INSERT INTO `ted` VALUES( ?, ?, ?, ?, ?, ?, ?)");
foreach($rows as $row) {
$video = Array();
$video['pub_date']= $row->td[0];
$video['event'] = $row->td[1];
$sec_temp = explode(":" , dom_import_simplexml($row->td[2])->textContent );//line23
$video['speaker'] = $sec_temp[0];
$video['title'] = $sec_temp[1]; //line 25
$video['duration'] = $row->td[3];
$video['link'] = $row->td[4]->a[2]['href']; //line27
print( "\n line Number: " . $i . "title: " . $video['title']);
print ("link: " .$video['link']);
if($i != 0){
// $stmt->bind_param("sssssss", $video['event'], $video['speaker'], $video['title'], $description, $ted_link, $video['link'], $description, $video['pub_date'] );
// $stmt->execute();
}
$i++;
}
//$stmt->close();
?>
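OK, so like I said, everything prints out what I expect, including $video['title'], which for some reason produces the undefined offset notice. The problem is that until I can make these variables "objects", I can't bind them as parameters to the mysqli query. However, I can't seem to figure out how to do that.
Also, in case it's relevant, here is a snippet of one table row, in case this is the problem (I don't think it is):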
<tr>
<td>Jun 2013</td>
<td>TEDGlobal 2013</td>
<td><a href="/talks/manal_al_sharif_a_saudi_woman_who_dared_to_drive.html">Manalal-Sharif: A Saudi woman who dared to drive</a> </td>
<td>14:16</td>
<td><a href="http://download.ted.com/talks/ManalAlSharif_2013G-light.mp4?apikey=TEDDOWNLOAD">Low</a> | <a href="http://download.ted.com/talks/ManalAlSharif_2013G.mp4?apikey=TEDDOWNLOAD">Regular</a> | <a href="http://download.ted.com/talks/ManalAlSharif_2013G-480p.mp4?apikey=TEDDOWNLOAD">High</a></td>
</tr>
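To see how such a row looks once it is parsed, here is a minimal sketch. It assumes a hypothetical two-row table whose first row is a header made of th cells; the header labels and the shortened href values are made up for illustration, and only the data row mirrors the snippet above. It counts the td cells in each row, which shows whether `$row->td[2]` can exist at all for that row.

```php
<?php
// Hypothetical table: a header row (th cells) followed by one data row (td cells).
// The header labels and the short hrefs are assumptions, not taken from the real page.
$html = '<table>'
      . '<tr><th>Published</th><th>Event</th><th>Talk</th><th>Duration</th><th>Download</th></tr>'
      . '<tr><td>Jun 2013</td><td>TEDGlobal 2013</td>'
      . '<td><a href="/talks/example.html">Manalal-Sharif: A Saudi woman who dared to drive</a> </td>'
      . '<td>14:16</td>'
      . '<td><a href="low.mp4">Low</a> | <a href="regular.mp4">Regular</a> | <a href="high.mp4">High</a></td></tr>'
      . '</table>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$rows = simplexml_import_dom($doc)->xpath('//tr');

// Count the td children of every row; a header row has none.
$tdCounts = array();
foreach ($rows as $row) {
    $tdCounts[] = count($row->td);
}

// On the data row, the third link in the fifth cell is the "High" download.
$link = (string) $rows[1]->td[4]->a[2]['href'];
```

If the real page starts with a th-only row like this, `$row->td[2]` is null on the first iteration only, which would match warnings appearing while every data row still prints correctly.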
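Also note that I've already tried using settype($var, "object") before binding, with no luck (even though it returns true).
Anyway, any help on how I can get this working would be greatly appreciated!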
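On the binding concern, a minimal sketch of one possible direction: the values read from SimpleXML are SimpleXMLElement objects, and an explicit `(string)` cast turns each of them into a plain string. The row below is built with `simplexml_load_string` purely for illustration, and the `/talks/example.html` link is a placeholder rather than the real one.

```php
<?php
// Illustration only: one data row, loaded directly as XML instead of scraped.
$row = simplexml_load_string(
    '<tr><td>Jun 2013</td><td>TEDGlobal 2013</td>'
  . '<td><a href="/talks/example.html">Manalal-Sharif: A Saudi woman who dared to drive</a></td>'
  . '<td>14:16</td></tr>'
);

// Split "speaker: title" once; the limit of 2 keeps any later colon inside the title.
$parts = explode(':', (string) $row->td[2]->a, 2);

$video = array(
    'pub_date' => (string) $row->td[0],
    'event'    => (string) $row->td[1],
    'speaker'  => trim($parts[0]),
    // Fall back to an empty title when the text contains no colon at all.
    'title'    => isset($parts[1]) ? trim($parts[1]) : '',
    'duration' => (string) $row->td[3],
);
```

After the casts, every entry in `$video` is an ordinary string, so nothing needs to be turned into an "object" before it is handed to a query.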
<?php
$html = file_get_contents('http://www.ted.com/talks/quick-list?sort=date&order=desc');
$doc = new DOMDocument();
$doc->loadHTML($html);
$sxml = simplexml_import_dom($doc);
$rows = $sxml->xpath('//tr');
/* print_r($rows);
die(); */
$description="not_available";
$ted_link="none";
$i=0;
//$stmt = $mysqli->prepare("INSERT INTO `ted` VALUES( ?, ?, ?, ?, ?, ?, ?)");
foreach($rows as $row) {
// The first row is the header row: it contains th cells, not td cells
if(isset($row->th))
{
echo $row->th[1]->a;
echo $row->th[2]->a;
echo $row->th[3]->a;
echo $row->th[4];
}else{
$video['pub_date']= $row->td[0];
$video['event'] = $row->td[1];
$sec_temp = explode(":" , (string) $row->td[2]->a, 2);//line23 (limit 2 keeps colons inside the title)
$video['speaker'] = $sec_temp[0];
$video['title'] = $sec_temp[1]; //line 25
$video['duration'] = $row->td[3];
$video['link'] = $row->td[4]->a[2]['href']; //line27
print( "\n line Number: " . $i . "title: " . $video['title']);
print ("link: " .$video['link']);
if($i != 0){
// $stmt->bind_param("sssssss", $video['event'], $video['speaker'], $video['title'], $description, $ted_link, $video['link'], $description, $video['pub_date'] );
// $stmt->execute();
}
$i++;
}
}
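One more thing to check before re-enabling the commented-out statement: the INSERT has 7 placeholders and the type string has 7 letters, but the bind_param call lists 8 variables, and mysqli expects those counts to agree. Which count is right depends on the `ted` table, which isn't shown here. The sketch below, using hypothetical values in place of one scraped row, derives both the type string and the placeholder list from the same array so they cannot drift apart.

```php
<?php
// Hypothetical values standing in for one scraped row, in the order used
// by the commented-out bind_param call (description appears twice there).
$params = array(
    'TEDGlobal 2013',                     // event
    'Manalal-Sharif',                     // speaker
    'A Saudi woman who dared to drive',   // title
    'not_available',                      // description
    'none',                               // ted_link
    'high.mp4',                           // link (placeholder value)
    'not_available',                      // description (second use)
    'Jun 2013',                           // pub_date
);

// One "s" (string) type letter and one "?" placeholder per value.
$types        = str_repeat('s', count($params));
$placeholders = implode(', ', array_fill(0, count($params), '?'));
$sql          = "INSERT INTO `ted` VALUES ($placeholders)";

// With a live connection this would be used roughly as follows (untested here):
// $stmt = $mysqli->prepare($sql);
// $stmt->bind_param($types, ...$params);
// $stmt->execute();
```

If the table really has 7 columns, the fix is to drop the extra variable instead; the point is only that all three counts have to match.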