Error parsing an HTML table with PHP DOM


When I try to read data from a table in an HTML page using PHP, I get this error:

"message":"DOMDocument::loadHTML(): Misplaced DOCTYPE declaration in Entity"

The PHP code is:

$ch = curl_init(); 
curl_setopt ($ch, CURLOPT_URL, $loginUrl);
curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, FALSE); 
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6"); 
curl_setopt ($ch, CURLOPT_TIMEOUT, 60); 
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1); 
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);  
curl_setopt ($ch, CURLOPT_POSTFIELDS, $post_data); 
curl_setopt ($ch, CURLOPT_POST, 1); 
$result = curl_exec ($ch); 
if (!$result) {
    $http_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch); // make sure we close any current curl sessions
    die($http_code.' Unable to connect to server. Please come back later.');
}
curl_close($ch);

/*** a new dom object ***/
$dom = new DOMDocument;
/*** discard white space (must be set before loading to have any effect) ***/
$dom->preserveWhiteSpace = false;
/*** load the html into the object ***/
$dom->loadHTML($result);
/*** the table by its tag name ***/
$tables = $dom->getElementsByTagName('table');
/*** get all rows from the table ***/
$rows = $tables->item(0)->getElementsByTagName('tr');
/*** loop over the table rows ***/

I think the HTML page is not well-formed, but I can't change it. Can I still use DOM to get the data?

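Yes. The message is a libxml parse warning; the parser still recovers and builds the document. Switch libxml to internal error handling before calling loadHTML(), then clear the collected errors and restore the previous setting:

$errors = libxml_use_internal_errors(true);
$dom->loadHTML($result);
libxml_clear_errors();
libxml_use_internal_errors($errors);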
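
For reference, here is a minimal end-to-end sketch of that approach. It assumes the page looks roughly like the hypothetical `$html` string below, with a stray DOCTYPE in the middle of the markup; the helper name `extractFirstTable` and the sample data are made up for illustration and are not from the original code.

```php
<?php
// Hypothetical sample input: a DOCTYPE in the wrong place, similar to
// the kind of markup that triggers "Misplaced DOCTYPE declaration".
$html = '<html><body>'
      . '<!DOCTYPE html>'
      . '<table><tr><th>Name</th><th>Qty</th></tr>'
      . '<tr><td>Apple</td><td>3</td></tr></table>'
      . '</body></html>';

/**
 * Illustrative helper (not part of the original question):
 * returns the cell text of the first <table> in $html, row by row.
 */
function extractFirstTable(string $html): array
{
    // Collect libxml parse errors internally instead of emitting warnings,
    // remembering the previous setting so it can be restored afterwards.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->preserveWhiteSpace = false; // set before loading
    $dom->loadHTML($html);

    // Discard the collected errors and restore the earlier behaviour.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $table = $dom->getElementsByTagName('table')->item(0);
    if ($table === null) {
        return []; // no table found in the document
    }

    $rows = [];
    foreach ($table->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->childNodes as $cell) {
            // Keep only header and data cells, skipping text nodes.
            if ($cell instanceof DOMElement
                && in_array($cell->nodeName, ['td', 'th'], true)) {
                $cells[] = trim($cell->textContent);
            }
        }
        $rows[] = $cells;
    }
    return $rows;
}

$rows = extractFirstTable($html);
print_r($rows);
```

In the question's code, `$result` from `curl_exec()` would take the place of `$html`. If you want to inspect what the parser complained about rather than discard it, call `libxml_get_errors()` before `libxml_clear_errors()`.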