php xpath returning entire html

Why does this return the entire HTML document instead of just the value of the node containing "H+R+E"?

Sample HTML:

<tr class="linesAlt1">
        <td>04:10 PM</td><td style="width:53%;">3055&nbsp;Over</td><td style="width:22%;">3&nbsp;H+R+E&nbsp;&nbsp;+146</td>
    </tr>

I only want to get "3&nbsp;H+R+E&nbsp;&nbsp;+146", but this dumps the entire HTML.

<?php
$url = 'http://www.pinnaclesports.com/ContestCategory/MLB+Propositions/July+13~2C~+2012/Lines.aspx';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
$html = curl_exec($ch);
curl_close($ch);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
foreach ($xpath->query("//table/tr/td[contains(., 'H+R+E')]") as $textNode){
  echo $textNode->nodeValue . "\n";
}

?>

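By default, curl_exec prints the response to STDOUT, and that is what you are seeing. In other words, you are not capturing any output in $html, and nothing is being printed by that loop. First, you need to redirect the output so that curl_exec returns it: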

curl_setopt($ch, CURLOPT_FILE, fopen('php://stdout', 'w'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_URL, $url);
$html = curl_exec($ch); 
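
As a minimal sketch of the same fix, CURLOPT_RETURNTRANSFER on its own is enough to make curl_exec() return the body as a string. The helper name fetchHtml() and the exception-based error handling are assumptions added for illustration; they are not part of the original answer.

```php
<?php
/**
 * Fetch a URL and return the response body as a string.
 * fetchHtml() is a hypothetical helper name used only for this sketch.
 */
function fetchHtml(string $url): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    // Return the body from curl_exec() instead of printing it to STDOUT.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_exec() returns false on failure; surface cURL's own message.
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }

    return $body;
}
```

It would be called as $html = fetchHtml($url); before passing $html to DOMDocument::loadHTML(). Checking for false matters because loadHTML() on a failed fetch would otherwise fail in a less obvious place.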

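With that fixed, I looked at the source of the URL you provided and could not find the text H+R+E anywhere in it. The page has a table, but not one containing that content. You are looking for something that does not exist.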
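
To check the XPath expression independently of the download, here is a small sketch that runs it against the sample row from the question. Wrapping the row in a <table> element is an assumption made here so that //table/tr/td has something to match, and replacing non-breaking spaces with plain spaces is only for readable output.

```php
<?php
// Sample row from the question, wrapped in a <table> (an assumption for this sketch).
$html = <<<HTML
<table>
  <tr class="linesAlt1">
    <td>04:10 PM</td><td style="width:53%;">3055&nbsp;Over</td><td style="width:22%;">3&nbsp;H+R+E&nbsp;&nbsp;+146</td>
  </tr>
</table>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$matches = [];
foreach ($xpath->query("//table/tr/td[contains(., 'H+R+E')]") as $td) {
    // &nbsp; is decoded to U+00A0; swap it for a plain space for display.
    $matches[] = str_replace("\u{00A0}", ' ', $td->nodeValue);
}

foreach ($matches as $text) {
    echo $text, "\n";
}
```

Only the third cell is selected here, which suggests the expression is fine whenever the fetched document actually contains such a cell.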

To show that the document is now retrieved correctly, try this complete example:

$url = 'http://www.pinnaclesports.com/ContestCategory/MLB+Propositions/July+13~2C~+2012/Lines.aspx';
$ch = curl_init();
curl_setopt($ch, CURLOPT_FILE, fopen('php://stdout', 'w'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_URL, $url);
$html = curl_exec($ch); 
curl_close($ch);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
foreach ($xpath->query("//table") as $table){
      echo "[" . $table->nodeValue . "\n";
}

It produces the following output (loadHTML warnings omitted):

[Client ID:Password:

More information on setting cURL options:

  • http://php.net/manual/en/function.curl-setopt.php
  • http://php.net/manual/en/function.curl-setopt-array.php
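
As a brief illustration of the second link, curl_setopt_array() sets several options in one call. The URL and the timeout values below are placeholders chosen for this sketch, and no request is actually sent.

```php
<?php
// Illustrative URL only; this sketch configures the handle without executing it.
$ch = curl_init('http://example.com/');

// curl_setopt_array() applies the options in order and returns false
// as soon as one of them cannot be set.
$ok = curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true, // return the body from curl_exec()
    CURLOPT_CONNECTTIMEOUT => 5,    // seconds to wait while connecting
    CURLOPT_TIMEOUT        => 10,   // overall limit for the request, in seconds
]);

if (!$ok) {
    echo "Could not set one or more cURL options\n";
}
```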