regex php find data in a table

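I am trying to extract the yearly total of solar irradiation, and some other values, from a table that I fetch with curl from the European PVGIS site.

The table I get is: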

<table class=data_table border="1" width="300" >
<tr> <td> Jan </td><td align="right">2.27</td><td align="right">70.3</td><td align="right">2.86</td><td align="right">88.5</td></tr>
<tr> <td> Feb </td><td align="right">2.79</td><td align="right">78.0</td><td align="right">3.56</td><td align="right">99.7</td></tr>
<tr> <td> Mar </td><td align="right">3.59</td><td align="right">111</td><td align="right">4.74</td><td align="right">147</td></tr>
<tr> <td> Apr </td><td align="right">4.23</td><td align="right">127</td><td align="right">5.68</td><td align="right">171</td></tr>
<tr> <td> May </td><td align="right">4.46</td><td align="right">138</td><td align="right">6.13</td><td align="right">190</td></tr>
<tr> <td> Jun </td><td align="right">4.53</td><td align="right">136</td><td align="right">6.38</td><td align="right">191</td></tr>
<tr> <td> Jul </td><td align="right">4.74</td><td align="right">147</td><td align="right">6.70</td><td align="right">208</td></tr>
<tr> <td> Aug </td><td align="right">4.59</td><td align="right">142</td><td align="right">6.53</td><td align="right">202</td></tr>
<tr> <td> Sep </td><td align="right">4.32</td><td align="right">130</td><td align="right">5.96</td><td align="right">179</td></tr>
<tr> <td> Oct </td><td align="right">3.63</td><td align="right">113</td><td align="right">4.87</td><td align="right">151</td></tr>
<tr> <td> Nov </td><td align="right">2.64</td><td align="right">79.1</td><td align="right">3.41</td><td align="right">102</td></tr>
<tr> <td> Dec </td><td align="right">2.15</td><td align="right">66.5</td><td align="right">2.72</td><td align="right">84.3</td></tr>
<tr><td colspan=5> </td></tr>
<tr><td><b> Yearly average </b></td><td align="right"><b>3.67 </b></td><td align="right"><b>111 </b></td></td><td align="right"><b>4.97 </b></td><td align="right"><b>151 </b></td></tr>
<tr><td><b>Total for year</b></td><td align="right" colspan=2 ><b>  1340 </b> </td> <td align="right" colspan=2 ><b>  1810 </b> </td> </tr>
</table>

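As you can see, the totals are in the last <tr> of this table. Specifically, the yearly total is in the second <td> of that row.

I tried to build a regular expression with the txt2reg tool, but without success, because I don't know how to target the last row of the table above.

If I strip out all the TR and TD tags I get one long string of numbers, but at that point the values run together and can no longer be told apart.

Do you have any suggestions?

Thanks a lot.

EDIT

I tried the following, but I get an error. The error is: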

Catchable fatal error: Argument 1 passed to DOMXPath::__construct() must be an instance of DOMDocument, instance of DOMElement given in C:\Users\test\www2\test_pvgis.php on line 49

The code is:

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($varResponse);
$table = $doc->getElementsByTagName('table')->item(1); 
print_r($table);

$xpath = new DOMXpath($table);
$lastRow = $xpath->query("(//tr)[last()]");
// look for td elements inside the last row we isolated above
// path for td elements is relative
$cells = $xpath->query('./td',$lastRow[0]);
// you can also store the values for later use
foreach($cells as $key=>$cell){
    //we are ignoring the first key, since it holds the "Total for year" bit
    if ($key != 0){
        $store[] = trim($cell->nodeValue); // trim out the leading and trailing spaces
    }
}
print_r($store);

The error is raised at this line: $xpath = new DOMXpath($table); but I would like to know why. Any clue?

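EDIT

Assuming the response contains more than one table and the relevant one is the one you select with item(1) (the index is zero-based, so that is the second table):
You need to pass a DOMDocument instance to the DOMXpath constructor, not a DOMElement.
So $xpath = new DOMXpath($doc); builds the XPath object on $doc.
When you query for the last row, you pass the $table element as the second argument, so that it is used as the context node.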
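To make the role of the second argument concrete, here is a minimal sketch. The two-table markup is made up for illustration and is not the PVGIS response. It shows that a path starting with `.` is evaluated relative to the context node, while a path starting with `//` always starts at the document root, even when a context node is passed. The sketch uses `.//tr` rather than `./tr` so that it would also tolerate a <tbody> wrapper.

```php
<?php
// Hypothetical two-table document, used only to illustrate the context node.
$html = '<html><body>'
      . '<table><tr><td>a</td></tr><tr><td>b</td></tr></table>'
      . '<table><tr><td>c</td></tr><tr><td>d</td></tr><tr><td>e</td></tr></table>'
      . '</body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$doc->loadHTML($html);

// The XPath object is built from the document, never from an element.
$xpath = new DOMXPath($doc);

// item() is zero-based: item(0) is the first table.
$first = $doc->getElementsByTagName('table')->item(0);

// Relative path: only rows below the context node $first.
$relative = $xpath->query('.//tr', $first);

// Absolute path: starts at the document root, so the context node has no effect.
$absolute = $xpath->query('//tr', $first);

echo $relative->length, "\n"; // 2 (rows of the first table)
echo $absolute->length, "\n"; // 5 (rows of both tables)
```

This is also why the original query "(//tr)[last()]" would pick the last row of the whole page rather than of the chosen table.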


Here is an example using DOMDocument and DOMXpath:

// start edit
$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect HTML parse warnings instead of printing them
$doc->loadHTML($varResponse);
// item() is zero-based, so item(1) is the second <table> in the document
$table = $doc->getElementsByTagName('table')->item(1);
// DOMXpath must be built from the DOMDocument, not from an element
$xpath = new DOMXpath($doc);
// the path is relative to $table, which is passed as the context node
$lastRow = $xpath->query("(./tr)[last()]", $table);
// end edit
// look for td elements inside the last row we isolated above
// path for td elements is relative
$cells = $xpath->query('./td', $lastRow->item(0)); // item(0) instead of [0] fixes 'Cannot use object of type DOMNodeList as array'
// you can also store the values for later use
$store = array();
foreach ($cells as $key => $cell) {
    // we are ignoring the first key, since it holds the "Total for year" bit
    if ($key != 0) {
        $store[] = trim($cell->nodeValue); // trim out the leading and trailing spaces
    }
}
print_r($store);
/*
outputs
Array
(
    [0] => 1340
    [1] => 1810
)
*/
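
If the position of the row is not guaranteed, or you also want the other values such as the yearly average, one option is to select the row by its label instead of with last(). The following is only a sketch under some assumptions: rowValues() is a hypothetical helper, not part of any library, the embedded HTML is a shortened and tidied copy of the table from the question, and the type declarations need PHP 7 or later.

```php
<?php
// Shortened, tidied copy of the table from the question (two month rows kept).
$html = <<<'HTML'
<table class=data_table border="1" width="300" >
<tr> <td> Jan </td><td align="right">2.27</td><td align="right">70.3</td><td align="right">2.86</td><td align="right">88.5</td></tr>
<tr> <td> Dec </td><td align="right">2.15</td><td align="right">66.5</td><td align="right">2.72</td><td align="right">84.3</td></tr>
<tr><td colspan=5> </td></tr>
<tr><td><b> Yearly average </b></td><td align="right"><b>3.67 </b></td><td align="right"><b>111 </b></td><td align="right"><b>4.97 </b></td><td align="right"><b>151 </b></td></tr>
<tr><td><b>Total for year</b></td><td align="right" colspan=2 ><b>  1340 </b> </td> <td align="right" colspan=2 ><b>  1810 </b> </td> </tr>
</table>
HTML;

/**
 * Hypothetical helper: returns the cells after the label cell, as floats,
 * for the first row whose first cell contains $label.
 * Returns an empty array when no row matches.
 * Assumption: $label contains no double quote, since it is pasted into the query.
 */
function rowValues(DOMXPath $xpath, string $label): array
{
    // normalize-space() strips the padding around the label text
    $query = sprintf(
        '(//tr[contains(normalize-space(td[1]), "%s")])[1]/td[position() > 1]',
        $label
    );

    $values = [];
    foreach ($xpath->query($query) as $cell) {
        $values[] = (float) trim($cell->nodeValue);
    }
    return $values;
}

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$totals  = rowValues($xpath, 'Total for year');  // 1340 and 1810
$average = rowValues($xpath, 'Yearly average');  // 3.67, 111, 4.97 and 151

print_r($totals);
print_r($average);
```

The same helper should work for a month row, for example rowValues($xpath, 'Jan'), as long as the label appears in the first cell of only one row.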