I have been using regex to parse HTML, but I need your help parsing the following table:
<table class="resultstable" width="100%" align="center">
<tr>
<th width="10">#</th>
<th width="10"></th>
<th width="100">External Volume</th>
</tr>
<tr class='odd'>
<td align="center">1</td>
<td align="left">
<a href="#" title="http://xyz.com">http://xyz.com</a>
</td>
<td align="right">210,779,783<br />(939,265 / 499,584)</td>
</tr>
<tr class='even'>
<td align="center">2</td>
<td align="left">
<a href="#" title="http://abc.com">http://abc.com</a>
</td>
<td align="right">57,450,834<br />(288,915 / 62,935)</td>
</tr>
</table>
I want to get the volume for every domain (in an array or a variable), for example:
http://xyz.com - 210,779,783
In this case, should I use regex or the HTML DOM? I don't know how to parse a large table. Can you help? Thanks.
Here is an XPath example that parses exactly the HTML in the question.
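<?php
$dom = new DOMDocument();
$dom->loadHTMLFile("./input.html");
$xpath = new DOMXPath($dom);
// Select the rows of the table with class "resultstable".
$trs = $xpath->query("//table[@class='resultstable'][1]/tr");
foreach ($trs as $tr) {
    // The domain is the link text in the second cell; the header row has no <td>, so skip it.
    $tdList = $xpath->query("td[2]/a", $tr);
    if ($tdList->length == 0) continue;
    $name = $tdList->item(0)->nodeValue;
    // The volume is the first text node of the third cell (the text before <br />).
    $tdList = $xpath->query("td[3]", $tr);
    $vol = $tdList->item(0)->childNodes->item(0)->nodeValue;
    echo "name: {$name}, vol: {$vol}\n";
}
?>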
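Since the question asks for the values in an array, here is a minimal sketch of the same XPath approach that collects the results instead of echoing them. The function name extractVolumes is hypothetical, and the sketch assumes the HTML is already available as a string and that every data row has the domain link in the second cell and the volume as the first text node of the third cell.

```php
<?php
// Hypothetical helper (not part of the original answer): returns [domain => volume].
function extractVolumes(string $html): array
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally so imperfect real-world HTML does not emit notices.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $volumes = [];
    // tr[td] matches only rows that contain data cells, so the header row is skipped.
    foreach ($xpath->query("//table[@class='resultstable']//tr[td]") as $tr) {
        $link = $xpath->query("td[2]/a", $tr)->item(0);
        $cell = $xpath->query("td[3]", $tr)->item(0);
        if ($link === null || $cell === null || $cell->firstChild === null) {
            continue; // Row does not have the assumed layout.
        }
        // firstChild is the text node before <br />, i.e. the total volume.
        $volumes[trim($link->nodeValue)] = trim($cell->firstChild->nodeValue);
    }
    return $volumes;
}

// Sample input: a trimmed copy of the table from the question.
$html = <<<HTML
<table class="resultstable">
<tr><th>#</th><th></th><th>External Volume</th></tr>
<tr class='odd'><td>1</td><td><a href="#" title="http://xyz.com">http://xyz.com</a></td><td>210,779,783<br />(939,265 / 499,584)</td></tr>
<tr class='even'><td>2</td><td><a href="#" title="http://abc.com">http://abc.com</a></td><td>57,450,834<br />(288,915 / 62,935)</td></tr>
</table>
HTML;

$volumes = extractVolumes($html);
foreach ($volumes as $domain => $volume) {
    echo "{$domain} - {$volume}\n";
}
```

Keying the array by domain makes lookups such as $volumes['http://xyz.com'] straightforward. If the numbers are needed for arithmetic, the thousands separators can be stripped with str_replace(',', '', $volume) before casting to an integer.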