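I'm having trouble figuring out how to read multiple child tags that share the same tag name (such as div) when I want to select them by attribute.
So my HTML snippet looks like this:
<div>....</div>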
<div>....</div>
<div class = 'iwantthisone'>
<h4>value</h4>
<div class ='ilikethistoo'>
<span>another value</span>
</div>
</div>
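So in this example, I'm trying to get the contents of the h4 and the contents of the span for every instance where a div with that class appears.
My relevant PHP looks like this: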
$doc = new DOMDocument();
@$doc->loadHTMLFile($path);
$body = $doc->getElementsByTagName('body');
$char = $body->item(0)->getElementsByTagName('div');
foreach ($char as $c) {
$test = $c->getAttribute('class');
if (strpos($test, 'iwantthisone') !== false && strpos($test, 'interaction') === false) {
$tree = $c->getElementsByTagName('h4');
$value = $tree->item(0)->nodeValue;
}
}
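I know this code finds the class, but I don't quite understand how to tell it to look at the tree beneath it.
Here is an XPath example. The class attribute is a token list (it can contain multiple class names), so matching is slightly more involved: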
$html = <<<'HTML'
<div>....</div>
<div>....</div>
<div class = 'iwantthisone'>
<h4>value</h4>
<div class ='ilikethistoo'>
<span>another value</span>
</div>
</div>
HTML;
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$expression = '//div[
contains(concat(" ", normalize-space(@class), " "), " iwantthisone ") or
contains(concat(" ", normalize-space(@class), " "), " ilikethistoo ")
]';
foreach ($xpath->evaluate($expression) as $node) {
var_dump($node->localName, $node->getAttribute('class'));
}
Output:
string(3) "div"
string(12) "iwantthisone"
string(3) "div"
string(12) "ilikethistoo"
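To get from a matching div down to its h4 and span, one option is to evaluate a second, relative XPath expression with the matched node as the context (the second argument of DOMXPath::evaluate). The following is only a minimal sketch, assuming the markup looks like the snippet in the question; the variable names $outer, $inner and $results are made up for illustration.

```php
<?php
// Sketch: for every div whose class list contains "iwantthisone",
// read its h4 child and the span inside its "ilikethistoo" child div.
$html = <<<'HTML'
<div>....</div>
<div class='iwantthisone'>
  <h4>value</h4>
  <div class='ilikethistoo'>
    <span>another value</span>
  </div>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Absolute expression: finds the outer divs anywhere in the document.
$outer = '//div[contains(concat(" ", normalize-space(@class), " "), " iwantthisone ")]';
// Relative expression (no leading slash): resolved against the context node.
$inner = 'div[contains(concat(" ", normalize-space(@class), " "), " ilikethistoo ")]/span';

$results = [];
foreach ($xpath->evaluate($outer) as $div) {
    $results[] = [
        // string(...) makes evaluate() return a PHP string instead of a node list.
        'h4'   => $xpath->evaluate('string(h4)', $div),
        'span' => $xpath->evaluate("string($inner)", $div),
    ];
}

print_r($results);
```

With the snippet above, $results should hold one entry per matching div, each with the h4 text and the span text. If a div has no h4 or no matching span, string() yields an empty string rather than an error.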
You can use a regular expression to find the class name. Something like this:
$doc = new DOMDocument();
@$doc->loadHTMLFile($path);
$body = $doc->getElementsByTagName('body');
$char = $body->item(0)->getElementsByTagName('div');
foreach ($char as $c) {
$test = $c->getAttribute('class');
if (preg_match('/iwantthisone/i',$test)) {
$tree = $c->getElementsByTagName('h4');
$value = $tree->item(0)->nodeValue;
}else if(preg_match('/ilikethistoo/i',$test)){
//do something else...
}
}
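One caveat with a bare pattern such as /iwantthisone/i is that it also matches inside longer class names (for example notiwantthisone). A possible alternative is to split the class attribute into its whitespace-separated tokens and compare them exactly. This is a rough sketch only; hasClass is a hypothetical helper, not part of the DOM extension, and the inline HTML is invented test data.

```php
<?php
// Hypothetical helper: true if $name is one of the whitespace-separated
// tokens in the element's class attribute (exact, case-sensitive match).
function hasClass(DOMElement $el, string $name): bool
{
    $tokens = preg_split('/\s+/', $el->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
    return in_array($name, $tokens, true);
}

// Invented sample markup: the second outer div must NOT match.
$html = "<div class='iwantthisone extra'><h4>value</h4>"
      . "<div class='ilikethistoo'><span>another value</span></div></div>"
      . "<div class='notiwantthisone'><h4>skip me</h4></div>";

$dom = new DOMDocument();
$dom->loadHTML($html);

$headings = [];
$spans = [];
foreach ($dom->getElementsByTagName('div') as $div) {
    if (hasClass($div, 'iwantthisone')) {
        // item(0) is null when there is no h4 below this div.
        $h4 = $div->getElementsByTagName('h4')->item(0);
        if ($h4 !== null) {
            $headings[] = $h4->nodeValue;
        }
    } elseif (hasClass($div, 'ilikethistoo')) {
        $span = $div->getElementsByTagName('span')->item(0);
        if ($span !== null) {
            $spans[] = $span->nodeValue;
        }
    }
}

print_r($headings);
print_r($spans);
```

The null checks matter because calling nodeValue on a missing item(0) would fail for divs that have the class but no h4 or span beneath them.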