Reading child elements by attribute name

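I'm having trouble figuring out how to read multiple child tags that share the same tag name (such as div) when I want to select them by an attribute.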

My HTML snippet looks like this:
<div>....</div>
<div>....</div>
<div class = 'iwantthisone'>
    <h4>value</h4>
    <div class ='ilikethistoo'>
        <span>another value</span>
    </div>
</div>

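In this example, I'm trying to get the contents of the h4 and the contents of the span for every instance where a div with that class appears.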

My relevant PHP looks like this:

$doc = new DOMDocument();
@$doc->loadHTMLFile($path);
$body = $doc->getElementsByTagName('body');
$char = $body->item(0)->getElementsByTagName('div'); 
foreach ($char as $c) {
    $test = $c->getAttribute('class');
    // strpos() can return 0 for a match at the start, so compare strictly against false
    if (strpos($test, 'iwantthisone') !== false && strpos($test, 'interaction') === false) {
        $tree = $c->getElementsByTagName('h4');
        $value = $tree->item(0)->nodeValue;
    }
}

I know this code finds the class, but I don't quite understand how to tell it to look at the tree beneath it.

Here is an XPath example. The class attribute is a token list (it can contain several class names), so matching it is slightly more involved:

$html = <<<'HTML'
<div>....</div>
<div>....</div>
<div class = 'iwantthisone'>
    <h4>value</h4>
    <div class ='ilikethistoo'>
        <span>another value</span>
    </div>
</div>
HTML;
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$expression = '//div[
  contains(concat(" ", normalize-space(@class), " "), " iwantthisone ") or
  contains(concat(" ", normalize-space(@class), " "), " ilikethistoo ")
]';
foreach ($xpath->evaluate($expression) as $node) {
  var_dump($node->localName, $node->getAttribute('class'));
}
Output:

string(3) "div"
string(12) "iwantthisone"
string(3) "div"
string(12) "ilikethistoo"
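
The expression above only locates the div elements. To read the h4 and span beneath each match, one option is to pass the matched node as the context node for a second, relative expression. The following is a minimal sketch, not a definitive implementation: `classTest()` is a hypothetical helper introduced here for illustration, and the extra class name in the sample markup is an assumption added to show token matching.

```php
<?php
$html = <<<'HTML'
<div>....</div>
<div class="iwantthisone extra">
    <h4>value</h4>
    <div class="ilikethistoo">
        <span>another value</span>
    </div>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Hypothetical helper (not part of DOM/XPath): builds the token-aware
// class test used in the answer above for a single class name.
function classTest(string $name): string
{
    return 'contains(concat(" ", normalize-space(@class), " "), " ' . $name . ' ")';
}

$results = [];
foreach ($xpath->query('//div[' . classTest('iwantthisone') . ']') as $div) {
    // The second argument makes the expression relative to the matched div.
    // string() returns the text of the first matching node, or "" if none.
    $results[] = [
        'h4'   => $xpath->evaluate('string(h4)', $div),
        'span' => $xpath->evaluate('string(div[' . classTest('ilikethistoo') . ']/span)', $div),
    ];
}

print_r($results);
```

For the sample markup this should collect one entry, with 'value' for the h4 and 'another value' for the span. Using `string()` avoids a null check when a div has no h4 or span, at the cost of not distinguishing "missing" from "empty".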

You can use a regular expression to look for the class name. Like this...

$doc = new DOMDocument();
@$doc->loadHTMLFile($path);
$body = $doc->getElementsByTagName('body');
$char = $body->item(0)->getElementsByTagName('div'); 
foreach ($char as $c) {
    $test = $c->getAttribute('class');
    if (preg_match('/iwantthisone/i', $test)) {
        $tree = $c->getElementsByTagName('h4');
        $value = $tree->item(0)->nodeValue;
    } elseif (preg_match('/ilikethistoo/i', $test)) {
        //do something else...
    }
}
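
Note that a pattern such as `/iwantthisone/i` also matches longer class names that merely contain that text. If that matters, a possible variant is to split the attribute into whitespace-separated tokens and compare them exactly. This is only a sketch under assumptions: `hasClass()` is a hypothetical helper, the comparison is case-sensitive (unlike the `/i` pattern above), and the sample markup is made up for illustration.

```php
<?php
// Hypothetical helper: true if $name is one of the whitespace-separated
// tokens in the element's class attribute (exact, case-sensitive match).
function hasClass(DOMElement $el, string $name): bool
{
    $tokens = preg_split('/\s+/', $el->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
    return in_array($name, $tokens, true);
}

$doc = new DOMDocument();
$doc->loadHTML(
    '<div class="iwantthisone-not"><h4>skip</h4></div>'
    . '<div class="a iwantthisone"><h4>value</h4>'
    . '<div class="ilikethistoo"><span>another value</span></div></div>'
);

$values = [];
foreach ($doc->getElementsByTagName('div') as $div) {
    if (!hasClass($div, 'iwantthisone')) {
        continue;
    }
    // getElementsByTagName() on an element searches all of its descendants,
    // so the span nested inside the inner div is found as well.
    $h4 = $div->getElementsByTagName('h4')->item(0);
    $span = $div->getElementsByTagName('span')->item(0);
    $values[] = [
        'h4'   => $h4 ? $h4->nodeValue : null,
        'span' => $span ? $span->nodeValue : null,
    ];
}

print_r($values);
```

Here the first div is skipped because "iwantthisone-not" is a different token, while the second div matches even though it carries an additional class.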