I have a piece of HTML that I need to parse with the Simple HTML DOM parser.
Here is my HTML:
<div class="descriptio">
<div class="sptr"><h4>Directors</h4><a href="">Jhon 1</a>, <a href="">Jhon 2</a>, <a href="">Jhon 3</a></div>
<div class="sptr"><h4>Writers</h4><a href="">Doe 1</a>, <a href="">Doe 2</a>, <a href="">Doe 3</a></div>
<div class="sptr"><h4>Stars</h4><a href="">Ann 1</a>, <a href="">Ann 2</a>, <a href="">Ann 3</a></div>
</div>
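I want to get the values of each "sptr" div inside the "descriptio" div. This is what I want to retrieve:
Directors: Jhon 1, Jhon 2, Jhon 3
Writers: Doe 1, Doe 2, Doe 3
Stars: Ann 1, Ann 2, Ann 3
I tried the following code, but it does not work: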
<?PHP
$directors = '';
foreach ($html_page_url->find('div.description div.sptr') as $val)
{
$directors.= $val.',';
}
?>
How can I fix this?
Use DOMDocument and DOMXPath:
<?php
$html = '<div class="descriptio">
<div class="sptr"><h4>Directors</h4><a href="">Jhon 1</a>, <a href="">Jhon 2</a>, <a href="">Jhon 3</a></div>
<div class="sptr"><h4>Writers</h4><a href="">Doe 1</a>, <a href="">Doe 2</a>, <a href="">Doe 3</a></div>
<div class="sptr"><h4>Stars</h4><a href="">Ann 1</a>, <a href="">Ann 2</a>, <a href="">Ann 3</a></div>
</div>';
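$doc = new DOMDocument();
// loadHTML() tolerates real-world HTML; loadXML() only accepts well-formed XML.
$doc->loadHTML($html);
$finder = new DOMXPath($doc);
$classname = "descriptio";
// Select every element whose class attribute contains "descriptio".
$divDescriptio = $finder->query("//*[contains(@class, '$classname')]");
$i = 0;
$row = [];
// Each inner <div> is one category (Directors, Writers, Stars).
foreach ($divDescriptio[0]->getElementsByTagName('div') as $element)
{
    $row[$i]['title'] = $element->getElementsByTagName('h4')[0]->nodeValue;
    $names = $element->getElementsByTagName('a');
    foreach ($names as $name)
    {
        // Append each linked name under a numeric key.
        $row[$i][] = $name->nodeValue;
    }
    $i++;
}
var_dump($row);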
?>
Output:
array(3) {
[0]=>
array(4) {
["title"]=>
string(9) "Directors"
[0]=>
string(6) "Jhon 1"
[1]=>
string(6) "Jhon 2"
[2]=>
string(6) "Jhon 3"
}
[1]=>
array(4) {
["title"]=>
string(7) "Writers"
[0]=>
string(5) "Doe 1"
[1]=>
string(5) "Doe 2"
[2]=>
string(5) "Doe 3"
}
[2]=>
array(4) {
["title"]=>
string(5) "Stars"
[0]=>
string(5) "Ann 1"
[1]=>
string(5) "Ann 2"
[2]=>
string(5) "Ann 3"
}
}
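The nested array above can also be turned directly into the "Heading: name, name, name" lines asked for in the question. The following is only a sketch under a few assumptions: the helper name `formatCredits` is hypothetical, the markup has the same shape as the sample, and the XPath expression matches each class as a whole token (a stricter variant of the `contains(@class, ...)` test used above, which would also match a class such as "descriptio-wide").

```php
<?php
// Hypothetical helper (not part of the original answer): returns one
// "Heading: name, name, name" line per div.sptr inside div.descriptio.
function formatCredits(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Match each class name as a whole token rather than as a substring.
    $query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' descriptio ')]"
           . "/div[contains(concat(' ', normalize-space(@class), ' '), ' sptr ')]";

    $lines = [];
    foreach ($xpath->query($query) as $div) {
        $heading = $xpath->query('./h4', $div)->item(0);
        if ($heading === null) {
            continue; // skip blocks that have no heading
        }
        $names = [];
        foreach ($xpath->query('./a', $div) as $a) {
            $names[] = trim($a->nodeValue);
        }
        $lines[] = trim($heading->nodeValue) . ': ' . implode(', ', $names);
    }
    return $lines;
}

$html = '<div class="descriptio">
<div class="sptr"><h4>Directors</h4><a href="">Jhon 1</a>, <a href="">Jhon 2</a>, <a href="">Jhon 3</a></div>
<div class="sptr"><h4>Writers</h4><a href="">Doe 1</a>, <a href="">Doe 2</a>, <a href="">Doe 3</a></div>
<div class="sptr"><h4>Stars</h4><a href="">Ann 1</a>, <a href="">Ann 2</a>, <a href="">Ann 3</a></div>
</div>';

echo implode("\n", formatCredits($html)), "\n";
```

Returning an array of lines rather than echoing inside the helper leaves it up to the caller whether to print the lines, join them, or process them further.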
If you want to keep them in three separate variables, you can do it like this:
$html = <<<HTML
<div class="descriptio">
<div class="sptr"><h4>Directors</h4><a href="">Jhon 1</a>, <a href="">Jhon 2</a>, <a href="">Jhon 3</a></div>
<div class="sptr"><h4>Writers</h4><a href="">Doe 1</a>, <a href="">Doe 2</a>, <a href="">Doe 3</a></div>
<div class="sptr"><h4>Stars</h4><a href="">Ann 1</a>, <a href="">Ann 2</a>, <a href="">Ann 3</a></div>
</div>
HTML;
$doc = new DOMDocument();
$doc->loadHTML($html);
// Retrieve each category
$headers = $doc->getElementsByTagName('h4');
foreach ($headers as $header) {
    // This will hold the name of each variable
    $category = strtolower($header->nodeValue);
    // This will become $directors, $writers or $stars
    $$category = "{$header->nodeValue}: ";
    // Retrieve the links of each category
    $links = $header->parentNode->getElementsByTagName('a');
    $temp = array();
    foreach ($links as $link) {
        $temp[] = $link->nodeValue;
    }
    $$category .= implode(',', $temp);
}
echo "{$directors}\n{$writers}\n{$stars}";
Output:
Directors: Jhon 1,Jhon 2,Jhon 3
Writers: Doe 1,Doe 2,Doe 3
Stars: Ann 1,Ann 2,Ann 3
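Variable variables such as `$$category` take their names from the page content, so an unexpected heading would silently create an unexpected variable. One possible alternative, shown here only as a sketch, is to collect the names in an associative array keyed by the lower-cased heading. The array name `$credits` is an assumption made for illustration; the markup is the same sample as above.

```php
<?php
$html = <<<HTML
<div class="descriptio">
<div class="sptr"><h4>Directors</h4><a href="">Jhon 1</a>, <a href="">Jhon 2</a>, <a href="">Jhon 3</a></div>
<div class="sptr"><h4>Writers</h4><a href="">Doe 1</a>, <a href="">Doe 2</a>, <a href="">Doe 3</a></div>
<div class="sptr"><h4>Stars</h4><a href="">Ann 1</a>, <a href="">Ann 2</a>, <a href="">Ann 3</a></div>
</div>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);

// $credits maps a lower-cased heading to the list of names beneath it.
$credits = [];
foreach ($doc->getElementsByTagName('h4') as $header) {
    $names = [];
    // The links are siblings of the <h4>, so search from its parent <div>.
    foreach ($header->parentNode->getElementsByTagName('a') as $link) {
        $names[] = $link->nodeValue;
    }
    $credits[strtolower($header->nodeValue)] = $names;
}

// Print one line per category, in document order.
foreach ($credits as $category => $names) {
    echo ucfirst($category), ': ', implode(',', $names), "\n";
}
```

With this layout the directors are available as `$credits['directors']`, and code that needs every category can loop over the array without knowing the headings in advance.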