How to extract HTML and put it in a PHP associative array?

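I have a PHP variable that contains an HTML document. I am trying to extract the li > span and li > strong contents into some kind of associative array.

The HTML in the $html variable is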

<ul class="ul-data" xmlns:utils="urn:utils" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <li><span>
          Vehicle make
        </span><strong>CITROEN</strong></li>
  <li><span>
            Year of manufacture
          </span><strong>1997</strong></li>
  <li><span>
          Cylinder capacity (cc)
        </span><strong>1124cc
        </strong></li>
  <li><span>
          Fuel type
        </span><strong>PETROL</strong></li>
  <li><span>
          Vehicle colour
        </span><strong>BLUE</strong></li>
  <li><span>
          Vehicle type approval
        </span><strong>
              Not available
            </strong></li>
</ul>

My code so far is

$dom = new DOMDocument();
//as @Larry.Z comments, you forgot to load the $html
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
//assuming there can be more than one "result set" on each page
$results = array();
$result_divs = $xpath->query('//ul[@class="ul-data"]');
foreach ($result_divs as $result_div) {
    $result=array();
    foreach ($result_div->childNodes as $result_item) {
        $content=trim($result_item->textContent);
        if ($content!='') $result[]=$content;
    } 
    $results[]=$result;
}
echo '<pre>';
print_r($results);
echo '</pre>';

It prints

Array
(
    [0] => Array
        (
            [0] => Vehicle make
        CITROEN
            [1] => Date of first registration
            27 August 1997
            [2] => Year of manufacture
          1997
            [3] => Cylinder capacity (cc)
        1124cc
            [4] => Fuel type
        PETROL
            [5] => Vehicle colour
        BLUE
            [6] => Vehicle type approval
              Not available
        )
)

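How can I get it to build an associative array like this

[Vehicle make] => CITROEN

The problem is that I need to use the li > span text as the key and the text inside <strong> as the value.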

Since the HTML contains only one ul, the outer loop is unnecessary. You can get all the li elements and read the first and second child node of each:

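$dom = new DOMDocument();
$dom->loadHTML($html);
$results = array();
foreach ($dom->getElementsByTagName('li') as $li) {
    // item(0) is the <span> (key) and item(1) is the <strong> (value);
    // trim() drops the line breaks and indentation around the text
    $key = trim($li->childNodes->item(0)->textContent);
    $results[$key] = trim($li->childNodes->item(1)->textContent);
}
print_r($results);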
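
Indexing childNodes by position only works while there is no whitespace between <li> and <span>; if the markup is ever reformatted, item(0) becomes a text node. As a rough sketch of a variant that does not depend on child positions, the same idea can be written with DOMXPath, looking up span and strong by name and letting XPath's normalize-space() clean up the text. The sample $html below is a shortened, made-up stand-in for the markup in the question, with the second item deliberately indented differently:

```php
<?php
// Shortened sample markup (an assumption for illustration, not the full page).
$html = <<<HTML
<ul class="ul-data">
  <li><span>
          Vehicle make
        </span><strong>CITROEN</strong></li>
  <li>
    <span>Fuel type</span>
    <strong>
      PETROL
    </strong>
  </li>
</ul>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$results = array();
foreach ($xpath->query('//ul[@class="ul-data"]/li') as $li) {
    // Both expressions are evaluated relative to the current <li>.
    // normalize-space() trims the text and collapses inner whitespace runs.
    $key   = $xpath->evaluate('normalize-space(span)', $li);
    $value = $xpath->evaluate('normalize-space(strong)', $li);
    if ($key !== '') {
        $results[$key] = $value;
    }
}
print_r($results);
```

With this sample, $results should contain [Vehicle make] => CITROEN and [Fuel type] => PETROL. If two items share the same span text, the later one overwrites the earlier one, so this assumes the labels are unique within the list.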