给定以下HTML:
$content = '<html>
<body>
<div>
<p>During the interim there shall be nourishment supplied</p>
</div>
</body>
</html>';
如何将其修改为以下HTML:
<html>
<body>
<div>
<p>During the <span>interim</span> there shall be nourishment supplied</p>
</div>
</body>
</html>
我需要使用DomDocument来做到这一点。以下是我尝试过的:
$dom = new DomDocument();
$dom->loadHTML($content);
$dom->preserveWhiteSpace = false;
$xpath = new DOMXpath($dom);
$elements = $xpath->query("//*[contains(text(),'interim')]");
if (!is_null($elements)) {
foreach ($elements as $element) {
$text = $element->nodeValue;
$element->nodeValue = str_replace('interim','<span>interim</span>',$text);
}
}
echo $dom->saveHTML();
但是,这会输出文字html实体,因此它在浏览器中呈现如下:
During the <span>interim</span> there shall be nourishment supplied
我想应该使用createElement
和appendChild
方法,而不是直接分配nodeValue
,但我看不出如何在textNode字符串的中间插入一个元素?
Marcus Harrison使用splitText
的答案是一个很好的答案,但它可以简化,需要使用mb_*方法来处理UTF-8输入:
<?php
$html = <<<END
<html>
<meta charset="utf-8">
<body>
<div>
<p>During € the interim there shall be nourishment supplied</p>
</div>
</body>
</html>
END;
$replace = 'interim';
$doc = new DOMDocument;
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$nodes = $xpath->query(sprintf('//text()[contains(., "%s")]', $replace));
foreach ($nodes as $node) {
$start = mb_strpos($node->textContent, $replace);
$end = $start + mb_strlen($replace);
$node->splitText($end); // do this first
$node->splitText($start); // do this last
$newnode = $doc->createElement('span');
$node->parentNode->insertBefore($newnode, $node->nextSibling);
$newnode->appendChild($newnode->nextSibling);
}
$doc->encoding = 'UTF-8';
print $doc->saveHTML($doc->documentElement);
用修改后的元素创建新的DomDocument,并替换旧的DomDocument
foreach ($elements as $element) {
$text = $element->nodeValue;
$el = new DomDocument();
$el->loadHTML('<iframe>'. str_replace('interim','<span>interim</span>',$text) . '</iframe>');
$new = $dom->importNode($el->getElementsByTagName('iframe')->item(0), true);
unset($el);
$element->parentNode->replaceChild($new, $element);
}
为了做到这一点,您必须使用DOMString的splitText接口。它接受一个偏移量,可以通过使用strpos:
来检索。$dom = new DomDocument();
$dom->loadHTML($content);
$dom->preserveWhiteSpace = false;
$xpath = new DOMXpath($dom);
$elements = $xpath->query("//*[contains(text(),'interim')]");
if (!is_null($elements)) {
foreach ($elements as $element) {
$text = $element->childNodes->item(0);
$text->splitText(strpos($text->textContent, "interim"));
$text2 = $element->childNodes->item(1);
$text2->splitText(strpos($text2->textContent, " "));
$element->removeChild($text2);
$span = $dom->createElement("span");
$span->appendChild($dom->createTextNode("interim"));
$element->insertBefore($span, $element->childNodes->item(1));
}
}
echo $dom->saveHTML();
编辑:刚刚测试后,我意识到我没有删除第二个文本节点中的原始"interim"。编辑了这个答案。我还编辑了这段代码,使其与旧版本的PHP尽可能兼容,因为我不运行旧版本的PHP,所以我不可能对其进行测试。