DOMDocument::loadHTML(): warning - htmlParseEntityRef: no name in Entity

I have found a few similar questions, but so far none of them has helped me.

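I'm trying to output the "src" of every image in a block of HTML, so I'm using DOMDocument(). This method actually works, but I get a warning on some pages and can't work out why. Some posts suggest suppressing the warning, but I'd much rather find out why it is generated.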

Warning: DOMDocument::loadHTML(): htmlParseEntityRef: no name in Entity, line: 10

An example of post->post_content that generates the error is -

On Wednesday 21st November specialist rights of way solicitor Jonathan Cheal of Dyne Drewett will be speaking at the Annual Briefing for Rural Practice Surveyors and Agricultural Valuers in Petersfield.
<br>
Jonathan is one of many speakers during the day and he is specifically addressing issues of public rights of way and village greens.
<br>
Other speakers include:-
<br>
<ul>
<li>James Atrrill, Chairman of the Agricultural Valuers Associates of Hants, Wilts and Dorset;</li>
<li>Martin Lowry, Chairman of the RICS Countryside Policies Panel;</li>
<li>Angus Burnett, Director at Martin & Company;</li>
<li>Esther Smith, Partner at Thomas Eggar;</li>
<li>Jeremy Barrell, Barrell Tree Consultancy;</li>
<li>Robin Satow, Chairman of the RICS Surrey Local Association;</li>
<li>James Cooper, Stnsted Oark Foundation;</li>
<li>Fenella Collins, Head of Planning at the CLA; and</li>
<li>Tom Bodley, Partner at Batcheller Monkhouse</li>
</ul>

I can post more examples of what post->post_content contains, if that would help?

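I have temporarily allowed access to the development site so you can see some examples [NOTE - links no longer accessible as the question has been answered] -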

Error - http://test.dynedrewett.com/specialist-solicitor-speaks-at-petersfield-update/
No error - http://test.dynedrewett.com/restrictive-covenants-in-employment-contracts/

Any tips on how to resolve this? Thanks.

$dom = new DOMDocument();
$dom->loadHTML(apply_filters('the_content', $post->post_content)); // Have tried stripping all tags but <img>, still generates warning
$nodes = $dom->getElementsByTagName('img');
foreach($nodes as $img) :
    $images[] = $img->getAttribute('src');
endforeach;

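This correct answer comes from a comment by @lonesomeday.

My best guess is that there is an unescaped ampersand (&) somewhere in the HTML. This makes the parser think it is inside an entity reference (e.g. &copy;). When it gets to ;, it thinks the entity is over. It then realises that what it has does not conform to an entity, so it sends out a warning and returns the content as plain text.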

As mentioned here:

Warning: DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity,

you can use:

libxml_use_internal_errors(true);

See http://php.net/manual/en/function.libxml-use-internal-errors.php
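Since the question asks why the warning appears rather than how to hide it, the buffered errors can be read back instead of discarded. The following is a minimal sketch, not a definitive implementation: the function name extractImageSources and the sample markup are assumptions made for illustration, and the exact messages (or whether any are reported at all) depend on the libxml2 version PHP was built against.

```php
<?php
// Hypothetical helper (not from the original post): returns the <img> src
// values together with whatever messages libxml recorded while parsing.
function extractImageSources(string $html): array
{
    // Buffer libxml errors instead of letting them surface as PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the earlier setting

    $sources = [];
    foreach ($dom->getElementsByTagName('img') as $img) {
        $sources[] = $img->getAttribute('src');
    }

    return ['sources' => $sources, 'messages' => $messages];
}

// Sample markup with a bare ampersand, modelled on the list item in the question.
$result = extractImageSources('<p>Martin & Company</p><img src="/a.png">');
print_r($result['sources']);
print_r($result['messages']);
```

Logging the messages array during development shows the line the parser objected to, which may be more useful than silencing the warning outright.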

Check for "&" characters anywhere in your HTML code. I had this problem for exactly that reason.

There is an unescaped "&" somewhere in the HTML; replace "&" with &amp;. This is my solution!

 $html = preg_replace('/&(?!amp)/', '&amp;', $html);

It will replace a lone ampersand with "&amp;", but any existing "&amp;" will remain unchanged.
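One caveat with the pattern above: it only protects "&amp;", so other entities such as &copy; would become &amp;copy;. A possible refinement, offered as a sketch only, is to skip any ampersand that already begins something shaped like a named or numeric entity. The helper name escapeBareAmpersands is hypothetical, and the pattern checks the shape of the entity only, not whether the name is a known one.

```php
<?php
// Hypothetical helper (not from the original answer): escape only those
// ampersands that do not already start a named, decimal, or hex entity.
function escapeBareAmpersands(string $html): string
{
    return preg_replace(
        '/&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)/',
        '&amp;',
        $html
    );
}

echo escapeBareAmpersands('Martin & Company &amp; Sons &copy; 2012 &#169; a&b'), PHP_EOL;
// Martin &amp; Company &amp; Sons &copy; 2012 &#169; a&amp;b
```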

I don't have the reputation required to comment above, but using htmlspecialchars solved this problem for me in this case:

$inputHTML = htmlspecialchars($post->post_content);
$dom = new DOMDocument();
$dom->loadHTML(apply_filters('the_content', $inputHTML)); // Have tried stripping all tags but <img>, still generates warning
$nodes = $dom->getElementsByTagName('img');
foreach($nodes as $img) :
    $images[] = $img->getAttribute('src');
endforeach;

For my purposes I'm also using strip_tags($inputHTML, "<strong><em><br>"), so all image tags are stripped as well - I'm not sure whether that would be a problem.
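A note on the htmlspecialchars approach above: because it escapes "<" and ">" as well as "&", the parser no longer sees any tags, so the warning disappears but so do the images. The sketch below illustrates this with made-up sample markup (an assumption, not content from the original post).

```php
<?php
$html = '<p>Martin & Company</p><img src="/a.png">';

// Escaping the whole string also escapes the angle brackets of every tag.
$escaped = htmlspecialchars($html);
echo $escaped, PHP_EOL;
// &lt;p&gt;Martin &amp; Company&lt;/p&gt;&lt;img src=&quot;/a.png&quot;&gt;

$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($escaped);
libxml_clear_errors();
libxml_use_internal_errors($previous);

// The parser now sees plain text only, so no <img> elements exist.
$imageCount = $dom->getElementsByTagName('img')->length;
echo $imageCount, PHP_EOL; // 0
```

For the original goal of collecting image sources, escaping only the stray ampersands is therefore likely to be the better fit.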

I ended up solving this the right way, using tidy:

// Configuration
$config = array(
    'indent'         => true,
    'output-xhtml'   => true,
    'wrap'           => 200);
// Tidy to avoid errors during load html
$tidy = new tidy;
$tidy->parseString($bill->bill_text, $config, 'utf8');
$tidy->cleanRepair();
$domDocument = new DOMDocument();
$domDocument->loadHTML(mb_convert_encoding($tidy, 'HTML-ENTITIES', 'UTF-8'));

For Laravel,

use {{ }}
instead of {!! !!}

I faced this problem and managed to solve it.

I found that there was an error in my table markup: there was an extra </td>, which I removed, and bingo.

Just replace "&" in the string with "and". Do the same for all other symbols.