DOMDocument saveHTML() adding extra quotes and some other URL-encoded characters


<span style="font-weight:bold;">Blender</span> is an Open Source 3D modelling and animation software. 
This is a very popular software among hobbyists.<i>Blender</i> has a vast list of features which include bones and meshing, textures, particle physics etc.
<u>Blender</u> was originally a proprietary software which was eventually made opensource. 
Blender is known to be difficult to learn because its interface is very intimiding to a newbie. 
But on the other hand, <a href="">Blender</a> is so much customizable that you can actually modify your workspace according to your personal preference. 
Also blender interface has been developed in the OpenGL graphics library, so blender looks all the same on all platforms whether you use Windows, Linux, BSD or even Mac. 
3D is a very interesting field to work with but 3D is somewhat tough to start with. You can <a href=""" target="_blank">Google</a> for numerous tutorials on Blender. 
There are quite some awesome websites dedicated to blender development, such as <img src="">


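$dom = new DOMDocument();
// $html is assumed to hold the markup shown above.
$dom->loadHTML($html);
$dom->formatOutput = true;
$imgs = $dom->getElementsByTagName("img");
foreach ($imgs as $img) {
    $alt = $img->getAttribute('alt');
    // Fall back to the keyword only when the image has no alt text.
    if ($alt == '') {
        $k_alt = $this->keyword;
    } else {
        $k_alt = $alt;
    }
    $img->setAttribute('alt', $k_alt);
}
// Strip the doctype and the html/body wrappers that DOMDocument adds.
$html_mod = preg_replace('/^<!DOCTYPE.+?>/', '', str_replace( array('<html>', '</html>', '<body>', '</body>'), array('', '', '', ''), $dom->saveHTML()));
return $html_mod;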


<span style='"font-weight:bold;"'>Blender</span> is an Open Source 3D modelling and animation software. 
This is a very popular software among hobbyists.<i>Blender</i> has a vast list of features which include bones and meshing, textures, particle physics etc.
<u>Blender</u> was originally a proprietary software which was eventually made opensource. 
Blender is known to be difficult to learn because its interface is very intimiding to a newbie. 
But on the other hand, <a href="""">Blender</a> is so much customizable that you can actually modify your workspace according to your personal preference. 
Also blender interface has been developed in the OpenGL graphics library, so blender looks all the same on all platforms whether you use Windows, Linux, BSD or even Mac. 
3D is a very interesting field to work with but 3D is somewhat tough to start with. You can <a href=""""" target='"_blank"'>Google</a> for numerous tutorials on Blender. 
There are quite some awesome websites dedicated to blender development, such as 
<img src="""" alt="Blender">

Notice the extra quotes (both single and double) in the img src, in the anchor tags, and in the span's style attribute.


I should also mention that I am using PHP 5.3.2 with the Suhosin patch on Ubuntu 10.04.



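$filecontent = file_get_contents('file.html');
$doc = new DOMDocument();
// Load the markup first; without this the document is empty.
$doc->loadHTML($filecontent);
$xpath = new DOMXPath($doc);
// Attributes need the @ prefix in XPath; item(0) also works before PHP 5.4.
$xpath->query("//*[@id='bg']")->item(0)->nodeValue = 'asd';
$filecontent = html_entity_decode($doc->saveHTML());
file_put_contents('file.html', $filecontent);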

This way the $filecontent variable ends up holding correct HTML without the extra quotes. You're welcome!
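
For reference, here is a minimal self-contained sketch that combines the two snippets above: it fills in missing alt attributes, strips the wrappers that loadHTML() adds, and then decodes entities in the serialized output. The function name fillMissingAlt, the sample markup, and the file name logo.png are hypothetical and only there for illustration. It is one possible arrangement, not necessarily what either poster ran.

```php
<?php
// Hypothetical helper: give every <img> without alt text a fallback keyword.
function fillMissingAlt($html, $keyword)
{
    $dom = new DOMDocument();

    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('img') as $img) {
        if ($img->getAttribute('alt') === '') {
            $img->setAttribute('alt', $keyword);
        }
    }

    // Strip the doctype and the html/body wrappers that loadHTML() adds.
    $out = preg_replace('/^<!DOCTYPE.+?>/', '', $dom->saveHTML());
    $out = str_replace(array('<html>', '</html>', '<body>', '</body>'), '', $out);

    // Decode entities, as the answer above suggests.
    return trim(html_entity_decode($out, ENT_QUOTES, 'UTF-8'));
}

// Sample input (assumed, not taken from the question).
$input = '<p><span style="font-weight:bold;">Blender</span> <img src="logo.png"></p>';
$result = fillMissingAlt($input, 'Blender');
echo $result, "\n";
```

Keep in mind that html_entity_decode() also turns entities such as &lt; back into literal characters, so it is only safe when the text content does not rely on escaped markup.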