I have a string, for example:
This is text outside \r \n of pre tags
<pre class="myclass"> Text inside \r \n pre tags</pre>
This is text \r \n \r\n outside of pre tags
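Can anyone help me remove the \r \n sequences, but only outside the <pre> tags, so that the content of <pre class="myclass"></pre> is left untouched? How can I do this with a PHP regular expression and preg_replace(), or in some other way?
The text is in a variable: $text = 'text<pre class="myclass">text</pre>text';
Thanks for your help.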
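Update: Thanks for all the replies, they helped. I will look into DOM. I have also tried preg_split(), which seems to do what I need. In case it helps someone, this removes \r\n outside the <pre class="myclass"></pre> tags: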
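function ReplaceOutsidePreTags($text) {
    // Split on the <pre class="myclass"> blocks and keep them in the result.
    $parts = preg_split('/(<pre class="myclass">.+?<\/pre>)/s', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    $text_new = '';
    foreach ($parts as $value) {
        if (preg_match('#<pre class="myclass">|</pre>#', $value)) {
            // A <pre> block: copy it unchanged.
            $text_new .= $value;
        } else {
            // "\\r\\n" etc. match the literal backslash sequences in the sample text;
            // use "\r\n", "\n", "\r" instead to remove real line-break characters.
            $text_new .= str_replace(array("\\r\\n", "\\n", "\\r"), "", $value);
        }
    }
    return $text_new;
}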
$text = 'this is text\r\n\r\r\n\n outside pre tag\r\n
<pre class="myclass">graphics,\r\n\r\nprogramming </pre>
this is text outside\r\n pre tag\r\n
<pre class="myclass">graphics,\r\n\r\nprogramming </pre>
this is text outside\r\n pre tag\r\n
<pre class="myclass">graphics,\r\n\r\nprogramming </pre>
this is text outside pre tag\r\n';
// Call it as $this->ReplaceOutsidePreTags($text) if it is defined as a class method.
$text_new = ReplaceOutsidePreTags($text);
echo $text_new;
Result:
this is text outside pre tag
<pre class="myclass">graphics,\r\n\r\nprogramming </pre>
this is text outside pre tag
<pre class="myclass">graphics,\r\n\r\nprogramming </pre>
this is text outside pre tag
<pre class="myclass">graphics,\r\n\r\nprogramming </pre>
this is text outside pre tag
A general "replace something, but not inside something else" solution:
$out = preg_replace("(<pre(?:\s+\w+(?:=(?:\w+|\"[^\"]+\"|'[^']+'))?)*>.*?</pre>(*SKIP)(*FAIL)"
    ."|\r|\n)is", "", $in);
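This matches a <pre> tag (with attributes, which may be boolean, unquoted, single-quoted or double-quoted; HTML has no backslash escaping to complicate matters) together with everything up to its </pre>, then skips that match and fails. Failing that, it matches line-break characters and replaces them with an empty string.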
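As a more general rule, however, consider using a DOM parsing system such as DOMDocument. Walk the nodes, ignore <pre> tags, and strip line breaks from the remaining text nodes.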
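A minimal sketch of that DOM approach, under some assumptions: the function name stripNewlinesOutsidePre and the wrapper id "wrap" are made up for illustration, the input is an HTML fragment in UTF-8, and real CR/LF characters (not literal backslash sequences) are what should go. Note that DOMDocument may normalize or reformat the markup when it serializes it again, so treat this as a starting point rather than a drop-in replacement.

```php
<?php
// Hypothetical helper: removes CR/LF from every text node that is not inside a <pre>.
function stripNewlinesOutsidePre(string $html): string
{
    $doc = new DOMDocument();
    $prev = libxml_use_internal_errors(true); // silence warnings about fragment markup
    // Wrap the fragment in a known container; the XML hint makes libxml read it as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="wrap">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($prev);

    $xpath = new DOMXPath($doc);
    // Select only the text nodes that have no <pre> ancestor.
    foreach ($xpath->query('//text()[not(ancestor::pre)]') as $node) {
        $node->data = str_replace(array("\r", "\n"), '', $node->data);
    }

    // Serialize the children of the wrapper back into a fragment.
    $wrap = $xpath->query('//div[@id="wrap"]')->item(0);
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo stripNewlinesOutsidePre("a\nb<pre class=\"myclass\">x\ny</pre>c\n");
```

Because the XPath query skips anything with a <pre> ancestor, nested markup inside the <pre> block is protected as well, which is the case a regex tends to get wrong.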
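I actually use a regex similar to the one above to preserve whitespace where it matters and strip it everywhere else, but I use <!-- WSP_BEGIN --> ... <!-- WSP_END --> markers to sidestep the ugliness of parsing HTML. Because user-supplied content is HTML-escaped, it cannot collide with the comments, so that is not a problem.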
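Edit: For reference, this is the code I use. By stripping unnecessary whitespace, it saves me megabytes to gigabytes of bandwidth every day. I call it "pre-compressing whitespace":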
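$c = preg_replace_callback(
    // The s modifier lets .*? span line breaks inside a protected block.
    "(<!-- WSP_BEGIN -->(.*?)<!-- WSP_END -->|\r|\n|\t)s",
    function($m) {
        // Group 1 is only set when a protected block matched.
        if (isset($m[1])) return $m[1]; // effectively strips markers
        else return " "; // condense whitespace
    },
    $c
);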
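To see what the marker approach does to a page, here is a small sketch that wraps the callback in a function. The name precompressWhitespace and the sample markup are assumptions for illustration. The sketch also assumes the s modifier (so a protected block may span several lines) and an isset() check on the capture group, which may differ from the code actually in production.

```php
<?php
// Hypothetical wrapper around the marker-based callback.
function precompressWhitespace(string $c): string
{
    return preg_replace_callback(
        "(<!-- WSP_BEGIN -->(.*?)<!-- WSP_END -->|\r|\n|\t)s",
        function ($m) {
            // Protected block: return its content and drop the markers.
            // Anything else that matched is a CR, LF or tab: replace it with a space.
            return isset($m[1]) ? $m[1] : " ";
        },
        $c
    );
}

$page = "<p>one\n\ttwo</p><!-- WSP_BEGIN --><pre>a\n\tb</pre><!-- WSP_END -->\n";
echo precompressWhitespace($page);
```

The line break and tab between "one" and "two" each become a space, while the line break and tab inside the marked <pre> block survive and the two comment markers disappear from the output.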
You can do this in PHP without regular expressions:
// Takes the string to fix and the two delimiters of the substring that must not be edited.
// Assumes exactly one protected block is present.
function get_string($string, $start, $end) {
    // split at the first '<pre class="myclass">'
    $parts = explode($start, $string, 2);
    // split the remainder at the first '</pre>'
    $parts1 = explode($end, $parts[1], 2);
    // strip line breaks from the two outer parts only
    $before = str_replace(array("\n", "\r"), "", $parts[0]);
    $after = str_replace(array("\n", "\r"), "", $parts1[1]);
    // reassemble, putting the delimiters back around the untouched part
    return $before . $start . $parts1[0] . $end . $after;
}
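// Single-quoted string: the \r and \n below are literal backslash sequences,
// so only the real line breaks between the three lines are removed.
$fullstring = 'This is text outside \r \n of pre tags
<pre class="myclass"> Text inside \r \n pre tags</pre>
This is text \r \n \r\n outside of pre tags';
$replaced = get_string($fullstring, '<pre class="myclass">', '</pre>');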