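I have been trying to use a regular expression to match and replace a keyword that appears in a piece of HTML:
- I want to match keyword and <strong>keyword</strong>
- but not keyword when it is already inside an <a> element
I am only interested in the keyword occurrences on the first line.
The reason I want this is to replace keyword with <a href="dictionary.php?k=keyword">keyword</a>, but only if keyword is not already inside an <a> tag.
Any help would be much appreciated!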
$str = preg_replace('~Moses(?!(?>[^<]*(?:<(?!/?a\b)[^<]*)*)</a>)~i',
                    '<a href="novo-mega-link.php">$0</a>', $str);
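The expression inside the negative lookahead matches up to the next closing </a> tag, but only if it does not see an opening <a> tag first. If that succeeds, the word Moses is inside an anchor element, so the lookahead fails and no match occurs.
Here's a demo.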
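To make the lookahead behaviour concrete, here is a minimal sketch that runs the pattern on a shortened version of the sample text used elsewhere in this thread. The variable names `$pattern` and `$result` are only illustrative, and the pattern still assumes the keyword never appears inside a tag's attribute value.

```php
<?php
// Shortened sample: one bare "Moses" and one that is already linked.
$str = 'Moses supposes his toeses are roses, '
     . 'but <a href="original-moses1.html">Moses</a> supposes erroneously.';

// Match "Moses" unless a closing </a> is reached before any opening <a>.
$pattern = '~Moses(?!(?>[^<]*(?:<(?!/?a\b)[^<]*)*)</a>)~i';
$result  = preg_replace($pattern, '<a href="novo-mega-link.php">$0</a>', $str);

echo $result, "\n";
```

Only the first Moses should be wrapped in the new link; the one inside the existing anchor is left as it was.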
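I got what I wanted without using regex, by:
- parsing every character of the string
- removing all <a> tags (copying them to a temporary array and leaving a placeholder in the string)
- running str_replace on the new string to replace every keyword
- refilling the placeholders with their original <a> tags
Here is the code I used, in case anyone else needs it: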
$str = <<<STRA
Moses supposes his toeses are roses,
but <a href="original-moses1.html">Moses</a> supposes erroneously;
for nobody's toeses are posies of roses,
as Moses supposes his toeses to be.
Ganda <span class="cenas"><a href="original-moses2.html" target="_blank">Moses</a></span>!
STRA;
$arr1 = str_split($str);
$arr_links = array();
$phrase_holder = '';
$current_a = 0;
$goto_arr_links = false;
$close_a = false;
foreach ($arr1 as $k => $v) {
    // Skip the remaining characters of a closing </a> tag.
    if ($close_a == true) {
        if ($v == '>') {
            $close_a = false;
        }
        continue;
    }
    // While inside a link, keep collecting every char until </a>.
    if ($goto_arr_links == true) {
        $arr_links[$current_a] .= $v;
    }
    // substr() never reads past the end of the string, unlike $arr1[$k+1].
    if (substr($str, $k, 2) == '<a') { /* <a */
        // "?? ''" creates the entry on first use instead of raising a notice.
        $arr_links[$current_a] = ($arr_links[$current_a] ?? '') . $v;
        $goto_arr_links = true;
    } elseif (substr($str, $k, 4) == '</a>') { /* </a> */
        // Inside a link the "<" was collected above; add the rest of the tag.
        $arr_links[$current_a] = ($arr_links[$current_a] ?? '') . "/a>";
        $goto_arr_links = false;
        $close_a = true;
        $phrase_holder .= "{%$current_a%}"; /* put a parameter holder on the phrase */
        $current_a++;
    } elseif ($goto_arr_links == false) {
        $phrase_holder .= $v;
    }
}
echo "Links Array:\n";
print_r($arr_links);
echo "\n\n\nPhrase Holder:\n";
echo $phrase_holder;
echo "\n\n\n(pre) Final Phrase (with my keyword replaced):\n";
$final_phrase = str_replace("Moses", "<a href=\"novo-mega-link.php\">Moses</a>", $phrase_holder);
echo $final_phrase;
echo "\n\n\nFinal Phrase:\n";
foreach ($arr_links as $k => $v) {
    $final_phrase = str_replace("{%$k%}", $v, $final_phrase);
}
echo $final_phrase;
Output:
Links Array:
Array
(
[0] => <a href="original-moses1.html">Moses</a>
[1] => <a href="original-moses2.html" target="_blank">Moses</a>
)
Phrase Holder:
Moses supposes his toeses are roses,
but {%0%} supposes erroneously;
for nobody's toeses are posies of roses,
as Moses supposes his toeses to be.
Ganda <span class="cenas">{%1%}</span>!
(pre) Final Phrase (with my keyword replaced):
<a href="novo-mega-link.php">Moses</a> supposes his toeses are roses,
but {%0%} supposes erroneously;
for nobody's toeses are posies of roses,
as <a href="novo-mega-link.php">Moses</a> supposes his toeses to be.
Ganda <span class="cenas">{%1%}</span>!
Final Phrase:
<a href="novo-mega-link.php">Moses</a> supposes his toeses are roses,
but <a href="original-moses1.html">Moses</a> supposes erroneously;
for nobody's toeses are posies of roses,
as <a href="novo-mega-link.php">Moses</a> supposes his toeses to be.
Ganda <span class="cenas"><a href="original-moses2.html" target="_blank">Moses</a></span>!
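The same placeholder steps can also be sketched more compactly. Note that this variant is not the code from the answer above: it swaps the character-by-character loop for `preg_replace_callback`, so it does use a regular expression to find the existing anchors. The function name `link_keyword_outside_anchors` and its parameters are hypothetical. It also assumes the text does not already contain `{%n%}` sequences and that the keyword never occurs inside a tag's attributes.

```php
<?php
// Hypothetical helper, not part of the original answer.
function link_keyword_outside_anchors(string $html, string $keyword, string $url): string
{
    $links = [];

    // Step 1: swap each existing <a>...</a> element for a {%n%} placeholder.
    $masked = preg_replace_callback(
        '~<a\b[^>]*>.*?</a>~is',
        function (array $m) use (&$links): string {
            $placeholder = '{%' . count($links) . '%}';
            $links[$placeholder] = $m[0];
            return $placeholder;
        },
        $html
    );

    // Step 2: link every keyword occurrence that is left.
    $newLink = '<a href="' . htmlspecialchars($url) . '">' . $keyword . '</a>';
    $linked  = str_replace($keyword, $newLink, $masked);

    // Step 3: put the original anchors back in a single pass.
    return $links ? strtr($linked, $links) : $linked;
}

echo link_keyword_outside_anchors(
    'Moses and <a href="x.html">Moses</a>',
    'Moses',
    'novo-mega-link.php'
), "\n";
```

Using `strtr` with the whole placeholder map restores all anchors in one pass, so restored link text is never scanned again for later placeholders.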
$lines = explode( "\n", $content );
$lines[0] = str_ireplace( "keyword", "replacement", $lines[0] );
$content = implode( "\n", $lines );
Or, if you explicitly want to use a regular expression:
$lines = explode( "\n", $content );
$lines[0] = preg_replace( "/keyword/i", "replacement", $lines[0] );
$content = implode( "\n", $lines );
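Consider using an HTML parsing library such as simplehtmldom instead of regular expressions. You can use it to update the content of specific HTML tags only (and so ignore the tags you don't want to change). Then you don't need a regular expression at all; once you have filtered for the appropriate tags, a function like str_replace is enough.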
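As a rough illustration of the parser-based idea, the sketch below uses PHP's built-in DOM extension (`DOMDocument` and `DOMXPath`) rather than simplehtmldom, whose API is not shown in this thread. The function name `link_keyword_dom`, the wrapper id `kw-wrap`, and the sample input are assumptions made for the example. Character-encoding handling for non-ASCII input is left out.

```php
<?php
// Hypothetical helper: link $keyword in every text node that has no <a> ancestor.
function link_keyword_dom(string $html, string $keyword, string $url): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // keep parser warnings quiet
    // Wrap the fragment so it can be found again after parsing.
    $doc->loadHTML('<div id="kw-wrap">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $texts = $xpath->query('//div[@id="kw-wrap"]//text()[not(ancestor::a)]');

    // Copy the node list first, because the loop modifies the tree.
    foreach (iterator_to_array($texts) as $text) {
        $parts = explode($keyword, $text->nodeValue);
        if (count($parts) < 2) {
            continue;                   // keyword not in this text node
        }
        $parent = $text->parentNode;
        foreach ($parts as $i => $part) {
            if ($i > 0) {               // a keyword sat between two parts
                $a = $doc->createElement('a');
                $a->setAttribute('href', $url);
                $a->appendChild($doc->createTextNode($keyword));
                $parent->insertBefore($a, $text);
            }
            if ($part !== '') {
                $parent->insertBefore($doc->createTextNode($part), $text);
            }
        }
        $parent->removeChild($text);
    }

    // Serialize only the children of the wrapper element.
    $out  = '';
    $wrap = $xpath->query('//div[@id="kw-wrap"]')->item(0);
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo link_keyword_dom(
    'Moses and <a href="x.html">Moses</a>, <strong>Moses</strong>!',
    'Moses',
    'novo-mega-link.php'
), "\n";
```

Because the keyword is only searched for in text nodes, attribute values and existing anchors are never touched, which is the main advantage over the string-based approaches.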