PHP regex: take a string list, search some content, replace any matches from list with a link

I have a large list of terms like this one (about 1,600 entries, perhaps 2,000 words): http://pastebin.com/6XnWBJwM

I want to search my $content for terms from this list and replace every term found with a link of the form <a href="/glossary/firstinitial/term">term</a>, for example (term: abdomen) <a href="/glossary/a/abdomen">abdomen</a>

What is the most efficient way to do this?

Following this thread, I have been using preg_replace_callback, but I can't get it to work: at the moment it links every word in the content to "/"! My regex skills are poor!

Thanks in advance,

// the list of words
$words = explode("|",$arrayOfWords);
// iterate the array
foreach ($words as $c => $v) {
    // replace the word in the link with the item of the array
    $line = preg_replace("|<a\w+>(.*)</a>|Usi", $v, $string);
}

There are many ways to build a regex and parse with it... all of them valid.

If you are trying to change abdomen into <a href="/glossary/a/abdomen">abdomen</a>, here is a suggestion:

$terms = 'abdomen|etc|parental care';
// this is the string of the terms separated by pipes
$terms = explode('|',$terms);
// split terms into an array (aka $terms)
foreach ($terms as $key => $value) {
    $terms[$key] = preg_replace('/\s\s*/', ' ', strtolower($value));
}
// change each into lowercase and normalize spaces
$str = 'Here\'s some example sentence using abdomen. Abdomen is a funny word and parental care is important.';
foreach ($terms as $term) {
// this will loop all the terms so it may take a while
// this looks like a requirement because you have multi-word terms in your list
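    // preg_quote() escapes any regex metacharacters in the term; $term[0] replaces
    // the $term{0} offset syntax, which was removed in PHP 8
    $str = preg_replace('/\b(' . preg_quote($term, '/') . ')\b/i', '<a href="/glossary/' . $term[0] . '/' . str_replace(' ', '%20', $term) . '">$1</a>', $str);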
    // regardless of case the link is assigned the lowercase version of the term.
    // spaces are replaced by %20's
    // --------------------
    // ------- EDIT -------
    // --------------------
    //   added \b's around the term in regex to prevent, e.g.
    //   'etc' in 'ketchup' from being caught.
}

Edit: see the last comment in the code.
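
Since the question mentions preg_replace_callback, here is a minimal sketch of a single-pass variant, in case looping over every term turns out to be slow or starts matching text inside links inserted by an earlier iteration. It is only a sketch under some assumptions: the function name linkGlossaryTerms is made up for illustration, the terms are assumed to be plain ASCII words or phrases that start and end with a word character (otherwise \b will not behave as expected), and it has not been tried against the full 1,600-entry list. It needs PHP 7.4 or later for the arrow functions.

```php
<?php
// Hypothetical helper (not from the answers above): links every glossary
// term found in $content using one combined regex and one pass.
function linkGlossaryTerms(string $content, array $terms): string
{
    // Normalize each term: collapse whitespace, trim, lowercase.
    $normalized = [];
    foreach ($terms as $term) {
        $term = strtolower(trim(preg_replace('/\s+/', ' ', $term)));
        if ($term !== '') {
            $normalized[] = $term;
        }
    }
    $terms = array_values(array_unique($normalized));
    if ($terms === []) {
        return $content;
    }

    // Longest terms first, so "parental care" wins over a shorter
    // term such as "parental" that starts at the same position.
    usort($terms, fn($a, $b) => strlen($b) <=> strlen($a));

    // Escape each term and join everything into one alternation.
    $alternation = implode('|', array_map(fn($t) => preg_quote($t, '/'), $terms));
    $pattern = '/\b(' . $alternation . ')\b/i';

    $result = preg_replace_callback($pattern, function (array $m): string {
        // The link always uses the lowercase term; the visible text
        // keeps whatever case appeared in the content.
        $slug = strtolower($m[1]);
        $href = '/glossary/' . rawurlencode($slug[0]) . '/' . rawurlencode($slug);
        return '<a href="' . $href . '">' . $m[1] . '</a>';
    }, $content);

    // preg_replace_callback() returns null on a regex error; in that
    // case fall back to the unmodified content.
    return $result ?? $content;
}

echo linkGlossaryTerms('Abdomen and ketchup.', ['abdomen', 'etc']);
// prints: <a href="/glossary/a/abdomen">Abdomen</a> and ketchup.
```

Because the content is scanned only once, the replacement text is never re-scanned, so a short term cannot match inside an href that was just inserted. rawurlencode() is used here instead of the str_replace(' ', '%20', ...) call so that other characters needing escaping are covered too; for terms made only of letters and spaces the resulting URLs are the same.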