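I have a large list of terms, like this one (around 1,600 entries, perhaps 2,000 words): http://pastebin.com/6XnWBJwM
I want to search my $content for the terms in this list and replace each term found with a link of the form <a href="/glossary/firstinitial/term">term</a>, e.g. (term: abdomen) <a href="/glossary/a/abdomen">abdomen</a>.
What is the most efficient way to do this?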
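Following this thread, I have been using preg_replace_callback, but I can't get it to work: at the moment it links every word in the content to "/"! My regex skills are poor!
Thanks in advance,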
// the list of words
$words = explode("|", $arrayOfWords);
// iterate the array
foreach ($words as $c => $v) {
    // replace the word in the link with the item of the array
    $line = preg_replace("|<a\w+>(.*)</a>|Usi", $v, $string);
}
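There are many ways to build such a registry and parse it, and all of them are valid.
If you are trying to turn abdomen into <a href="/glossary/a/abdomen">abdomen</a>, here is a suggestion: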
$terms = 'abdomen|etc|parental care';
// this is the string of the terms separated by pipes
$terms = explode('|',$terms);
// split terms into an array (aka $terms)
foreach ($terms as $key => $value) {
    $terms[$key] = preg_replace('/\s\s*/', ' ', strtolower($value));
}
// change each into lowercase and normalize spaces
$str = 'Here\'s some example sentence using abdomen. Abdomen is a funny word and parental care is important.';
foreach ($terms as $term) {
    // this will loop all the terms so it may take a while
    // this looks like a requirement because you have multi-word terms in your list
    // preg_quote() escapes any regex metacharacters a term might contain
    $str = preg_replace('/\b(' . preg_quote($term, '/') . ')\b/i', '<a href="/glossary/' . $term[0] . '/' . str_replace(' ', '%20', $term) . '">$1</a>', $str);
    // regardless of case the link is assigned the lowercase version of the term.
    // spaces are replaced by %20's
    // --------------------
    // ------- EDIT -------
    // --------------------
    // added \b's around the term in regex to prevent, e.g.
    // 'etc' in 'ketchup' from being caught.
}
Edit: see the last comment in the code.
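Since the question asks about efficiency and mentions preg_replace_callback, here is a rough single-pass sketch as an alternative to looping over the terms one by one. It is only an illustration, not the method from the posts above: the function name linkGlossaryTerms is made up, it assumes $content is plain text rather than existing HTML, and it assumes a combined pattern of roughly 1,600 alternatives stays within PCRE's limits on your setup. Doing everything in one pass also means a shorter term cannot be linked again inside a link that was already inserted for a longer term.

```php
<?php
// Hypothetical helper, not part of the original posts.
// Links every glossary term found in $content in a single regex pass.
function linkGlossaryTerms(string $content, array $terms): string
{
    // Lowercase each term, collapse whitespace runs, drop empties and duplicates.
    $clean = [];
    foreach ($terms as $term) {
        $term = strtolower(trim(preg_replace('/\s+/', ' ', (string) $term)));
        if ($term !== '') {
            $clean[] = $term;
        }
    }
    $clean = array_values(array_unique($clean));
    if ($clean === []) {
        return $content;
    }

    // Longest terms first, so "parental care" is tried before "care".
    usort($clean, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    // Escape regex metacharacters; let a space in a term match any whitespace run.
    $alternatives = array_map(function ($term) {
        return str_replace(' ', '\s+', preg_quote($term, '/'));
    }, $clean);

    // One pattern for all terms, bounded by \b so 'etc' does not match in 'ketchup'.
    $pattern = '/\b(' . implode('|', $alternatives) . ')\b/i';

    $result = preg_replace_callback($pattern, function ($m) {
        // The URL uses the lowercase, space-normalized term; the link text keeps the original case.
        $term = strtolower(preg_replace('/\s+/', ' ', $m[1]));
        $href = '/glossary/' . $term[0] . '/' . rawurlencode($term);
        return '<a href="' . $href . '">' . $m[1] . '</a>';
    }, $content);

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result ?? $content;
}
```

You would call it as linkGlossaryTerms($content, explode('|', $terms)) with the same pipe-separated string as above. rawurlencode() turns spaces into %20, which matches the link format used in the loop version.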