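I'm working on code that uploads HTML and stores it as content, using the first characters of the text as the title and the SEO URL.
However, I'm having trouble generating the title, because I can't get clean plain text out of the HTML string to use as the title and SEO URL.
Here is my code for getting the title from the HTML text: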
$title = getplaintextintrofromhtml($str,100);
$title = str_replace(PHP_EOL, '', $title);
$title = str_replace(" "," ", $title);
$title = str_replace(str_split('\\/:*?"<>|,+=-'), '', $title);
$title = str_replace("'","", $title);
$title = str_replace("<br>","", $title);
$title = str_replace("\n","", $title);
$title = trim($title);
For the SEO URL:
$newurltitle = str_replace(" ", "-", $title);
And the function:
function getplaintextintrofromhtml($html, $numchars) {
    // Remove the HTML tags
    $html = strip_tags($html);
    // Convert HTML entities to single characters
    $html = html_entity_decode($html, ENT_QUOTES, 'UTF-8');
    // Make the string the desired number of characters
    // Note that substr is not good as it counts by bytes and not characters
    $html = mb_substr($html, 0, $numchars, 'UTF-8');
    // Return the truncated plain text (no ellipsis is appended)
    return $html;
}
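Even with the code above, the title still contains newlines, and I can't figure out why. I thought I was getting plain text, but the newlines and similar characters remain, so I can't use the result to build the SEO URL.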
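You can remove newlines, carriage returns, and extra spaces with the following code, which collapses every run of whitespace into a single space:
$title = preg_replace('/\s+/', ' ', $title);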
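To show how the whitespace fix could fit into the whole flow, here is a minimal sketch that folds the question's steps into one function. The name makeTitleAndSlug and its return shape are hypothetical, not from the original post. It also assumes the input is valid UTF-8, and it treats the non-breaking space produced by decoding &nbsp; as whitespace, which may or may not be what you want.

```php
<?php
// Hypothetical helper (not from the original post): builds a plain-text
// title and a URL slug from an HTML string. Assumes valid UTF-8 input.
function makeTitleAndSlug(string $html, int $numChars = 100): array
{
    // Strip tags and decode entities, as getplaintextintrofromhtml() does.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

    // Drop characters that are awkward in titles and URLs
    // (the question's list plus the single quote).
    $text = str_replace(str_split('\\/:*?"<>|,+=-\''), '', $text);

    // Collapse every run of whitespace (\r, \n, tabs, spaces) and
    // non-breaking spaces (U+00A0, from &nbsp;) into one ordinary space.
    $text = preg_replace('/[\s\x{00A0}]+/u', ' ', $text);

    // Truncate by characters rather than bytes, then trim the ends.
    $title = trim(mb_substr($text, 0, $numChars, 'UTF-8'));

    // Build the slug by turning the remaining single spaces into hyphens.
    $slug = str_replace(' ', '-', $title);

    return [$title, $slug];
}
```

Cleaning the whitespace before truncating means the character limit applies to the text that actually ends up in the title. Calling makeTitleAndSlug("<h1>Hello,\r\n  World</h1>") under these assumptions returns the title "Hello World" and the slug "Hello-World".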