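I use several functions in PHP to count words and characters and to estimate reading time. They all share one small "bug": they count everything, including bbCode (and smilies). I don't want that!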
function calculate_readingtime($string) {
    // Assumes a reading speed of 200 words per minute.
    $word = str_word_count(strip_tags($string));
    $m = floor($word / 200);
    $s = floor($word % 200 / (200 / 60));
    $minutes = ($m != 0 ? $m.' min.' : '');
    $seconds = (($m != 0 && $s != 0) ? ' ' : '') . $s.' sec.';
    return $minutes . $seconds;
}
$content = 'This is some text with [b]bbCode[/b]! Oh, so pretty :D And here\'s is a link too: [url="https://example.com/"]das linkish[/url]. What about an image? That\'s pretty to, you know. [img src="https://example.com/image.jpg" size="128" height="128" width="128"] And another one: [img src="https://example.com/image.jpg" height="128"]';
$reading_time = calculate_readingtime($content);
$count_words = str_word_count($content, 1, 'àáãâçêéíîóõôúÀÁÃÂÇÊÉÍÎÓÕÔÚÅåÄäÖö');
$count_chars_with_spaces = mb_strlen($content);
echo 'Reading time: '.$reading_time.'<br>';
echo 'Words: '.count($count_words).'<br>';
echo 'Characters with spaces: '.$count_chars_with_spaces;
# OUTPUT
Reading time: 16 sec.
Words: 55
Characters with spaces: 326
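I want the counters (including the reading time) to be more accurate: they should ignore the bbCode tags themselves but still count the text inside them (for example, count the word bbCode in [b]bbCode[/b]).
How can I accomplish this?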
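Using preg_replace
Stripping BBCode from a string is actually relatively easy, especially in a language like PHP that supports the PCRE library. Making a few assumptions about your BBCode syntax (attribute values are always double-quoted and attributes are separated by a single space), this is the shortest way to do it: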
preg_replace('@\[(?:\w+(?:="(?>.*?"))?(?: \w+="(?>.*?"))*|/\w+)]@s', '', $content);
Demo on Regex101
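To tie this back to the counters in the question, the stripped string can be passed to the same counting functions. The following is a minimal sketch under the syntax assumptions above; `strip_bbcode` is a hypothetical helper name chosen for illustration, and the sample string is shortened from the one in the question.

```php
<?php
// Hypothetical helper (not a PHP built-in): removes BBCode tags
// but keeps the text between opening and closing tags.
function strip_bbcode(string $text): string {
    return preg_replace('@\[(?:\w+(?:="(?>.*?"))?(?: \w+="(?>.*?"))*|/\w+)]@s', '', $text);
}

// Shortened sample input, assumed for illustration only.
$content = 'Some [b]bold[/b] text and a [url="https://example.com/"]link[/url]. '
         . '[img src="https://example.com/image.jpg" height="128"]';

$plain = strip_bbcode($content);

echo $plain, "\n";                                                  // tags removed, inner text kept
echo 'Words: ', str_word_count($plain), "\n";                       // counts only the visible words
echo 'Characters with spaces: ', mb_strlen($plain), "\n";
```

Passing `$plain` instead of `$content` to `calculate_readingtime()` should make the reading time reflect only the visible text as well. Note that smilies such as `:D` are not BBCode tags, so this pattern leaves them in place.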
Or a better approach that pairs opening and closing tags and handles nesting:
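function parse($str) {
    return preg_replace_callback('@\[(\w+)(?:="(?>.*?"))?(?: \w+="(?>.*?"))*](?:(.*?)\[/\1])?@s',
        // Group 2 is missing when a tag has no closing partner (e.g. [img ...]),
        // so check with isset() to avoid an "undefined array key" warning.
        function($matches) { return isset($matches[2]) ? parse($matches[2]) : ''; },
        $str
    );
}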
Demo on Ideone
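As a quick check of the recursive variant, the sketch below runs it on a string with nested tags and an unclosed `[img]` tag. The function is repeated so the snippet runs on its own; the sample input is an assumption made for illustration.

```php
<?php
// Recursive approach from above, repeated so this snippet is self-contained.
function parse(string $str): string {
    return preg_replace_callback(
        '@\[(\w+)(?:="(?>.*?"))?(?: \w+="(?>.*?"))*](?:(.*?)\[/\1])?@s',
        // Keep the inner text of paired tags (recursing into it); drop lone tags.
        function (array $m): string { return isset($m[2]) ? parse($m[2]) : ''; },
        $str
    );
}

// Illustrative input: nested [b]/[i], a [url] with a value, and a lone [img].
$content = '[b]Bold and [i]italic[/i][/b] text with a '
         . '[url="https://example.com/"]link[/url] '
         . '[img src="https://example.com/image.jpg" height="128"]';

$plain = parse($content);

echo $plain, "\n";
echo 'Words: ', str_word_count($plain), "\n";
```

Because the inner match is lazy, a tag nested inside a tag of the same name (for example `[b]a [b]c[/b] d[/b]`) is paired with the first closing tag it finds, which can leave a stray closing tag behind; for input like that, the shorter pattern that removes every tag individually may be the safer choice.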