Truncate text without truncating HTML

This string is 78 characters long including the HTML, and 39 characters long without it:

<p>I really like the <a href="http://google.com">Google</a> search engine.</p>
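
The two counts can be reproduced with a quick sketch like the one below. It assumes byte-based `strlen` is acceptable here, which holds because the sample string is plain ASCII; the variable names are only illustrative.

```php
<?php
// Sample string from the question.
$html = '<p>I really like the <a href="http://google.com">Google</a> search engine.</p>';

// Length including the markup.
$withTags = strlen($html);

// Length of the visible text only; strip_tags() removes the markup.
$withoutTags = strlen(strip_tags($html));

echo $withTags, ' ', $withoutTags, PHP_EOL; // prints "78 39"
```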

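I want to truncate this string based on the non-HTML character count. For example, if I truncate the string above to 24 characters, the output would be: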

I really like the <a href="http://google.com">Google</a>

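The truncation should ignore the HTML when deciding how many characters to cut and consider only the stripped character count. At the same time, it must not leave any HTML tags open.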

OK, this is what I put together, and it seems to work:

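// Note: strlen()/substr() count bytes, so multibyte text may be cut mid-character,
// and void tags such as <br> are treated as open tags and get a closing tag appended.
function truncate_html($string, $length, $postfix = '&hellip;', $isHtml = true) {
    $string = trim($string);
    // Append the postfix only if the visible text is actually longer than $length.
    $postfix = (strlen(strip_tags($string)) > $length) ? $postfix : '';
    $i = 0;     // total length of the tags seen so far (not counted toward $length)
    $tags = []; // stack of currently open tags; change to array() if PHP version < 5.4
    if ($isHtml) {
        // Each match is one tag plus the text that follows it, with byte offsets.
        preg_match_all('/<[^>]+>([^<]*)/', $string, $tagMatches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
        foreach ($tagMatches as $tagMatch) {
            // Stop once this tag starts at or beyond the visible-character limit.
            if ($tagMatch[0][1] - $i >= $length) {
                break;
            }
            // Tag name: everything after '<' up to the first whitespace or '>'.
            $tag = substr(strtok($tagMatch[0][0], " \t\n\r\0\x0B>"), 1);
            if ($tag[0] != '/') {
                $tags[] = $tag;
            }
            elseif (end($tags) == substr($tag, 1)) {
                array_pop($tags);
            }
            // Add this tag's own length to the running markup offset.
            $i += $tagMatch[1][1] - $tagMatch[0][1];
        }
    }
    // Cut at the visible limit plus the skipped markup, then close open tags innermost first.
    $length = min(strlen($string), $length + $i);
    $closing = count($tags) ? '</' . implode('></', array_reverse($tags)) . '>' : '';
    return substr($string, 0, $length) . $closing . $postfix;
}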

Usage:

truncate_html('<p>I really like the <a href="http://google.com">Google</a> search engine.</p>', 24);
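
For the sample input, tracing the function by hand suggests the call returns the string hard-coded below; that value is an assumption worth confirming on your own PHP version. The sketch shows one way to check such a result: drop the postfix, then confirm that the visible text is 24 characters long and that every opened tag is closed. The helper `tags_are_balanced` is hypothetical, not part of the original function, and it only handles simple, non-void tags.

```php
<?php
// Hypothetical helper: returns true if every opening tag has a matching
// closing tag in the right order. Void elements such as <br> are not handled.
function tags_are_balanced(string $html): bool
{
    preg_match_all('/<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>/', $html, $matches, PREG_SET_ORDER);
    $stack = [];
    foreach ($matches as $m) {
        if ($m[1] === '') {
            $stack[] = strtolower($m[2]);   // opening tag: remember it
        } elseif (array_pop($stack) !== strtolower($m[2])) {
            return false;                   // closing tag does not match the last opener
        }
    }
    return count($stack) === 0;             // nothing may remain open
}

// Assumed return value of the call above, traced by hand.
$result = '<p>I really like the <a href="http://google.com">Google</a></p>&hellip;';

// Drop the postfix, then count only the visible characters.
$body = substr($result, 0, -strlen('&hellip;'));
$visible = strlen(strip_tags($body));

var_dump($visible);                 // int(24)
var_dump(tags_are_balanced($body)); // bool(true)
```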

The function was taken, with a small modification, from:

http://www.dzone.com/snippets/truncate-text-preserving-html