How to strip characters from a PHP string without disturbing HTML tags

I have a string that contains HTML tags, as shown below.

$desc = "<p>Lorem <strong>ipsum</strong> dolor sit amet</p><p>Duo at agam maiorum instructior, ut tale quidam ancillae qui, est cu paulo consetetur.</p>";

I want to take the first 10 characters such that:

  1. HTML tags are not counted.
  2. All opened HTML tags are closed properly.

Currently I am using substr:

$result = substr($desc, 0, 10);

Actual result: <p>Lorem <

What I want: <p>Lorem <strong>ipsu</strong></p>
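To make the difference concrete, here is a small sketch. The variable names $cut and $text are only illustrative. substr counts the characters of the tags themselves, while running strip_tags first counts only the visible text but throws the markup away:

```php
<?php
$desc = "<p>Lorem <strong>ipsum</strong> dolor sit amet</p>";

// substr() works on the raw markup, so "<p>" and "<" use up part of the 10.
$cut = substr($desc, 0, 10);
echo $cut, "\n";   // <p>Lorem <

// strip_tags() first gives the right 10 visible characters, without any tags.
$text = substr(strip_tags($desc), 0, 10);
echo $text, "\n";  // Lorem ipsu
```

Neither call alone gives the result I want, which is the second cut with the original tags kept and closed.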

I have already achieved this using some very nice code from here: "How to close unclosed HTML tags?" (kamal's answer).

<?php
$str = "<p>Lorem <strong>ipsum</strong> dolor sit amet</p><p>Duo at agam maiorum instructior, ut tale quidam ancillae qui, est cu paulo consetetur.</p>";
# take the first 10 characters of the tag-free text
$s = strip_tags($str);
$result = substr($s, 0, 10);
# find the last (possibly partial) word of that text in the original markup
$sarr = explode(' ', $result);
$last = end($sarr);
$l = strpos($str, $last);
# keep the markup up to that word, append the word, then close open tags
$r = substr($str, 0, $l);
echo closetags($r . $last);

function closetags($html)
{
    # put all opened tags into an array
    preg_match_all("#<([a-z]+)( .*)?(?!/)>#iU", $html, $result);
    $openedtags = $result[1];
    # put all closed tags into an array
    preg_match_all("#</([a-z]+)>#iU", $html, $result);
    $closedtags = $result[1];
    $len_opened = count($openedtags);
    # all tags are closed
    if (count($closedtags) == $len_opened) {
        return $html;
    }
    $openedtags = array_reverse($openedtags);
    # close tags, innermost first
    for ($i = 0; $i < $len_opened; $i++) {
        if (!in_array($openedtags[$i], $closedtags)) {
            $html .= "</" . $openedtags[$i] . ">";
        } else {
            unset($closedtags[array_search($openedtags[$i], $closedtags)]);
        }
    }
    return $html;
}
?>
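
The code above depends on strpos finding the last word at the right place, so it can cut at the wrong position if that word also appears earlier in the markup, or if the 10-character cut ends on a space. A possible alternative, offered only as a sketch, is to walk the string token by token and count only the text. truncateHtml is a hypothetical helper name, not a PHP built-in, and the sketch assumes reasonably well-formed markup with no comments and no ">" inside attribute values:

```php
<?php
# truncateHtml() is a hypothetical helper, not part of PHP.
# Assumes well-formed markup without comments or ">" in attribute values.
function truncateHtml(string $html, int $limit): string
{
    # void elements never get a closing tag, so they are not tracked
    $void = ['br', 'hr', 'img', 'input', 'meta', 'link'];

    # split into text and tag tokens, keeping the tags themselves
    $tokens = preg_split('/(<[^>]+>)/', $html, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    $out  = '';
    $open = [];      # stack of currently open tag names
    $left = $limit;  # visible characters still allowed

    foreach ($tokens as $token) {
        if (preg_match('/^<(\/?)([a-z][a-z0-9]*)[^>]*>$/i', $token, $m)) {
            $name = strtolower($m[2]);
            if ($m[1] === '/') {
                # closing tag: pop it if it matches the innermost open tag
                if (end($open) === $name) {
                    array_pop($open);
                }
            } elseif (!in_array($name, $void, true) && substr($token, -2) !== '/>') {
                $open[] = $name;
            }
            $out .= $token;  # tags are copied but never counted
            continue;
        }

        # text token: copy as much as the remaining budget allows
        $piece = substr($token, 0, $left);
        $out  .= $piece;
        $left -= strlen($piece);
        if ($left <= 0) {
            break;
        }
    }

    # close whatever is still open, innermost first
    while ($open) {
        $out .= '</' . array_pop($open) . '>';
    }
    return $out;
}

$desc = "<p>Lorem <strong>ipsum</strong> dolor sit amet</p><p>Duo at agam maiorum instructior, ut tale quidam ancillae qui, est cu paulo consetetur.</p>";
echo truncateHtml($desc, 10), "\n";  # <p>Lorem <strong>ipsu</strong></p>
```

This counts bytes, like the substr call in the question. For UTF-8 text, mb_substr and mb_strlen from the mbstring extension could be swapped in so that characters are counted instead of bytes.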