Regex to count non-nested </p> tags

The function below does almost what it says: it inserts an HTML string after the second paragraph tag it finds.

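I need to change it slightly so that it only counts paragraph tags that are not inside other tags. In other words, only top-level paragraph tags.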

Is there a way to do this with a regex?

function my_html_insert($content){
    $InsertAfterParagraph = 2;
    if(substr_count(strtolower($content), '</p>') < $InsertAfterParagraph )
    {
        return $content .= myFunction($my_insert=1);
    }
    else
    {
        $replaced_content = preg_replace_callback('#(<p[\s>].*?</p>\n)#s', 'my_p_callback', $content);
    }
    return $replaced_content;
}

function my_p_callback($matches)
{
    static $count = 0;
    $ret = $matches[1];
    $pCount = get_option('my_p_count');
    if (++$count == $pCount){
        $ret .= myFunction($my_insert=1);
    }
    return $ret;
}

I would still parse it rather than use a regex, because that is cleaner and easier to maintain:

<?php
$doc = new DOMDocument();
$doc->loadHTML("
    <!DOCTYPE html>
    <html>
        <body>
            <p>Test 1</p>
            <div>Test <p>2</p></div>
            <p>Test <span>3</span></p>
        </body>
    </html>
");
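$xpath = new DOMXPath($doc);
// Select only <p> elements that are direct children of <body>,
// so the <p> nested inside the <div> is skipped.
$elements = $xpath->query("/html/body/p");
foreach ($elements as $element) {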
    $node = $doc->createDocumentFragment();
    $node->appendXML('<h1>This is a test</h1>');
    if ($element->nextSibling) {
        $element->parentNode->insertBefore($node, $element->nextSibling);
    } else {
        $element->parentNode->appendChild($node);
    }
}
echo $doc->saveHTML();
?>

Output:

<!DOCTYPE html>
<html>
    <body>
        <p>Test 1</p><h1>This is a test</h1>
        <div>Test <p>2</p></div>
        <p>Test <span>3</span></p><h1>This is a test</h1>
    </body>
</html>
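
Note that the loop above inserts after every top-level paragraph, while the question asks for an insert after the second one only. Below is a minimal sketch of how that could look. The function name insert_after_nth_top_level_p and its parameters are hypothetical, not from the original code, and it assumes that $content is an HTML fragment (so it gets wrapped in a skeleton document before parsing) and that $insertHtml is well-formed XML, which appendXML requires. Like the original function, it falls back to appending at the end when there are fewer than $n top-level paragraphs.

```php
<?php
// Hypothetical helper: insert $insertHtml after the $n-th <p> that is a
// direct child of <body>. Nested <p> elements are not counted.
function insert_after_nth_top_level_p(string $content, string $insertHtml, int $n): string
{
    $doc = new DOMDocument();

    // Real-world markup is often imperfect; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    // Assumption: $content is a fragment, so wrap it in a minimal document.
    $doc->loadHTML('<!DOCTYPE html><html><body>' . $content . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $paragraphs = $xpath->query('/html/body/p');

    // Fewer than $n top-level paragraphs: append at the end instead.
    if ($n < 1 || $paragraphs->length < $n) {
        return $content . $insertHtml;
    }

    $fragment = $doc->createDocumentFragment();
    // Assumption: $insertHtml is well-formed XML; otherwise fall back.
    if (!$fragment->appendXML($insertHtml)) {
        return $content . $insertHtml;
    }

    $target = $paragraphs->item($n - 1);
    // A null reference node makes insertBefore behave like appendChild.
    $target->parentNode->insertBefore($fragment, $target->nextSibling);

    // Serialize only the children of <body>, dropping the wrapper document.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Example call: the <p> inside the <div> is not counted.
echo insert_after_nth_top_level_p(
    '<p>One</p><div><p>Nested</p></div><p>Two</p><p>Three</p>',
    '<h1>Inserted</h1>',
    2
);
```

In my_html_insert this would replace both the substr_count check and the preg_replace_callback call. The serializer may add or drop whitespace between elements, so do not rely on the output being byte-for-byte identical to the input around the inserted markup.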