Turn uppercase H1, H2,... tags into capitalized title with PHP

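I want to turn uppercase h1, h2, ... tags into capitalized text with PHP. I'm close, but not quite there. The snippet below does not convert the first character of "LOREM" to uppercase (probably because it tries to uppercase the "<"). Changing the PHP callback function would be easy, but I was hoping I could do this by modifying only the regex part: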

$var = "
<h1>LOREM IPSUM DOLORES AMET</h1>
THIS IS SOME TEXT
<H2>LOREM IPSUM DOLORES AMET</H2>";
$line = preg_replace_callback(
    '/<h[1-9]>(.*)\>/i',
    function ($matches) {
        return ucfirst(strtolower($matches[0]));
    },
    $var
);
print($line);

Result:

<h1>lorem ipsum dolores amet</h1>
THIS IS SOME TEXT
<H2>lorem ipsum dolores amet</H2>

Expected output:

<h1>Lorem ipsum dolores amet</h1>
THIS IS SOME TEXT
<H2>Lorem ipsum dolores amet</H2>

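You are returning the whole match with $matches[0], which includes the opening tag. In this case, use lookarounds so that only the heading text is matched.

I suggest using a capturing group in the opening <h...> tag so that you can use it as a backreference; that way the closing tag must carry the same number the group matched.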

$text = preg_replace_callback('~<h([1-9])>\K[^<]++(?=</h\1>)~i', 
      function($m) {
         return ucfirst(strtolower($m[0]));
      }, $text);
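
To see what the backreference adds, here is a minimal sketch. The helper name capitalizeHeadings and the sample input with a mismatched closing tag are made up for illustration; only the pattern comes from the answer above.

```php
<?php
// Hypothetical helper (not part of the answer): wraps the pattern shown above.
function capitalizeHeadings(string $html): string
{
    return preg_replace_callback(
        '~<h([1-9])>\K[^<]++(?=</h\1>)~i',
        function ($m) {
            // $m[0] holds only the heading text, because \K dropped the opening tag
            return ucfirst(strtolower($m[0]));
        },
        $html
    );
}

// The third heading closes with </h4> instead of </h3>, so \1 does not match
// and that heading is left untouched.
echo capitalizeHeadings(
    "<h1>LOREM IPSUM</h1>\n<H2>DOLORES AMET</H2>\n<h3>NOT CLOSED PROPERLY</h4>"
);
```

With this input the first two headings should come back capitalized while the mismatched one stays as it was. Note that [^<]++ stops at the first "<", so a heading containing nested tags would not be matched at all.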

While you can do this with a regex, I would recommend using the DOM for this:

// loadHTML() is an instance method; calling it statically fails on PHP 8
$doc = new DOMDocument();
$doc->loadHTML('
    <h1>LOREM IPSUM DOLORES AMET</h1>
    THIS IS SOME TEXT
    <H2>LOREM IPSUM DOLORES AMET</H2>
');
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//h1|//h2|//h3|//h4|//h5|//h6');
foreach ($nodes as $node) {
  $node->nodeValue = ucfirst(strtolower($node->nodeValue));
}
echo $doc->saveHTML(); 

Using DOMDocument:

<?php
        $var = "
<h1>LOREM IPSUM DOLORES AMET</h1>
THIS IS SOME TEXT
<H2>LOREM IPSUM DOLORES AMET</H2>";
        $dom = new DOMDocument();
        $dom->loadHTML($var);
        $tags = array("h1", "h2");
        //loop thru all h1 and h2 tags
        foreach ($tags as $tag) {
            //get all elements of the current tag
            $elements = $dom->getElementsByTagName($tag);
            //a DOMNodeList object is never "empty", so check its length instead
            if ($elements->length > 0) {
                //loop thru each element of the given tag
                foreach ($elements as $element) {
                    //run ucfirst on the nodevalue
                    //which is equivalent to the "textContent" property of a DOM node
                $element->nodeValue = ucfirst(strtolower($element->nodeValue));
                }
            }
        }
$html = $dom->saveHTML();
//remove extra markup
$html = str_replace("</body></html>","",substr($html,strpos($html,"<h1>"));
echo $html;

Output:

<h1>Lorem ipsum dolores amet</h1>
THIS IS SOME TEXT
<h2>Lorem ipsum dolores amet</h2>

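Use $matches[1], not $matches[0]. $matches[0] refers to the whole match (that is, ucfirst and strtolower are applied to the entire match, tag included), whereas $matches[1] refers to the characters captured by group 1. Because the regex includes <h[1-9]>, it also matches the opening <h> tag. In the replacement, however, we return only group 1, as in ucfirst(strtolower($matches[1])), so the opening <h> tag is removed. See the following example.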

$var = "
<h1>LOREM IPSUM DOLORES AMET</h1>
THIS IS SOME TEXT
<H2>LOREM IPSUM DOLORES AMET</H2>";
$line = preg_replace_callback(
    '/<h[1-9]>(.*)\>/i',
    function ($matches) {
        return ucfirst(strtolower($matches[1]));
    },
    $var
);
print($line);

Output:

Lorem ipsum dolores amet</h1
THIS IS SOME TEXT
Lorem ipsum dolores amet</h2

But the above also replaces the opening <h1> tag. So I recommend the following, which applies strtolower and ucfirst only to the part inside the <h> tags.

$var = "
<h1>LOREM IPSUM DOLORES AMET</h1>
THIS IS SOME TEXT
<H2>LOREM IPSUM DOLORES AMET</H2>";
$line = preg_replace_callback(
        '/<h[1-9]>\K.*?(?=<)/i',
        function ($matches) {
            return ucfirst(strtolower($matches[0]));
        },
        $var
);
print($line);

Output:

<h1>Lorem ipsum dolores amet</h1>
THIS IS SOME TEXT
<H2>Lorem ipsum dolores amet</H2>

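\K discards the previously matched characters from the final match. .*? does a non-greedy match of any character zero or more times, and the lookahead (?=<) makes it stop right before the next literal < symbol.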
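
As a small illustration of the effect of \K, the sketch below runs the same pattern with and without it on a single heading. The variable names $subject, $withTag and $textOnly are made up for this example.

```php
<?php
$subject = '<h1>LOREM IPSUM</h1>';

// Without \K the opening tag is part of the overall match.
preg_match('/<h[1-9]>.*?(?=<)/i', $subject, $withTag);

// With \K everything matched before it is dropped from $matches[0].
preg_match('/<h[1-9]>\K.*?(?=<)/i', $subject, $textOnly);

echo $withTag[0], "\n";  // <h1>LOREM IPSUM
echo $textOnly[0], "\n"; // LOREM IPSUM
```

Since the callback receives only the heading text in $matches[0], ucfirst acts on the "L" rather than on the "<".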

No regex needed. Obligatory link. Don't use regex to parse HTML. Ever.

<?php
$HTMLString = <<<HTML
<h1>lorem ipsum dolores amet</h1>
THIS IS SOME TEXT
<h2>lorem ipsum dolores amet</h2>
HTML;
$doc = new DOMDocument();
$doc->loadHTML($HTMLString);
//You can also use XPath. Loop over the results of this instead:
//$xpath = new DOMXPath($doc);
//$nodeList = $xpath->query('//h2');
$nodeList = $doc->getElementsByTagName('h2');
foreach ($nodeList as $node) {
    $stringArray = explode(' ', $node->nodeValue);
    $stringArray[0] = ucfirst($stringArray[0]);
    $capitalizedSentence = implode(' ', $stringArray);
    echo $capitalizedSentence;
}

From:

lorem ipsum dolores amet

To:

Lorem ipsum dolores amet
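
The demo above only echoes the capitalized text of each h2. If the goal is to write the result back into the markup for every heading level and for uppercase input, one possible sketch looks like this. It uses the question's sample input; the variable names are made up, and it is an illustration rather than a drop-in solution.

```php
<?php
$html = "<h1>LOREM IPSUM DOLORES AMET</h1>\n"
      . "THIS IS SOME TEXT\n"
      . "<H2>LOREM IPSUM DOLORES AMET</H2>";

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);
$headings = [];
foreach ($xpath->query('//h1|//h2|//h3|//h4|//h5|//h6') as $node) {
    // Assigning nodeValue replaces the element's children with plain text,
    // so any markup nested inside the heading would be lost.
    $node->nodeValue = ucfirst(strtolower($node->nodeValue));
    // saveHTML($node) serializes just this element, without the
    // <html>/<body> wrapper that loadHTML() adds around the fragment.
    $headings[] = $doc->saveHTML($node);
}

echo implode("\n", $headings), "\n";
```

The HTML parser normalizes tag names to lowercase, so the <H2> from the input comes back as <h2>.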