Non-greedy regex

Keywords: regex, greedy | Updated: 2023-09-27

I need to extract the values of some tags that sit inside a comment in a PHP file, like this:

php code
/* this is a comment
!-
<titulo>titulo3</titulo>
<funcion>
   <descripcion>esta es la descripcion de la funcion 6</descripcion>
</funcion>
<funcion>
   <descripcion>esta es la descripcion de la funcion 7</descripcion>
</funcion>
<otros>
   <descripcion>comentario de otros 2a hoja</descripcion>
</otros>
-!
*/
some php code

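As you can see, the file contains line breaks and repeated tags such as <funcion></funcion>, and I need to get every one of them, so I am trying something like this:

preg_match_all("/(<funcion>)(.*)(<\/funcion>)/s",$file,$matches);

This example works across line breaks, but it is too greedy, so I searched around and found these two solutions:

preg_match_all("/(<funcion>)(.*?)(<\/funcion>)/s",$file,$matches);
preg_match_all("/(<funcion>)(.*)(<\/funcion>)/sU",$file,$matches);

But neither of them works for me, and I don't know why.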

The expression from the question:

preg_match_all("/(<funcion>)(.*?)(<\/funcion>)/s", $file, $matches);
print_r($matches);

This will work, but only if $file is a string containing the XML; if it is a file name, you have to read the contents first:

preg_match_all("/(<funcion>)(.*?)(<\/funcion>)/s", file_get_contents($file), $matches);
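
To see why the lazy quantifier matters here, the following is a minimal sketch that runs the greedy, lazy, and U-modifier variants against the same input. The $xml string is a made-up stand-in for the contents of $file, and the capture groups around the tags are dropped for brevity.

```php
<?php
// Hypothetical sample input standing in for the contents of $file.
$xml = "<funcion>\n   <descripcion>a</descripcion>\n</funcion>\n"
     . "<funcion>\n   <descripcion>b</descripcion>\n</funcion>\n";

// Greedy: .* runs to the LAST </funcion>, so both blocks come back as one match.
preg_match_all('/<funcion>(.*)<\/funcion>/s', $xml, $greedy);

// Lazy: .*? stops at the FIRST </funcion> it can reach, giving one match per block.
preg_match_all('/<funcion>(.*?)<\/funcion>/s', $xml, $lazy);

// The U modifier inverts greediness, so a plain .* behaves like .*? here.
preg_match_all('/<funcion>(.*)<\/funcion>/sU', $xml, $ungreedy);

echo count($greedy[1]), ' ', count($lazy[1]), ' ', count($ungreedy[1]), "\n";
```

This prints 1 2 2: the greedy pattern swallows both blocks in a single match, while the other two return one match per <funcion> block.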

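Also, keep in mind that PCRE enforces a backtracking limit (pcre.backtrack_limit in PHP), which non-greedy patterns can run into on large inputs.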
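
When that limit is hit, preg_match_all() returns false instead of a match count, which is easy to mistake for "no matches". A minimal sketch of checking for it follows; the helper name match_funcion_blocks is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: returns the inner text of every <funcion> block,
// or throws if the regex engine failed instead of silently returning nothing.
function match_funcion_blocks(string $text): array
{
    $count = preg_match_all('/<funcion>(.*?)<\/funcion>/s', $text, $matches);
    if ($count === false) {
        // preg_last_error() reports why the engine gave up.
        $reason = preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR
            ? 'backtrack limit reached (pcre.backtrack_limit=' . ini_get('pcre.backtrack_limit') . ')'
            : 'PCRE error code ' . preg_last_error();
        throw new RuntimeException($reason);
    }
    return $matches[1];
}

$blocks = match_funcion_blocks("<funcion>uno</funcion><funcion>dos</funcion>");
print_r($blocks);
```

If the limit turns out to be the problem, it can be raised with ini_set('pcre.backtrack_limit', ...), though rewriting the pattern is usually the better fix.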

Try this:

 /<funcion>((.|\n)*?)<\/funcion>/i

For example:

$string = "<titulo>titulo3</titulo>
<funcion>
   <descripcion>esta es la descripcion de la funcion 6</descripcion>
</funcion>
<funcion>
   <descripcion>esta es la descripcion de la funcion 7</descripcion>
</funcion>
<otros>
   <descripcion>comentario de otros 2a hoja</descripcion>
</otros>";
$result=preg_match_all('/<funcion>((.|\n)*?)<\/funcion>/i', $string,$m);
print_r($m[0]);

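This outputs:

Array
(
    [0] => <funcion>
   <descripcion>esta es la descripcion de la funcion 6</descripcion>
</funcion>
    [1] => <funcion>
   <descripcion>esta es la descripcion de la funcion 7</descripcion>
</funcion>
)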

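If the structure is always exactly like this (the content is always indented inside the tag), you can simply use /\n[\s]+([^\n]+(\n[\s]+)*)\n/.

I always tend to avoid the "lazy" ("non-greedy") modifier. It looks like a hack, and it is not available everywhere, nor implemented the same way everywhere. Since you don't seem to need it in this case, I'd suggest not using it.

Try this:

$regexp = '/<funcion>\n[\s]+([^\n]+(\n[\s]+)*)\n<\/funcion>/';
$works = preg_match_all($regexp, $file, $matches);
echo '<pre>';
print_r($matches);

The $matches[1] array will give you the contents of each <funcion> tag.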

Of course, it is best to pre-filter the content and apply the regex only to the comment body, to avoid any false matches.
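
One way to do that pre-filtering, sketched under the assumption that the metadata always sits between the !- and -! markers shown in the question. The $source string is made-up sample input, and the pattern uses a non-capturing group (?:...) where the answer above uses a capturing one.

```php
<?php
// Hypothetical file contents: one metadata block inside the comment,
// plus a look-alike <funcion> in ordinary code that must NOT be picked up.
$source = <<<'PHP'
/* this is a comment
!-
<funcion>
   <descripcion>uno</descripcion>
</funcion>
-!
*/
$html = "<funcion>
   not metadata
</funcion>";
PHP;

// Cut out only the text between the !- and -! markers.
$start = strpos($source, '!-');
$end   = $start === false ? false : strpos($source, '-!', $start + 2);
$block = ($start !== false && $end !== false)
    ? substr($source, $start + 2, $end - $start - 2)
    : '';

// Apply the indentation-based pattern to the extracted block only.
preg_match_all('/<funcion>\n\s+([^\n]+(?:\n\s+)*)\n<\/funcion>/', $block, $matches);
print_r($matches[1]);
```

Here $matches[1] holds only the <descripcion> line from inside the comment; the look-alike tag in the code below the comment is never seen by the regex.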

Have fun.

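Try using [\s\S] instead of . (it means every whitespace and non-whitespace character, so it also matches line breaks). Also, there is no need to put <funcion> and </funcion> in capture groups.

/<funcion>([\s\S]*?)<\/funcion>/s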

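Also remember that the best way to parse XML is with an XML parser. Even if the file as a whole is not an XML document, as you mentioned in the comments, extract the part that needs parsing and run it through an XML parser.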
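
A minimal sketch of that approach with PHP's SimpleXML extension, assuming the fragment has already been cut out of the comment. The <root> wrapper element is made up here; it is needed only because the fragment has several top-level elements.

```php
<?php
// Hypothetical fragment, as it would look once extracted from the comment.
$fragment = <<<'XML'
<titulo>titulo3</titulo>
<funcion>
   <descripcion>esta es la descripcion de la funcion 6</descripcion>
</funcion>
<funcion>
   <descripcion>esta es la descripcion de la funcion 7</descripcion>
</funcion>
XML;

// Wrap the fragment in a made-up <root> element so it becomes
// a well-formed document with a single top-level element.
$doc = simplexml_load_string('<root>' . $fragment . '</root>');
if ($doc === false) {
    throw new RuntimeException('fragment is not well-formed XML');
}

// Collect the text of every <descripcion> that sits inside a <funcion>.
$descriptions = [];
foreach ($doc->funcion as $funcion) {
    $descriptions[] = trim((string) $funcion->descripcion);
}
print_r($descriptions);
```

Unlike the regex versions, this does not depend on indentation or line breaks, and it fails loudly if the fragment is malformed.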