Problem with multiple matches on same line in simple regular expression

I have a very basic question about regular expressions. I am trying to match and replace URLs like the following:

http://mydomain.com/image/13/imagetitle.html

For this, I use the following expression:

/mydomain.com(.*)image\/(\d+)\/(.*).html/

This pattern basically works, but it fails when several references appear on the same line. So this works:

This is my own image: http://mydomain.com/image/13/imagetitle.html

It also works when there are multiple occurrences spread across separate lines:

This is my own image: http://mydomain.com/image/13/imagetitle.html
Yet I recommend this one as well: image: http://mydomain.com/image/15/imagetitle2.html

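Both references are matched and replaced correctly. However, when two occurrences are on the same line, only the first match is replaced: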

This is my own image: http://mydomain.com/image/13/imagetitle.html, yet I recommend this one as well: image: http://mydomain.com/image/15/imagetitle2.html

How can I make sure that all matches are replaced, regardless of newlines?

I can't reproduce this problem either. But judging from the regular expression, your problem is most likely greediness.

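(.*) matches as much as possible. If two URLs are on the same line, a single match runs from the first mydomain.com to the last .html and swallows both URLs at once. Because . does not match a newline by default, this cannot happen across lines, which is why URLs on separate lines are replaced individually. So you would typically use (.*?) instead, or apply the ungreedy /U flag.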
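
The difference between the greedy and the lazy quantifier can be seen in a small sketch. The variable names and the "[IMG $2]" replacement text are made up for illustration, and the dots are escaped here so that they match literally, which the pattern in the question does not do:

```php
<?php
// Sample line from the question: two URLs on the same line.
$line = 'This is my own image: http://mydomain.com/image/13/imagetitle.html, '
      . 'yet I recommend this one as well: image: http://mydomain.com/image/15/imagetitle2.html';

// Greedy: (.*) grabs as much of the line as it can, so one match covers both URLs.
$greedy = '/mydomain\.com(.*)image\/(\d+)\/(.*)\.html/';
// Lazy: (.*?) grabs as little as it can, so each URL is matched on its own.
$lazy   = '/mydomain\.com(.*?)image\/(\d+)\/(.*?)\.html/';

// "[IMG $2]" is a hypothetical replacement; $2 is the captured image number.
$greedyResult = preg_replace($greedy, '[IMG $2]', $line);
$lazyResult   = preg_replace($lazy, '[IMG $2]', $line);

echo $greedyResult, "\n";
// This is my own image: http://[IMG 15]
echo $lazyResult, "\n";
// This is my own image: http://[IMG 13], yet I recommend this one as well: image: http://[IMG 15]
```

Adding the U modifier to the first pattern (writing /U after the closing delimiter) should give the same result as the lazy version, since it inverts the greediness of the quantifiers.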

But in your case, I would suggest simply making the match more specific:

/mydomain.com(\S*)image\/(\d+)\/(\S*).html/

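Here, \S matches only non-whitespace characters, since whitespace is surely where a URL should end. As an alternative, you could use a more specific character class such as ([\w/.?&#%=-]*) in place of .*? (with / as the regex delimiter, the / inside the class has to be escaped as \/).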
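
As a rough sketch of the \S* variant, the following replaces both URLs on one line and reports the number of replacements through the optional count argument of preg_replace. The replacement text "[image #$2: $3]" and the variable names are assumptions chosen only for illustration:

```php
<?php
// Sample line from the question: two URLs on the same line.
$line = 'This is my own image: http://mydomain.com/image/13/imagetitle.html, '
      . 'yet I recommend this one as well: image: http://mydomain.com/image/15/imagetitle2.html';

// \S* cannot cross whitespace, so a match cannot spill into the next URL
// as long as the URLs are separated by at least one space.
$pattern = '/mydomain\.com(\S*)image\/(\d+)\/(\S*)\.html/';

// Hypothetical replacement: $2 is the image number, $3 the title part.
$result = preg_replace($pattern, '[image #$2: $3]', $line, -1, $count);

echo $result, "\n";
// This is my own image: http://[image #13: imagetitle], yet I recommend this one as well: image: http://[image #15: imagetitle2]
echo $count, "\n";
// 2
```

Note that \S* is still greedy: two URLs joined without any whitespace between them (for example separated only by a comma) would again be consumed by a single match.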

Your pattern works. I tested it with the following code:

$data = "This1 is my own image: http://mydomain.com/image/13/imagetitle.html, yet I recommend this one as well: image: http://mydomain.com/image/15/imagetitle2.html
This2 is my own image: http://mydomain.com/image/13/imagetitle.html, yet I recommend this one as well: image: http://mydomain.com/image/15/imagetitle2.html
This3 is my own image: http://mydomain.com/image/13/imagetitle.html, yet I recommend this one as well: image: http://mydomain.com/image/15/imagetitle2.html
This4 is my own image: http://mydomain.com/image/13/imagetitle.html, yet I recommend this one as well: image: http://mydomain.com/image/15/imagetitle2.html
";
echo preg_replace('/mydomain.com(.*)image\/(\d+)\/(.*).html/', 'replaced one', $data);
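
One way to check what such a test actually does is to count the replacements. The sketch below uses a shortened stand-in for the test data (two lines with two URLs each) and compares the pattern from the question with a lazy variant; the variable names are illustrative only:

```php
<?php
// Shortened stand-in for the test data: two lines, each containing two URLs.
$line = "own: http://mydomain.com/image/13/imagetitle.html, "
      . "other: http://mydomain.com/image/15/imagetitle2.html\n";
$data = $line . $line;

$greedy = '/mydomain.com(.*)image\/(\d+)\/(.*).html/';   // pattern from the question
$lazy   = '/mydomain.com(.*?)image\/(\d+)\/(.*?).html/'; // lazy variant

// The fifth argument receives the number of replacements performed.
preg_replace($greedy, 'replaced one', $data, -1, $greedyCount);
preg_replace($lazy, 'replaced one', $data, -1, $lazyCount);

echo "greedy: $greedyCount, lazy: $lazyCount\n";
// greedy: 2, lazy: 4
```

With the greedy pattern there is one replacement per line, not one per URL, which matches the behaviour described in the question.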