Regex match images but not inside img tag

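I have a function that converts all external image links in a string into img tags. It works well, but it also matches links that are already inside <img> tags.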

For example:

$text = '<p>lorem ipsum http://example.jpg <img src="example.jpg"></p>';
echo make_clickable($text);
function make_clickable($text) {
    $replace = '<p class="update-image"><a href="$0" target="_blank"><img src="$0"></a></p>';
    $text = preg_replace('~https?://[^/\s]+/\S+\.(jpg|png|gif)~i', $replace, $text );
    return $text;
}

This matches both the URL in the plain text and the one in src. Is there a way to exclude img tags?

You can use some lesser-known regex features, namely the PCRE backtracking control verbs:

<img[^>]*>(*SKIP)(*FAIL)|https?://[^/\s]+/\S+\.(?:jpe?g|png|gif)

Let's break the pattern down:

<img                # match a literal <img
[^>]*               # match anything except > zero or more times
>                   # match a literal >
(*SKIP)(*FAIL)      # discard what was matched so far and fail, so the tag is skipped
|                   # or
https?              # match http or https
://                 # match a literal ://
[^/\s]+             # match anything except white-space and forward slash one or more times
/                   # match a literal /
\S+                 # match a non-white-space one or more times
\.                  # match a literal dot
(?:jpe?g|png|gif)   # match jpg, jpeg, png, gif
                    # Don't forget to set the i modifier :)

The idea is to match each img tag and skip it, while still matching all the other URIs.
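Plugged back into the function from the question, the pattern could look like the sketch below. The sample input (with the example.com URLs) is made up for illustration and is not from the question; it is only there to show that the bare URL gets wrapped while the existing img tag is left alone.

```php
<?php
// Sketch: wrap bare image URLs in markup while leaving existing <img> tags alone.
function make_clickable(string $text): string
{
    // Left alternative: consume a whole <img ...> tag, then (*SKIP)(*FAIL)
    // throws that match away and keeps the engine from retrying inside the tag.
    // Right alternative: the image URL we actually want to replace.
    $pattern = '~<img[^>]*>(*SKIP)(*FAIL)|https?://[^/\s]+/\S+\.(?:jpe?g|png|gif)~i';
    $replace = '<p class="update-image"><a href="$0" target="_blank"><img src="$0"></a></p>';

    return preg_replace($pattern, $replace, $text);
}

// Hypothetical sample input: one bare URL and one URL already inside an <img> tag.
$text = '<p>lorem ipsum http://example.com/a.jpg <img src="http://example.com/b.jpg"></p>';
$result = make_clickable($text);
echo $result, PHP_EOL;
```

Only the first URL should end up inside the update-image wrapper. Note that the pattern still requires a host followed by a path, so something like http://example.jpg on its own would not match.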


$replace = '<p class="update-image"><a href="$1" target="_blank"><img src="$1"></a></p>';
$text = preg_replace('~(https?://[^/\s]+/\S+\.(jpg|png|gif))(?:\s|$|\n)~i', $replace, $text );

EDIT: Your regex didn't seem to match anything in my tests, but the part I added at the end is (?:\s|$|\n). I hope you get the idea.
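To see what the trailing (?:\s|$|\n) does, here is a small sketch run against a made-up input (the example.com URLs are assumptions for illustration only). It relies on the src value being quoted: the URL inside the tag is followed by a double quote rather than whitespace, so it is not matched.

```php
<?php
// Sketch: the URL must be followed by whitespace or the end of the string.
$replace = '<p class="update-image"><a href="$1" target="_blank"><img src="$1"></a></p>';
$pattern = '~(https?://[^/\s]+/\S+\.(jpg|png|gif))(?:\s|$|\n)~i';

// Hypothetical sample input: a bare URL followed by a space, then a quoted src URL.
$text = 'see http://example.com/a.jpg <img src="http://example.com/b.jpg">';
$result = preg_replace($pattern, $replace, $text);
echo $result, PHP_EOL;
```

One side effect worth knowing about: the whitespace after the URL is part of the match, so it is consumed by the replacement. Swapping the trailing group for a lookahead such as (?=\s|$) would probably be the way to keep it, and an unquoted src attribute would still be matched by this approach.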