PHP/Regex: How to replace all percent (%) characters in a URL after the domain?

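I'm stuck trying to write a regular expression for preg_replace that replaces every percent (%) character with "_" in the part of a URL that comes after the domain (the path).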

Example:

This is c%ontent with 1 url her%e http://example.com/this%is%image.jpg
and 1 url here http://anotherexample.com/t%his%is%image2.jpg

Result:

This is c%ontent with 1 url her%e http://example.com/this_is_image.jpg
and 1 url here http://anotherexample.com/t_his_is_image2.jpg

My question is: how can I do this with preg_replace?

All I have so far is a regex that selects the domain inside an img tag:

/<img [^>]*src="([^"]+example\.com\/[^"]+)"[^>]*>/

If you are doing regex-based replacements, I suggest using preg_replace_callback:

http://php.net/manual/en/function.preg-replace-callback.php

Keep in mind that replacing % in URLs can be dangerous, because a URL may contain valid % characters, as in http://foo.bar/here%20/index.html, where %20 is an encoded space.
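One way to reduce that risk, as a minimal sketch: replace only a % that is not followed by two hexadecimal digits, so escapes such as %20 survive. The function name replaceStrayPercents is hypothetical, and the URL pattern (http(s):// followed by non-whitespace) is a deliberately simple assumption.

```php
<?php
// Hypothetical helper: replace only "stray" % characters inside URLs,
// leaving percent-escapes such as %20 untouched.
function replaceStrayPercents(string $text): string
{
    // Assumption: a URL is http(s):// followed by a run of non-whitespace.
    return preg_replace_callback('#https?://\S+#', function (array $m): string {
        // The lookahead skips any % that is followed by two hex digits.
        return preg_replace('/%(?![0-9A-Fa-f]{2})/', '_', $m[0]);
    }, $text);
}

echo replaceStrayPercents('see http://foo.bar/here%20/t%his.jpg and 5%');
```

This heuristic is ambiguous by nature: a stray % that happens to be followed by two hex digits (for example this%add.jpg) is treated as a valid escape and left alone.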

Example:

$haystack = 'This is c%ontent with 1 url her%e http://example.com/this%is%image.jpg 
and 1 url here http://anotherexample.com/t%his%is%image2.jpg';
// substitute your favourite URL regex here
$urlRegex = '#\bhttps?://[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#';
$haystack = preg_replace_callback($urlRegex, function($url){
    return str_replace('%', '_', $url[0]);
}, $haystack);
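The callback above replaces % anywhere in the matched URL. If the replacement should be limited strictly to the part after the domain, as the question asks, one option is to capture the scheme and host separately from the path. A minimal sketch, assuming URLs are delimited by whitespace; the function name replacePercentInPath is hypothetical:

```php
<?php
// Hypothetical helper: replace % with _ only in the path of each URL.
function replacePercentInPath(string $text): string
{
    // Group 1: scheme and host (everything up to the first "/").
    // Group 2: the path after the domain, up to the next whitespace.
    $pattern = '#\b(https?://[^/\s]+)(/\S*)#';

    return preg_replace_callback($pattern, function (array $m): string {
        // Leave the host untouched and rewrite only the path.
        return $m[1] . str_replace('%', '_', $m[2]);
    }, $text);
}

$haystack = "This is c%ontent with 1 url her%e http://example.com/this%is%image.jpg\n"
          . "and 1 url here http://anotherexample.com/t%his%is%image2.jpg";

echo replacePercentInPath($haystack);
```

The % characters in the surrounding text (c%ontent, her%e) are left alone because they fall outside the match.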

You can match the URLs in the string with a simple regex:

// $subject is the string    
preg_match_all('/http:\/\/[^\s]+/', $subject, $matches);

Then loop over the matches, replace % with _ in each URL, and substitute the result back into the original $subject:

// $matches[0] holds the full-pattern matches, one entry per URL
foreach ($matches[0] as $match) {
    $search = $match;
    $replace = str_replace('%', '_', $match);
    $subject = str_replace($search, $replace, $subject);
}
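One caveat with replacing the URLs one at a time: if one matched URL is a prefix of another, the first replacement changes the second URL before its own search string is tried. A possible workaround, sketched here under the assumption that https URLs should be matched as well, is to build a search/replace map and apply it in one pass with strtr(). The sample $subject is made up for illustration.

```php
<?php
// Made-up input where the first URL is a prefix of the second.
$subject = 'a http://a.com/x%y b http://a.com/x%y%z';

preg_match_all('#https?://\S+#', $subject, $matches);

// Build a "URL => cleaned URL" map from the full matches.
$map = [];
foreach ($matches[0] as $url) {
    $map[$url] = str_replace('%', '_', $url);
}

// strtr() tries the longest key first and never rescans replaced text.
$result = strtr($subject, $map);
echo $result;
```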

How about using dirname(), basename(), and str_replace() like this:

$haystack = 'This is c%ontent with 1 url her%e http://example.com/this%is%image.jpg';
$result = dirname($haystack) . '/' . str_replace('%','_',basename($haystack));
echo $result;

Result:

This is c%ontent with 1 url her%e http://example.com/this_is_image.jpg

This will be much more efficient than using preg_replace() and a regular expression.

Update:

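As ins0 pointed out, the answer above relies on the string containing only a single URL, located at the end. That is not very flexible. Here is another idea based on what I posted above: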

$haystack = 'This is c%ontent with 1 url her%e http://example.com/this%is%image.jpg 
and 1 url here http://anotherexample.com/t%his%is%image2.jpg';
$parts = explode(' ',$haystack);
foreach ($parts as &$part) {
    if (strpos($part,'http://') !== false || strpos($part,'https://') !== false) {
        $part = dirname($part) . '/'. str_replace('%','_',basename($part));
    }
}
$haystack = implode(' ',$parts);
echo $haystack;

Result:

This is c%ontent with 1 url her%e http://example.com/this_is_image.jpg
and 1 url here http://anotherexample.com/t_his_is_image2.jpg
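Note that explode(' ', ...) only separates the first URL from the next line because the sample string has a space before the line break. A variant that splits on any whitespace and keeps the separators could look like the sketch below. It assumes every URL has a path after the host (a bare http://example.com is not handled).

```php
<?php
$haystack = "This is c%ontent with 1 url her%e http://example.com/this%is%image.jpg\n"
          . "and 1 url here http://anotherexample.com/t%his%is%image2.jpg";

// Split on runs of whitespace, keeping the separators in the result
// so the original spacing and line breaks can be restored.
$parts = preg_split('/(\s+)/', $haystack, -1, PREG_SPLIT_DELIM_CAPTURE);

foreach ($parts as &$part) {
    // Only touch tokens that start with a URL scheme.
    if (strpos($part, 'http://') === 0 || strpos($part, 'https://') === 0) {
        $part = dirname($part) . '/' . str_replace('%', '_', basename($part));
    }
}
unset($part); // break the reference left over from the loop

$result = implode('', $parts);
echo $result;
```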

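This isn't as elegant as @ins0's answer, but I came up with another solution. I don't usually code in PHP, so this may not be optimal. Please comment if it can be improved.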

$str3 = "This is c%ontent with 1 url her%e http://example.com/this%is%image.jpg and 1 url here http://anotherexample.com/t%his%is%image2.jpg ";
$regex = "(http\\S+(\\s|$))";
$unmatched = preg_split($regex, $str3);
preg_match_all($regex, $str3, $matches);
$substituted = (str_replace("%", "_", $matches[0]));
$result = "";
foreach($substituted as $key=>$value) {
    $result .= $unmatched[$key];
    $result .= $substituted[$key];
}  
print $result; # for testing
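The loop above runs once per matched URL, so any text after the last URL is dropped. This goes unnoticed here because the sample string ends right after a URL. A small variation, sketched with explicit # delimiters and a made-up input string, iterates over the unmatched pieces instead:

```php
<?php
// Made-up input with text after the last URL.
$str = "see http://example.com/this%is%image.jpg and http://anotherexample.com/t%his.jpg for details";
$regex = '#http\S+(?:\s|$)#';

$unmatched = preg_split($regex, $str);
preg_match_all($regex, $str, $matches);
$substituted = str_replace('%', '_', $matches[0]);

$result = '';
foreach ($unmatched as $key => $piece) {
    // There is always one more unmatched piece than matched URLs,
    // so the last piece has no URL to append.
    $result .= $piece . ($substituted[$key] ?? '');
}
echo $result;
```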