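I want to match only the file extension in each of the URLs below, looking no further than the question mark. So for URL #4 it should match the "pdf" in "file.pdf" but not the "exe" in "otherfile.exe".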
http://www.someplace.com/directory/file.pdf
http://www.someplace.com/directory/file.pdf?otherstuff=true
http://www.someplace.com/directory/file.pdf?other=true&more=false
http://www.someplace.com/directory/file.pdf?other=true&more=false&value=otherfile.exe
How can I do this?
I tried the following, but it doesn't work:
([^\.]+)(\?|[^\?]$)+
This is the version I would use:
/\w+\.[A-Za-z]{3,4}(?=\?|$)/
Here is a working version:
http://regex101.com/r/sY2fR0/1
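Using a lookahead for ? or the end of the string, (?=\?|$), the pattern matches only text that is immediately followed by one of them:
$re = '/\w+\.[A-Za-z]{3,4}(?=\?|$)/';
$str = "http://www.someplace.com/directory/file.pdf?other=true&more=false&value=otherfile.exe\n\n";
preg_match($re, $str, $matches); // $matches[0] is "file.pdf"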
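The pattern above returns the whole file name ("file.pdf"). If only the extension is wanted, the same lookahead idea can be combined with a capture group. The following is a minimal sketch, not the answerer's code: the function name extensionBeforeQuery is hypothetical, and it assumes extensions consist of ASCII letters and digits only. Note that a URL with no path at all (for example a bare host) would yield the last label of the host name.

```php
<?php
// Hypothetical helper (not from the thread): returns the extension that
// sits right before the query string, or null when nothing matches.
function extensionBeforeQuery(string $url): ?string
{
    // ^[^?#]*           stay in front of the first "?" or "#"
    // \.([A-Za-z0-9]+)  a literal dot, then the extension (captured)
    // (?=[?#]|$)        lookahead: the extension must be followed by
    //                   "?", "#", or the end of the string
    if (preg_match('/^[^?#]*\.([A-Za-z0-9]+)(?=[?#]|$)/', $url, $m) === 1) {
        return $m[1];
    }
    return null;
}

var_dump(extensionBeforeQuery('http://www.someplace.com/directory/file.pdf?other=true&more=false&value=otherfile.exe'));
var_dump(extensionBeforeQuery('http://www.someplace.com/directory/file?value=otherfile.exe'));
```

The first call should give "pdf" and the second null, because the lookahead never lets the match reach past the question mark.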
For matching, try this case-insensitive function:
function matchURLs($desiredURL, $compareURL){
$url = parse_url($compareURL);
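// Quote the URL parts so "." and "/" are taken literally; "~" is the delimiter.
$pattern = '~^'.preg_quote($url['scheme'].'://'.$url['host'].$url['path'], '~').'$~i';
if(preg_match($pattern, $desiredURL)){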
return true;
}
return false;
}
matchURLs('http://www.someplace.com/directory/file.pdf', 'http://www.someplace.com/directory/file.exe'); // false
matchURLs('http://www.someplace.com/directory/file.pdf', 'http://www.someplace.com/directory/file.pdf?value=file.exe'); // true
To get only the part before the ?:
function URL_before_query($url){
$u = parse_url($url);
return $u['scheme'].'://'.$u['host'].$u['path'];
}
echo URL_before_query('http://www.someplace.com/directory/file.pdf?other=true&more=false&value=otherfile.exe'); // http://www.someplace.com/directory/file.pdf
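Since parse_url already separates the path from the query string, the extension itself can be read from the path without any regular expression. This is a minimal sketch under that assumption; the function name extensionFromPath is hypothetical and not part of the answer above. It relies only on the built-in parse_url and pathinfo functions.

```php
<?php
// Hypothetical helper (not from the thread): extension of the URL's path,
// or an empty string when there is no path or no extension.
function extensionFromPath(string $url): string
{
    // PHP_URL_PATH yields only the path, so the query string is ignored.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return ''; // no path component, or a malformed URL
    }
    // PATHINFO_EXTENSION returns the text after the last dot of the file name.
    return pathinfo($path, PATHINFO_EXTENSION);
}

var_dump(extensionFromPath('http://www.someplace.com/directory/file.pdf?other=true&more=false&value=otherfile.exe'));
```

For the fourth URL from the question this should give "pdf"; the "otherfile.exe" value is never examined because it is part of the query.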
<?php
$str = '
http://www.someplace.com/directory/file1.pdf
http://www.someplace.com/directory/file2.pdf?otherstuff=true
http://www.someplace.com/directory/file3.pdf?other=true&more=false
http://www.someplace.com/directory/file4.pdf?other=true&more=false&value=otherfile.exe
';
$regex= '~.*/\K[^?\n]+~'; // \K drops everything up to the last "/" from the match
preg_match_all($regex, $str, $out, PREG_SET_ORDER);
print_r($out);
?>
Output:
Array (
[0] => Array (
[0] => file1.pdf
)
[1] => Array (
[0] => file2.pdf
)
[2] => Array (
[0] => file3.pdf
)
[3] => Array (
[0] => file4.pdf
)
)