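I use the following regular expression to turn links in text into clickable links:
preg_replace('/(http)+(s)?:(\/\/)((\w|\.)+)(\/)?(\S+)?/i', '<a href="\0" target="_blank" class="lgray">\0</a>', $message);
I need a new one that recognizes links that start with only www as well as links that start with http. Here is the list of URL types it needs to handle: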
- www.example.com/
- http://example.com/
- http://www.example.com/
- www.example.com
- http://example.com
- http://www.example.com
- https://example.com/
- https://www.example.com/
- https://example.com
- https://www.example.com
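I tried to write it myself, but I'm not very good with regular expressions. Any help would be appreciated.
Thanks!
PS: Stack Overflow doesn't recognize URLs that start with only www either.
In your regular expression, you have made the colon and the two slashes mandatory.
This line should correct that: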
preg_replace('/(http|https)?(:)?(\/\/)?((\w|\.)+)(\/)?(\S+)?/i', '<a href="\0" target="_blank" class="lgray">\0</a>', $domains);
For a better answer, take a look at "Regular expression pattern to match url with or without http://www".
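To illustrate the point about mandatory parts, here is a minimal sketch of a stricter check in which the scheme and a leading `www.` are alternatives and one of them must be present. The function name `matches_wanted_url` and the anchored pattern are assumptions made for illustration; they are not part of the answer above.

```php
<?php
// Hypothetical helper: returns true when $url starts with
// "http://", "https://" or "www." and contains only a host
// (word characters, dots, hyphens) plus an optional trailing slash.
function matches_wanted_url(string $url): bool
{
    $pattern = '/^(?:https?:\/\/|www\.)[\w.-]+\/?$/i';
    return preg_match($pattern, $url) === 1;
}

// A few of the URL shapes from the question, plus a plain word.
$samples = ['www.example.com/', 'http://example.com', 'https://www.example.com', 'example'];
foreach ($samples as $url) {
    echo $url, ' => ', matches_wanted_url($url) ? 'match' : 'no match', PHP_EOL;
}
```

Because the prefix group is required as a whole, a plain word such as `example` is rejected, whereas a pattern in which every part is optional would accept it.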
Using Claus Witt's link and modifying it a little did the job. The preg_replace he gave did not work, though. Here is what I did:
$regex = "(((https?|ftp)\:\/\/)|(www))"; // Scheme
$regex .= "([a-z0-9-.]*)\.([a-z]{2,4})"; // Host or IP
$regex .= "(\:[0-9]{2,5})?"; // Port
$regex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?"; // Path
$regex .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?"; // GET Query
$regex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?"; // Anchor
return str_replace
(
array('href="','http://http://','http://https://','http:///'),
array('href="http://','http://','https://','/'),
preg_replace('/'.$regex.'/i', '<a href="\0" target="_blank" class="lgray">\0</a>', $message)
);
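In my modification I made either http or www required, removed some unnecessary checks, and extended the domain extension from 3 to 4 characters (.info is a domain too).
Disclaimer: these are very basic and do not check for valid TLDs or file extensions. Use at your own risk.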
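As an alternative to cleaning up the `href` values afterwards, the prefix can be added while replacing. The following is a minimal sketch using `preg_replace_callback`; the function name `linkify` and the simplified pattern are assumptions for illustration, and the pattern is deliberately looser than a full URL grammar.

```php
<?php
// Hypothetical helper: wraps URLs that start with http://, https://
// or www. in an anchor tag, adding "http://" to the href when the
// matched text has no scheme.
function linkify(string $message): string
{
    // Prefix, host ending in a 2-4 letter extension, optional port, optional path.
    $pattern = '/\b(?:https?:\/\/|www\.)[a-z0-9.-]+\.[a-z]{2,4}(?::[0-9]{2,5})?(?:\/[^\s<"]*)?/i';

    return preg_replace_callback(
        $pattern,
        function (array $m): string {
            $text = $m[0];
            // Only www-style matches lack a scheme, so prepend one for the href.
            $href = preg_match('/^https?:\/\//i', $text) ? $text : 'http://' . $text;
            return '<a href="' . htmlspecialchars($href, ENT_QUOTES) . '" target="_blank" class="lgray">'
                . htmlspecialchars($text, ENT_QUOTES) . '</a>';
        },
        $message
    );
}

echo linkify('see www.example.com now'), PHP_EOL;
```

The callback decides the `href` per match, so no `str_replace` pass is needed. Note that this sketch would also swallow punctuation that directly follows a path, such as a trailing period.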
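Assuming you don't need to handle directories or files and only want to match basic URLs like the ones listed (this pattern accepts .com only), you can use the following regular expression: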
(?<=^|[\n\s])(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9-.]+\.com\/?(?=$|[\n\s])
#DESCRIPTION::
# (?<=^|[\n\s]) Checks to see that what's preceding the URL is the beginning of the string, or a newline, or whitespace.
# (?:https?:\/\/)? Matches http(s):// if it is there
# (?:www\.)? Matches www. if it is there
# [a-zA-Z0-9-.]+ Matches "example" in "example.com" (as well as any other valid URL character; will also match subdomains)
# \.com\/? Matches .com(/)
# (?=$|[\n\s]) Checks to see that what's following the URL is the end of the string, or a newline, or whitespace.
If you also need to match directories and files, you need to modify the end of the regular expression slightly:
(?<=^|[\n\s])(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9-.]+\.com(?:(?:\/[\w]+)+)?(?:\/|\.[\w]+)?(?=$|[\n\s])
#DESCRIPTION::
# (?<=^|[\n\s]) Checks to see that what's preceding the URL is the beginning of the string, or a newline, or whitespace.
# (?:https?:\/\/)? Matches http(s):// if it is there
# (?:www\.)? Matches www. if it is there
# [a-zA-Z0-9-.]+ Matches "example" in "example.com" (as well as any other valid URL character; will also match subdomains)
# \.com Matches .com
# (?: Start of a group
# (?:\/[\w]+)+ Attempts to find subdirectories by matching /, then word characters
# )? Ends the previous group. This group can be skipped, if there are no subdirectories
# (?:\/|\.[\w]+)? Matches a file extension if it is there, or a / if it is there.
# (?=$|[\n\s]) Checks to see that what's following the URL is the end of the string, or a newline, or whitespace.
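A short usage sketch may help here: it applies the second pattern with `preg_match_all` to pull every matching URL out of a block of text. The function name `find_com_urls` is an assumption for illustration, and the hyphen in the host character class has been moved to the end (`[a-zA-Z0-9.-]`), which matches the same characters while avoiding any ambiguity about ranges.

```php
<?php
// Hypothetical helper: returns all whitespace-delimited .com URLs in $text,
// with optional scheme, optional "www.", optional directories and file extension.
function find_com_urls(string $text): array
{
    $pattern = '/(?<=^|[\n\s])(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9.-]+\.com'
        . '(?:(?:\/[\w]+)+)?(?:\/|\.[\w]+)?(?=$|[\n\s])/';
    preg_match_all($pattern, $text, $matches);
    return $matches[0]; // full matches only
}

print_r(find_com_urls('Visit example.com or http://www.example.com/docs/page.html today'));
```

Because of the lookarounds, a URL is only found when it is surrounded by whitespace or the string boundaries, so something like `example.com,` followed by a comma would be skipped.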
Try this:
$pattern = preg_replace("/((https:\/\/|http:\/\/|http:\/\/www\.|https:\/\/www\.|www\.)+([\w\/])+(\.com\/|\.com))/i", "<a target=\"_blank\" href=\"$1\">$1</a>", $url);