Replacing links in text with PHP

I use the following regular expression to turn links in text into clickable links:

preg_replace('/(http)+(s)?:(\/\/)((\w|\.)+)(\/)?(\S+)?/i', '<a href="\0" target="_blank" class="lgray">\0</a>', $message);

I need a new one that recognizes links starting with just www as well as links starting with http. Here is the list of URL forms it needs to handle:

  • www.example.com/
  • http://example.com/
  • http://www.example.com/
  • www.example.com
  • http://example.com
  • http://www.example.com
  • https://example.com/
  • https://www.example.com/
  • https://example.com
  • https://www.example.com

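I tried to write it myself, but I'm not very good with regular expressions. Any help would be appreciated.

Thanks!

PS: Stack Overflow doesn't recognize URLs that start with just www either.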

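In your regular expression, the colon and the two slashes are mandatory, so a link without a scheme can never match.

This line should fix it:

preg_replace('/(http|https)?(:)?(\/\/)?((\w|\.)+)(\/)?(\S+)?/i', '<a href="\0" target="_blank" class="lgray">\0</a>', $domains);

For a more thorough answer, have a look at the question "Regex pattern to match url with or without http://www".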
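
To see why the pattern from the question misses www-only links, here is a minimal check. The sample URLs are taken from the question's list; the `$original`, `$samples` and `$results` names are only for illustration.

```php
<?php
// The pattern from the question: "http", the colon and "//" are all required.
$original = '/(http)+(s)?:(\/\/)((\w|\.)+)(\/)?(\S+)?/i';

// Sample inputs taken from the list in the question.
$samples = ['http://example.com', 'https://www.example.com/', 'www.example.com'];

$results = [];
foreach ($samples as $url) {
    // preg_match() returns 1 on a match and 0 when nothing matches.
    $results[$url] = preg_match($original, $url);
}

print_r($results);
```

The two URLs with a scheme match, while `www.example.com` does not, because nothing in it can satisfy the required `http...://` part.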

Taking Claus Witt's link and modifying it a little did the job. The preg_replace he gave doesn't work as posted, though. This is what I ended up with:

$regex = "(((https?|ftp)\:\/\/)|(www))";//Scheme
$regex .= "([a-z0-9-.]*)\.([a-z]{2,4})";//Host or IP
$regex .= "(\:[0-9]{2,5})?";//Port
$regex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?";//Path
$regex .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?";//GET Query
$regex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?";//Anchor
return str_replace
(
    array('href="','http://http://','http://https://','http:///'),
    array('href="http://','http://','https://','/'),
    preg_replace('/'.$regex.'/i','<a href="\0" target="_blank" class="lgray">\0</a>',$message)
);

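In my modification, I made the http or www prefix mandatory, removed some unnecessary checks, and extended the domain extension from 3 to 4 characters (.info is a domain too). The str_replace call first prepends http:// to every href and then undoes the doubled scheme on links that already had one.

Disclaimer: these patterns are very basic and make no attempt to check for valid TLDs or file extensions. Use at your own risk.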
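
As an alternative to the str_replace cleanup, the scheme can be added inside a callback. This is only a sketch under assumptions: `linkify()` is a hypothetical helper name, bare www links are assumed to default to http://, and the pattern is the same basic one as above.

```php
<?php
// Hypothetical helper (not part of the original answer): same pattern,
// but the missing scheme is added in a callback instead of via str_replace.
function linkify(string $message): string
{
    $regex  = "(((https?|ftp)\:\/\/)|(www))";              // Scheme or www
    $regex .= "([a-z0-9-.]*)\.([a-z]{2,4})";               // Host
    $regex .= "(\:[0-9]{2,5})?";                           // Port
    $regex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?";               // Path
    $regex .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?";  // Query string
    $regex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?";               // Anchor

    return preg_replace_callback(
        '/' . $regex . '/i',
        function (array $m): string {
            $text = $m[0];
            // Assumption: links that start with www should point to http://.
            $href = stripos($text, 'www') === 0 ? 'http://' . $text : $text;
            return '<a href="' . htmlspecialchars($href) . '" target="_blank" class="lgray">'
                . htmlspecialchars($text) . '</a>';
        },
        $message
    );
}

echo linkify('See www.example.com and https://example.com/docs for details.');
```

The callback keeps the visible link text exactly as the user typed it and only touches the href, so no second pass over the output is needed.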

Assuming you don't need to account for directories or files, you can match just the basic URLs with the following regular expression:

(?<=^|[\n\s])(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9-.]+\.com\/?(?=$|[\n\s])
#DESCRIPTION::
#  (?<=^|[\n\s])           Checks to see that what's preceding the URL is the beginning of the string, or a newline, or whitespace.
#  (?:https?:\/\/)?        Matches http(s):// if it is there
#  (?:www\.)?              Matches www. if it is there
#  [a-zA-Z0-9-.]+          Matches "example" in "example.com" (as well as any other valid URL character; will also match subdomains)
#  \.com\/?                Matches .com(/)
#  (?=$|[\n\s])            Checks to see that what's following the URL is the end of the string, or a newline, or whitespace.

If you also need to match directories and files, modify the end of the regex slightly:

(?<=^|[\n\s])(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9-.]+\.com(?:(?:\/[\w]+)+)?(?:\/|\.[\w]+)?(?=$|[\n\s])
#DESCRIPTION::
#  (?<=^|[\n\s])           Checks to see that what's preceding the URL is the beginning of the string, or a newline, or whitespace.
#  (?:https?:\/\/)?        Matches http(s):// if it is there
#  (?:www\.)?              Matches www. if it is there
#  [a-zA-Z0-9-.]+          Matches "example" in "example.com" (as well as any other valid URL character; will also match subdomains)
#  \.com                   Matches .com
#  (?:                     Start of a group
#     (?:\/[\w]+)+         Attempts to find subdirectories by matching /, then word characters
#  )?                      Ends the previous group. This group can be skipped, if there are no subdirectories
#  (?:\/|\.[\w]+)?         Matches a file extension if it is there, or a / if it is there.
#  (?=$|[\n\s])            Checks to see that what's following the URL is the end of the string, or a newline, or whitespace.
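
A short sketch of how the path-aware pattern could be used from PHP. The sample sentence and the `$pattern`, `$text` and `$matches` names are made up for illustration; the pattern itself is the one described above, wrapped in `/` delimiters.

```php
<?php
// The path-aware pattern from above. Single quotes keep the backslashes
// intact so that PCRE, not PHP, interprets \n, \s and \w.
$pattern = '/(?<=^|[\n\s])(?:https?:\/\/)?(?:www\.)?[a-zA-Z0-9-.]+\.com'
         . '(?:(?:\/[\w]+)+)?(?:\/|\.[\w]+)?(?=$|[\n\s])/';

// Hypothetical sample text.
$text = 'Visit example.com or http://www.example.com/docs/page.html today';

// Collect every URL that is delimited by whitespace or the string boundaries.
preg_match_all($pattern, $text, $matches);

print_r($matches[0]);
```

Because of the lookahead, a URL followed directly by punctuation (for example a trailing comma) is not matched, which is worth keeping in mind for free-form text.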

Try this:

$pattern = preg_replace("/((https:\/\/|http:\/\/|http:\/\/www\.|https:\/\/www\.|www\.)+([\w\/])+(\.com\/|\.com))/i","<a target=\"_blank\" href=\"$1\">$1</a>",$url);