PHP: Find all links or link-text strings and append query string values

I am trying to find all links (or bare link text) in a given string (an email body) and append custom query string values to every URL (for Google link tracking).

I am using this as an example:

$html = <<< S
<html><body><p></p><div align="center"><img
src="https://domain.com/assets/uploads/291c7977c3b2dc87cdfd77533aa95d25.png"></div><br><br>Hello&nbsp;<strong></strong>,&nbsp;<br><br>Type
your message
here...<br><br>https://domain.com/qa/<br><br><br>Thanks</body></html>
S;
$dom = new DOMDocument;
$dom->loadHTML($html);
$anchors = $dom->getElementsByTagName('body')->item(0)->getElementsByTagName('a');
foreach($anchors as $anchor) {
    $href = $anchor->getAttribute('href');
    $url = parse_url($href);
    $attach = 'stackoverflow=true'; // attach this to all urls
    if (isset($url['query'])) {
        $href .= '&' . $attach;
    } else {
        $href .= '?' . $attach;
    }
    $anchor->setAttribute('href', $href);
}
echo $dom->saveHTML();

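However, the links are not replaced. I expect stackoverflow=true to be appended to every link in the given string, but that does not happen.

Any help would be greatly appreciated. Thanks.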

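OK,

I figured out the solution. The URL in the body is plain text, not an <a> element, so I first needed to linkify all plain-text URLs and then use the DOM to do the appending. Here is the modified code: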

$html = <<< S
<html><body><p></p><div align="center"><img
src="https://domain.com/assets/uploads/291c7977c3b2dc87cdfd77533aa95d25.png"></div><br><br>Hello&nbsp;<strong></strong>,&nbsp;<br><br>Type
your message
here...<br><br>https://domain.com/qa/<br><br><br>Thanks</body></html>
S;
// first linkify any non-links
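// The lookbehinds skip URLs that already sit inside href="..." or src="..."
$s = preg_replace(
   "/(?<!a href=\")(?<!src=\")((https?|ftp)+(s)?:\/\/[^<>\s]+)/i",
   "<a href=\"\\0\">\\0</a>", // \0 is the whole matched URL
   $html
);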
// now find links and append custom query string values
$dom = new DOMDocument;
$dom->loadHTML($s);
$anchors = $dom->getElementsByTagName('body')->item(0)->getElementsByTagName('a');
foreach($anchors as $anchor) {
    $href = $anchor->getAttribute('href');
    $url = parse_url($href);
    $attach = 'stackoverflow=true'; // attach this to all urls
    if (isset($url['query'])) {
        $href .= '&' . $attach;
    } else {
        $href .= '?' . $attach;
    }
    $anchor->setAttribute('href', $href);
}
echo $dom->saveHTML();

So the only thing I added is this part at the top:

// first linkify any non-links
$s = preg_replace(
   "/(?<!a href=\")(?<!src=\")((https?|ftp)+(s)?:\/\/[^<>\s]+)/i",
   "<a href=\"\\0\">\\0</a>",
   $html
);
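
One caveat with the loop above: it appends the value to the very end of the href, so a URL that carries a #fragment would end up with the query string after the fragment. Below is a minimal sketch of one way to handle that. The function name append_query is hypothetical (it is not part of the code above or of PHP itself), and it assumes PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper, not part of the original code: appends a query
// string such as "stackoverflow=true" to a URL, placing it before any
// "#fragment" so the fragment stays at the end of the URL.
function append_query(string $href, string $attach): string
{
    // Split off the fragment, if there is one.
    $fragment = '';
    $hashPos = strpos($href, '#');
    if ($hashPos !== false) {
        $fragment = substr($href, $hashPos);
        $href = substr($href, 0, $hashPos);
    }

    // Use "&" when a query string already exists, "?" otherwise.
    $separator = (strpos($href, '?') !== false) ? '&' : '?';

    return $href . $separator . $attach . $fragment;
}

echo append_query('https://domain.com/qa/', 'stackoverflow=true'), "\n";
echo append_query('https://domain.com/qa/?page=2#top', 'stackoverflow=true'), "\n";
```

Inside the foreach loop, the if/else could then be replaced with a single call such as $anchor->setAttribute('href', append_query($href, $attach));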