Problems reading links from email with PHP regex

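I've run into a very strange problem that I can't seem to solve.

I have a script that reads an email, extracts a username and a link (or several links) from it, and puts them into an array. For some reason the links keep getting cut off because an "=" keeps being added. When I run a string replace on the email before running the regex, it doesn't replace the "=". Any idea what the problem might be?

Here is the sample email: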

 @bill
 http://techcrunch.com/2012/07/20/kickstarter-flashr-wants-to-make-the-iphones-bezel-a-massive-notification-light/?grcc=88888Z0ZwdgtZ0Z0Z0Z0Z0&grcc2=835637c33f965e6cdd34c87219233711~1342828462249~fca4fa8af1286d8a77f26033fdeed202~510f37324b14c50a5e9121f955fac3fa~1342747216490~0~0~0~0~0~0~0~0~7~3~

When I echo the body of the message, I get:

 --00248c6a671acfdb9c04c558d753 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable @bill http://techcrunch.com/2012/07/20/kickstarter-flashr-wants-to-make-the-iphon= es-bezel-a-massive-notification-light/?grcc=3D88888Z0ZwdgtZ0Z0Z0Z0Z0&grcc2= =3D835637c33f965e6cdd34c87219233711~1342828462249~fca4fa8af1286d8a77f26033f= deed202~510f37324b14c50a5e9121f955fac3fa~1342747216490~0~0~0~0~0~0~0~0~7~3~ --00248c6a671acfdb9c04c558d753 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable @bill

Note the "=" that breaks up the link. My regex produces:

 Array ( [0] => http://techcrunch.com/2012/07/20/kickstarter-flashr-wants-to-make-the-iphon= [1] => http://techcrunch.com/2012/07/2= [2] => http://techcrunch.com/2012= ) 

When I copy and paste the string and run it through a string replace, it does replace the "=".

Any idea what's going on?

Thanks

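The text is encoded as "Quoted-Printable" (see the Content-Transfer-Encoding: quoted-printable header in your output). In that encoding, an "=" at the end of a line is a soft line break and "=3D" is an encoded "=", which is why the links appear cut off. Decode it to normal text first: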

http://php.net/manual/en/function.quoted-printable-decode.php
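
As a rough sketch of how that might look, assuming the text/plain part has already been isolated from the MIME message: the `$raw` string below is a shortened, hypothetical stand-in for the body in the question, and the URL pattern is only an illustrative assumption, not the regex from the original script.

```php
<?php
// Hypothetical stand-in for the text/plain part of the message.
// "=" followed by a line break is a quoted-printable soft line break,
// and "=3D" is an encoded "=".
$raw = "@bill\r\n"
     . "http://techcrunch.com/2012/07/20/kickstarter-flashr-wants-to-make-the-iphon=\r\n"
     . "es-bezel-a-massive-notification-light/?grcc=3D88888Z0ZwdgtZ0Z0Z0Z0Z0&grcc2=\r\n"
     . "=3D835637c33f965e6cdd34c87219233711~1342828462249\r\n";

// Decode first: soft line breaks are removed and "=3D" becomes "=".
$decoded = quoted_printable_decode($raw);

// Then extract links. This simple pattern (scheme plus any run of
// non-whitespace characters) is an assumption made for illustration.
preg_match_all('~https?://\S+~', $decoded, $matches);
$links = $matches[0];

print_r($links);
```

With the decoding step in place, the link comes back as a single unbroken string. A plain string replace on "=" cannot do this on its own, because the line break after each soft-break "=" would still split the URL, and it would also strip the real "=" characters from the query string.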