Regular expression not working

I am trying to extract the From and Cc email addresses from a forwarded email whose body looks like this:

$body = '-------
Begin forwarded message:

From: Sarah Johnson <blabla@gmail.com>
Subject: email subject
Date: February 22, 2013 3:48:12 AM
To: Email Recipient <thatwouldbe@yayyy.com>
Cc: Ralph Johnson <johnson@gmail.com>

Hi,

hello, thank you and goodbye!
 blabla@gmail.com';

Now, when I do the following:

$body = strtolower($body);
$pattern = '#from: \D*\S([\w-\.]+)@((?:[\w]+\.)+)([a-zA-Z]{2,4})\S#';
if (preg_match($pattern, $body, $arr_matches)) {
     echo htmlentities($arr_matches[0]);
     die();
}

I correctly get:

from: sarah johnson <blabla@gmail.com>

Why doesn't the same thing work for cc? I did something very similar, only changing from to cc:

$body = strtolower($body);
$pattern = '#cc: \D*\S([\w-\.]+)@((?:[\w]+\.)+)([a-zA-Z]{2,4})\S#';
if (preg_match($pattern, $body, $arr_matches)) {
     echo htmlentities($arr_matches[0]);
     die();
}

I get:

cc: ralph johnson <johnson@gmail.com> hi, hello, thank you and goodbye! blabla@gmail.com

If I remove the email address from the footer of the original body (that is, remove blabla@gmail.com), I correctly get:

cc: ralph johnson <johnson@gmail.com>

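It looks like that trailing email address is affecting the regex. But how does it affect it, and why doesn't it affect the from case? How can I fix this?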

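The problem is that \D* matches too much: it also matches newline characters, so it can run past the end of the header line. I would be stricter here. Why are you using \D (any non-digit) at all?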

With [^@]*, for example, it works:

cc: [^@]*\S([\w-\.]+)@((?:[\w]+\.)+)([a-zA-Z]{2,4})\S

See it on Regexr.

This way you make sure that the first part cannot match beyond the email address.

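The \D is also the reason it works in the first "From" case. The "Date" line contains digits, so \D* cannot extend past that line, and the only address it can reach is the one on the From line. After the Cc line there are no digits, so \D* runs to the end of the body and the regex backtracks only as far as the last address, the one in the footer.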
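To make the difference concrete, here is a minimal sketch that runs the greedy and the stricter prefix against a shortened, lowercased copy of the body. The variable names ($tail, $greedy, $strict, $from) are only illustrative, and the character class is written as [\w.-] (hyphen last) instead of the original [\w-\.], on the assumption that this is the safer spelling across PCRE versions.

```php
<?php
// Shortened, lowercased copy of the body from the question (an assumption
// made for brevity; the To line is omitted).
$body = "from: sarah johnson <blabla@gmail.com>\n"
      . "subject: email subject\n"
      . "date: february 22, 2013 3:48:12 am\n"
      . "cc: ralph johnson <johnson@gmail.com>\n"
      . "\n"
      . "hi,\n"
      . " blabla@gmail.com";

// Address part shared by all three patterns. The hyphen is placed last in
// the class so that it is unambiguously a literal hyphen.
$tail = '\S([\w.-]+)@((?:\w+\.)+)([a-zA-Z]{2,4})\S';

// \D* is greedy and also matches newlines, so after "cc: " it runs to the
// end of the body and backtracks only as far as the last address.
preg_match('#cc: \D*' . $tail . '#', $body, $greedy);

// [^@]* stops at the first "@", so the match stays on the Cc line.
preg_match('#cc: [^@]*' . $tail . '#', $body, $strict);

// For "from: " the digits on the Date line stop \D* early.
preg_match('#from: \D*' . $tail . '#', $body, $from);

echo $greedy[0], "\n---\n", $strict[0], "\n---\n", $from[0], "\n";
```

Note that with either prefix the first capture group holds only the tail end of the local part, because the greedy prefix consumes as much as it can. That is harmless while only $arr_matches[0] is used, but it matters if the groups are read.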

Try it like this:

$body = '-------
Begin forwarded message:

From: Sarah Johnson <blabla@gmail.com>
Subject: email subject
Date: February 22, 2013 3:48:12 AM
To: Email Recipient <thatwouldbe@yayyy.com>
Cc: Ralph Johnson <johnson@gmail.com>

Hi,

hello, thank you and goodbye!
 blabla@gmail.com';
$pattern = '#(?:from|Cc):\s+[^<>]+<([^@]+@[^>\s]+)>#is';
preg_match_all($pattern, $body, $arr_matches);
echo '<pre>' . htmlspecialchars(print_r($arr_matches, 1)) . '</pre>';

Output:

Array
(
    [0] => Array
        (
            [0] => From: Sarah Johnson <blabla@gmail.com>
            [1] => Cc: Ralph Johnson <johnson@gmail.com>
        )
    [1] => Array
        (
            [0] => blabla@gmail.com
            [1] => johnson@gmail.com
        )
)
$arr_matches[1][0] - "From" email
$arr_matches[1][1] - "Cc" email
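The numeric indexes above depend on the order in which the headers appear in the body. As a possible variation (a sketch, not part of the answer above), the header name can be captured as well, so that each address is looked up by name. The $addresses array and the line-anchored pattern are assumptions made for illustration.

```php
<?php
// Shortened copy of the body from the question.
$body = "Begin forwarded message:\n\n"
      . "From: Sarah Johnson <blabla@gmail.com>\n"
      . "Subject: email subject\n"
      . "To: Email Recipient <thatwouldbe@yayyy.com>\n"
      . "Cc: Ralph Johnson <johnson@gmail.com>\n\n"
      . "Hi,\n blabla@gmail.com";

// With the m flag, ^ matches at the start of every line, so only real
// header lines are considered. Excluding \r and \n from the display-name
// part keeps each match on a single line.
$pattern = '#^(from|cc):[ \t]+[^<>\r\n]*<([^@\s<>]+@[^>\s]+)>#im';
preg_match_all($pattern, $body, $matches, PREG_SET_ORDER);

// Map the lowercased header name to the address between the angle brackets.
$addresses = [];
foreach ($matches as $m) {
    $addresses[strtolower($m[1])] = $m[2];
}

print_r($addresses);
```

With this shape, $addresses['from'] and $addresses['cc'] do not depend on header order, and a missing header simply leaves its key unset.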