JavaScript and PHP regular expressions to capture the email addresses found in a specific recipient field

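I'm trying to develop two regular expressions, one in JavaScript and one in PHP, that capture the email addresses in a raw email belonging only to a given field (e.g. To:) and nothing else (i.e. no email addresses from anywhere else in the corpus), but I haven't succeeded.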

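Here are the ideal requirements:

  • The match must start on a new line, at the very beginning of that line.

  • The line must start with, for example, "To" (without the double quotes, case-insensitive; a single colon after it is optional, followed by zero or more whitespace characters).

  • After that, every email address must be captured individually, up to and including the last one, stopping before the first word that is not an email address (examples, not an exhaustive list: Subject:, From:, Cc:, Hello, etc.).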

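I've succeeded with requirements #1 and #2 but am struggling with #3. I've been forced to solve only #1 and #2 and then split/explode the result on commas, which works, but I know it can be done better.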
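One possible way to approach requirement #3 in JavaScript is to locate the header first and then walk forward with a sticky (`y`) regex, so that each address must begin exactly where the previous one ended and the scan stops at the first token that is not an address. This is only a minimal sketch: the function name `extractRecipients` is hypothetical, the address pattern is deliberately simplified, and `field` is assumed to contain no regex metacharacters.

```javascript
// Hypothetical helper: collect the addresses that directly follow a header such as "To".
function extractRecipients(raw, field = "To") {
  // Requirements #1 and #2: field name at the start of a line,
  // optional colon, optional whitespace (case-insensitive).
  const header = new RegExp("^" + field + ":?\\s*", "im");
  const start = header.exec(raw);
  if (start === null) return [];

  // Requirement #3: one address, then an optional comma and any whitespace
  // (which also covers folded continuation lines).
  // Simplified address pattern, assumed for illustration only.
  const address = /([^\s<>()[\]\\,;:@"]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+),?\s*/y;
  address.lastIndex = start.index + start[0].length;

  const found = [];
  let m;
  // A sticky regex returns null as soon as the next token is not an address,
  // so the loop ends at "Subject:", "From:", "Hello", and so on.
  while ((m = address.exec(raw)) !== null) {
    found.push(m[1]);
  }
  return found;
}

const sample =
  "From: a@x.com\nTo: b@y.org, c@z.net, \n    d@w.co.uk\nSubject: hi e@v.com\nX-To: f@u.com\n";
console.log(extractRecipients(sample)); // [ 'b@y.org', 'c@z.net', 'd@w.co.uk' ]
```

In PHP, a similar effect should be achievable with the `\G` anchor in `preg_match_all`, which JavaScript does not support.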

Here is a sample email from the public Enron email dataset:

Message-ID: <3470405.1075840065684.JavaMail.evans@thyme>
Date: Sun, 14 Feb 1999 01:33:00 -0800 (PST)
From: markskilling@hotmail.com
To: majalinda@hotmail.com, ksbiehl@hotmail.com, dlmackler@worldnet.att.net, 
    cjones@cityofnapa.org, hazerfen@hotmail.com, meyerjames@usa.net, 
    tomskilljr@aol.com, c.combs@intershop.com, mshachat@aol.com, 
    clowes@email.msn.com, clowes@cmithlaw.com, transwd@aol.com, 
    smackarnes@aol.com, samjstokes@aol.com, joguti@aol.com, 
    bjmackaysmith@hotmail.com, m_larnold@sprynet.com, dwood@rwblaw.com, 
    daveroche@aol.com, milobenn@sirius.com, pwc1@aol.com, 
    candc@ix.netcom.com, eisenbachrl@cooley.com, mwf15@columbia.edu, 
    khuber@hcmwealth.com, doyna@coffeenet.com, katekross@aol.com, 
    mark.langermann@issna.com, martin@sbu.edu, deniz.razon@abbott.com, 
    sras@lycosmail.com, jeff.skilling@enron.com, tskilling@tribune.com, 
    audryn@mindspring.com, mmmmisha@ix.netcom.com, ermak@gte.net
Subject: 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-From: "Mark Skilling" <markskilling@hotmail.com>
X-To: majalinda@hotmail.com, ksbiehl@hotmail.com, dlmackler@worldnet.att.net, cjones@cityofnapa.org, hazerfen@hotmail.com, meyerjames@usa.net, tomskilljr@aol.com, c.combs@intershop.com, mshachat@aol.com, clowes@email.msn.com, clowes@cmithlaw.com, transwd@aol.com, smackarnes@aol.com, samjstokes@aol.com, joguti@aol.com, bjmackaysmith@hotmail.com, m_larnold@sprynet.com, dwood@rwblaw.com, daveroche@aol.com, milobenn@sirius.com, pwc1@aol.com, candc@ix.netcom.com, eisenbachrl@cooley.com, mwf15@columbia.edu, khuber@hcmwealth.com, doyna@coffeenet.com, katekross@aol.com, mark.langermann@issna.com, martin@sbu.edu, deniz.razon@abbott.com, sras@lycosmail.com, Jeff Skilling, tskilling@tribune.com, audryn@mindspring.com, mmmmisha@ix.netcom.com, ermak@GTE.net
X-cc: 
X-bcc: 
X-Folder: \Jeffrey_Skilling_Dec2000\Notes Folders\All documents
X-Origin: SKILLING-J
X-FileName: jskillin.nsf
February 10, 1999
I am wakened by the approaching chatter of the early morning call to
prayer (sounding a bit like the fuss made by one of those cartoon balls
of fighting dogs and cats).  From the minarets of far away mosques, the
muezzins' cries ricochet through Istanbul's still dark alleys and
streets.  Seagulls, who have drifted up the hill from the Golden Horn,
squawk contentedly outside my window.  From somewhere down below, a
miserable dog joins into the pre-dawn ruckus, soon followed by the local
muezzin, whose amplified singing drowns out all the rest.  He reminds us
that God is great and that prayer is a whole lot more important than
sleep (at least that's what I've been told; he sings in Arabic).
Because my religion thinks more highly of sleep, I feel free to simply
listen, while gently trying to pull the warm blanket of sleep back over
me.  The  muezzin has a beautiful voice.  Its rise and fall stitches
itself into the edges of a dream (in which a former best friend and I
argue about the rules of a game of miniature golf) hanging just out of
reach.
Slowly, the banal calculations that fill my days begin to crowd their
way into my head.  It's about a quarter to six, I figure, which means
there's time for a bit of writing, or even Turkish vocabulary, before I
douse myself in the shower to full consciousness.  I remind myself of
the theory that one can write most freely while still intoxicated with
sleep (or just plain intoxicated), am immediately stricken with the fear
I am incapable of such freedom, take a look round my brain for something
worth writing about (find nothing), hypothesize about the advantages of
a quick dash into the hallway to turn on the gas heater (so that when I
really get up it will be reasonably warm out there), wonder if I really
do have enough stuff prepared to fill up the two hours of my English
lesson with Suleyman, conclude that all this thinking has probably made
any more sleep impossible, then (I realize later) fall back to sleep.
                               *     *     *
My new phone [(212) 292-6486] is hooked up and I have a new internet
server, which will make it much easier to keep in touch.  Hope to attack
that backlog of responses that are due.
Keep in touch.
Mark-O
______________________________________________________
Get Your Private, Free Email at http://www.hotmail.com

Thanks for your help. Hopefully this benefits other searchers too! :)

Here is the regular expression I'm currently using. It satisfies requirements #1 and #2 and returns the recipients of the specified field:

/^(?:To:?(?:\s+)?)((?:(?:(?:(?:[^<>()[\]\\.,;:\s@\"]+(?:\.[^<>()[\]\\.,;:\s@\"]+)*)|(?:\".+\"))@(?:(?:\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(?:(?:[a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))),?\s+)+)+/mi;

Once #1 and #2 are solved, you can use this to pull every email address out of the text you've selected:

[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})

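This should grab most common email addresses and won't grab malformed ones such as example@gmail...com. Note that the final group limits the top-level domain to 2 to 4 letters, so an address with a longer TLD is only matched up to the fourth letter.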
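As a rough illustration of the two-step idea in JavaScript: first isolate the text of the header, including folded continuation lines, then run the pattern above over it with the `g` and `i` flags. The helper names `toFieldText` and `emailsIn` are hypothetical, and the header regex is a simplified stand-in for the one in the question.

```javascript
// The pattern from this answer, with g (find all) and i (case-insensitive) added.
const EMAIL = /[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})/gi;

// Hypothetical helper: return the body of the "To" header, including
// continuation lines that start with a space or tab.
function toFieldText(raw) {
  const m = /^To:?[ \t]*(.*(?:\r?\n[ \t]+.*)*)/im.exec(raw);
  return m === null ? "" : m[1];
}

// Hypothetical helper: every address the pattern finds in the given text.
function emailsIn(text) {
  return Array.from(text.matchAll(EMAIL), (m) => m[0]);
}

const raw = "To: b@y.org, c@z.net, \n    d@w.co.uk\nSubject: e@v.com\n";
console.log(emailsIn(toFieldText(raw))); // [ 'b@y.org', 'c@z.net', 'd@w.co.uk' ]
console.log(emailsIn("example@gmail...com")); // []
```

Because the header is isolated first, addresses in Subject:, X-To:, or the message body never reach the second pattern.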

Link: Using a regular expression to validate an email address
