Regex wildcard and lookaround

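I have a dated roster and want to capture the information from one date (01 Aug SA) up to the start of the next date, e.g. ( 31 Jul FR 0135 HKG MEL 33K 18:00 19:44 ) in the first instance and ( 01 Aug SA 06:40 07:10 11:10 ) in the next.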

The regex I have uses a lookahead, but I can't figure out how to make it capture each entry.

Regex: $pattern = '/.*(?=\d{2}\s+[A-Z][a-z]{2}\s+\w{2})/';

31 Jul FR 0135 HKG MEL 33K 18:00 19:44 01 Aug SA
06:40 07:10 11:10 02 Aug SU 0134 MEL HKG 33K 06:40 07:37 15:21 15:51 11:11 03 Aug MO G 04 Aug TU 0905 HKG MNL 330 20:50 22:52 05 Aug WE 00:58
0912 MNL HKG 330 08:32 10:36 11:06 14:16 06 Aug TH T2024
P2 19:00 19:00 07 Aug FR 00:30 00:30 05:30
T2314 2R 22:00 22:00 08 Aug SA 06:00 06:00 08:00 09 Aug SU G 10 Aug MO G 11 Aug TU R13 06:00 06:00 06:06 06:06 00:06 0699 HKG BOM
33G 16:20 17:53 21:07 21:37 07:47 12 Aug WE 13 Aug TH RPT 00:00 00:00 02:45 02:45 02:45 14 Aug FR 0660 BOM HKG 33G 03:05 04:15 12:48 13:18 07:43 15 Aug SA R13
06:00 06:00 11:23 11:23 05:23 T1514 CW 14:00 14:00 19:30 19:30 05:30 16 Aug SU 0494 HKG TPE 33G 10:30 12:16 14:08
0495 TPE HKG 33G 14:59 16:43 17:13 06:43 17 Aug MO O
18 Aug TU G 19 Aug WE 0697 HKG DEL 33G 19:05 21:27 20 Aug TH 00:28 00:58 08:23 21 Aug FR 0694 DEL HKG 33K 01:30 02:30 10:45 11:15 07:15 MED HC
22 Aug SA G 23 Aug SU G 24 Aug MO G 25 Aug TU 0767 HKG SGN 33E 07:30 08:40 10:15 0766 SGN HKG 33E 11:20 15:10 15:40 08:10 26 Aug WE G
27 Aug TH G 28 Aug FR 0699 HKG BOM 33G 16:20 17:30 21:20 21:50 08:00 29 Aug SA 0696 BOM HKG 33G 21:30 22:30 30 Aug SU 07:00 07:30 07:30 31 Aug MO 0564 HKG KIX 330 12:00 13:10 14:55 0564 TPE KIX 330
16:05 20:00 20:30 07:30

Try this regex:

(?=\d{2}\s+[A-Z][a-z]{2}\s+\w{2})(.+?)(?:(?=\d{2}\s+[A-Z][a-z]{2}\s+\w{2})|$)

Regex Demo

Regex breakdown

(?=\d{2}\s+[A-Z][a-z]{2}\s+\w{2}) # The same lookahead you gave in the question; it asserts that a date header such as "01 Aug SA" starts here
  (.+?) # Lazy match of every character from this date up to the next one. A greedy match would run all the way to the end of the input, which we don't want
(?:  # Non-capturing group; it only groups the alternation below
  (?=\d{2}\s+[A-Z][a-z]{2}\s+\w{2}) # The same lookahead again; it asserts that the next date starts at this position, so the capture holds whatever lies between two dates
   | # Alternation
 $ # End of input; needed for the last entry, because no date follows it
)
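To make the lazy versus greedy point concrete, here is a minimal sketch on a shortened input. The sample string and the variable names (`$sample`, `$date`, `$lazyEntries`, `$greedyEntries`) are assumptions made for illustration; only the pattern comes from the answer above.

```php
<?php
// Shortened sample input, assumed for illustration (not the full roster).
$sample = "31 Jul FR 0135 HKG MEL 01 Aug SA 06:40 07:10 02 Aug SU 0134";

// The date-header lookahead body from the answer, kept in one place.
$date = '\d{2}\s+[A-Z][a-z]{2}\s+\w{2}';

// Lazy quantifier: each capture stops just before the next date header.
preg_match_all('/(?=' . $date . ')(.+?)(?:(?=' . $date . ')|$)/s', $sample, $lazy);

// Greedy quantifier: the first capture swallows everything up to the end.
preg_match_all('/(?=' . $date . ')(.+)(?:(?=' . $date . ')|$)/s', $sample, $greedy);

// Captures keep the whitespace that precedes the next date, so trim them.
$lazyEntries   = array_map('trim', $lazy[1]);
$greedyEntries = array_map('trim', $greedy[1]);

print_r($lazyEntries);   // three entries, one per date
print_r($greedyEntries); // a single entry containing the whole string
```

With the lazy quantifier the sample splits into one entry per date; with the greedy one the whole string comes back as a single match.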

Note: I am using the s modifier because we also want . to match \n if it is there.
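The effect of the s modifier can be checked on a two-entry input whose first entry wraps onto a second line. This is only a sketch; the sample string and the names `$wrapped`, `$withS` and `$withoutS` are assumptions for illustration.

```php
<?php
// Assumed sample: the "01 Aug SA" entry continues after a line break.
$wrapped = "01 Aug SA\n06:40 07:10 02 Aug SU 0134";

$body = '(?=\d{2}\s+[A-Z][a-z]{2}\s+\w{2})(.+?)(?:(?=\d{2}\s+[A-Z][a-z]{2}\s+\w{2})|$)';

// With /s the dot also matches "\n", so the wrapped entry is captured whole.
preg_match_all('/' . $body . '/s', $wrapped, $withS);

// Without /s the dot stops at "\n", so the attempt starting at
// "01 Aug SA" fails and that entry is dropped from the results.
preg_match_all('/' . $body . '/', $wrapped, $withoutS);

print_r($withS[1]);    // two entries
print_r($withoutS[1]); // only the "02 Aug SU 0134" entry
```

Because the roster breaks lines in the middle of entries, leaving the modifier out would silently lose every entry that spans a line break.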

PHP code

$re = "/(?=\\d{2}\\s+[A-Z][a-z]{2}\\s+\\w{2})(.+?)(?:(?=\\d{2}\\s+[A-Z][a-z]{2}\\s+\\w{2})|$)/s";
$str = "31 Jul FR 0135 HKG MEL 33K 18:00 19:44 01 Aug SA\n06:40 07:10 11:10 02 Aug SU 0134 MEL HKG 33K 06:40 07:37 15:21 15:51 11:11 03 Aug MO G 04 Aug TU 0905 HKG MNL 330 20:50 22:52 05 Aug WE 00:58\n0912 MNL HKG 330 08:32 10:36 11:06 14:16 06 Aug TH T2024\nP2 19:00 19:00 07 Aug FR 00:30 00:30 05:30\nT2314 2R 22:00 22:00 08 Aug SA 06:00 06:00 08:00 09 Aug SU G 10 Aug MO G 11 Aug TU R13 06:00 06:00 06:06 06:06 00:06 0699 HKG BOM\n33G 16:20 17:53 21:07 21:37 07:47 12 Aug WE 13 Aug TH RPT 00:00 00:00 02:45 02:45 02:45 14 Aug FR 0660 BOM HKG 33G 03:05 04:15 12:48 13:18 07:43 15 Aug SA R13\n06:00 06:00 11:23 11:23 05:23 T1514 CW 14:00 14:00 19:30 19:30 05:30 16 Aug SU 0494 HKG TPE 33G 10:30 12:16 14:08\n0495 TPE HKG 33G 14:59 16:43 17:13 06:43 17 Aug MO O\n18 Aug TU G 19 Aug WE 0697 HKG DEL 33G 19:05 21:27 20 Aug TH 00:28 00:58 08:23 21 Aug FR 0694 DEL HKG 33K 01:30 02:30 10:45 11:15 07:15 MED HC\n22 Aug SA G 23 Aug SU G 24 Aug MO G 25 Aug TU 0767 HKG SGN 33E 07:30 08:40 10:15 0766 SGN HKG 33E 11:20 15:10 15:40 08:10 26 Aug WE G\n27 Aug TH G 28 Aug FR 0699 HKG BOM 33G 16:20 17:30 21:20 21:50 08:00 29 Aug SA 0696 BOM HKG 33G 21:30 22:30 30 Aug SU 07:00 07:30 07:30 31 Aug MO 0564 HKG TPE 330 12:00 13:10 14:55 0564 TPE KIX 330\n16:05 20:00 20:30 07:30";
// $matches[1] receives one captured block per date
preg_match_all($re, $str, $matches);
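The captured blocks keep the whitespace that precedes the next date, along with any embedded line breaks. If one tidy line per date is wanted, a possible follow-up step looks like the sketch below. The shortened `$str` and the `$entries` array are assumptions for illustration, and collapsing whitespace is an optional extra, not part of the answer above.

```php
<?php
// Shortened roster, assumed for illustration.
$str = "31 Jul FR 0135 HKG MEL 33K 18:00 19:44 01 Aug SA\n06:40 07:10 11:10 02 Aug SU 0134 MEL HKG 33K 06:40 07:37";
$re  = '/(?=\d{2}\s+[A-Z][a-z]{2}\s+\w{2})(.+?)(?:(?=\d{2}\s+[A-Z][a-z]{2}\s+\w{2})|$)/s';

preg_match_all($re, $str, $matches);

$entries = [];
foreach ($matches[1] as $raw) {
    // Collapse runs of whitespace (including "\n") into single spaces,
    // then strip the leading and trailing space.
    $entries[] = trim(preg_replace('/\s+/', ' ', $raw));
}

print_r($entries);
// [0] => 31 Jul FR 0135 HKG MEL 33K 18:00 19:44
// [1] => 01 Aug SA 06:40 07:10 11:10
// [2] => 02 Aug SU 0134 MEL HKG 33K 06:40 07:37
```

The first two entries correspond to the two examples given in the question.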

Ideone Demo