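I have this string:
Available September 1, fully furnished self-contained 2 bedroom suite, just a 5 minute walk to UVIC.
Right now I use preg_match to extract the date from it. This is the regex: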
'/\bavailable\s(?P<date_available>[?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?|immediately]+[\s\d]+)[st|nd|rd|th]?/i'
Currently, this regex can extract the date from strings like these:
Available september 1st.
Available September 2nd
available september 3rd
available september 4th
available sept 1
Sample output:
Array
(
[0] => available September 1
[date_available] => September 1
[1] => September 1
)
But I can't find a way to extract the date when the string is one of these:
Available for september 1st.
Available in September 2nd
available since september 3rd
available at september 4th
Can anyone help me with this? Thanks.
Use a wildcard of a-z, 2 to 5 letters (which matches "on" and similar words):
$regex = '/\bavailable[ ]*(?:[a-z]{2,5})?[ ]*' .
    '(?P<date_available>immediately|now|' .
    '(?:(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?' .
    '|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?' .
    '|Sep(?:t(?:ember)?)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)' .
    '[ ]+[\d]+))' .
    //end <date_available>
    '(?:st|nd|rd|th)?/i';
Usage:
$lines = array(
    'Fully furnished self contained 2 bedroom suite just 5 minute walk to UVIC is available now.',
    'bedroom suite just 5 minute walk to UVIC is available on September 34.',
    'bedroom suite just 5 minute walk to somewhere is available on Apr 1.',
);
foreach ($lines as $line) {
    echo $line, "\n<br>\n";
    if (preg_match($regex, $line, $matches) === 1) {
        print_r($matches['date_available']);
    } else {
        echo "Does not match.";
    }
    echo "\n<br>\n";
}
The following works for all of your examples, although I haven't put in your "named subpatterns" because I don't know their exact syntax in PHP:
\bavailable\s+(?:(?:for|in|at|since)\s+)?((?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Sept(?:ember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\s+\d{1,2}(?:st|nd|rd|th)?)
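I actually couldn't get yours to work at all. It looks like you are trying to use character classes with square brackets [ ] where you need grouping and alternation with parentheses ( ).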
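To illustrate the difference, here is a minimal sketch that compares the bracketed form from the question with a grouped alternation. The variable names are only illustrative, and the patterns are anchored with ^ and $ purely to make the comparison obvious.

```php
<?php
// A character class matches exactly ONE character from the listed set.
// [st|nd|rd|th] is the set {s, t, |, n, d, r, h}, not four suffixes.
$class = '/^[st|nd|rd|th]$/';

// A non-capturing group with alternation matches one whole alternative.
$group = '/^(?:st|nd|rd|th)$/';

$classMatchesSuffix = preg_match($class, 'st'); // 0: 'st' is two characters
$classMatchesPipe   = preg_match($class, '|');  // 1: '|' is a literal inside [...]
$groupMatchesSuffix = preg_match($group, 'st'); // 1: 'st' is one of the alternatives
$groupMatchesPipe   = preg_match($group, '|');  // 0: '|' is not an alternative

var_dump($classMatchesSuffix, $classMatchesPipe, $groupMatchesSuffix, $groupMatchesPipe);
```

The same applies to the month list in the question: inside [ ] the month names collapse into a set of single characters, so the pattern no longer checks for actual month names.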
Based on your requirements, the following is probably the shortest I can get it:
$pattern = '/\bavailable\s+(?:(?:for|in|at|since)\s+)?((?:immediately|now)|(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Oct(?:ober)?|(?:Sept|Nov|Dec)(?:ember)?)\s+?\d{1,2}(?:st|nd|rd|th)?)/i';
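This doesn't include a named subpattern, because the desired match will always be in $matches[1]. However, if you want a named subpattern, you can always put one in: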
$pattern = '/\bavailable\s+(?:(?:for|in|at|since)\s+)?(?P<date_available>(?:immediately|now)|(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Oct(?:ober)?|(?:Sept|Nov|Dec)(?:ember)?)\s+?\d{1,2}(?:st|nd|rd|th)?)/i';
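As a quick check, the named-subpattern version can be run against the four strings from the question that were failing. This is only a sketch: the pattern is the one above, split across lines for readability, and $lines and $dates are illustrative names.

```php
<?php
// Same pattern as above, concatenated from shorter pieces.
$pattern = '/\bavailable\s+(?:(?:for|in|at|since)\s+)?'
    . '(?P<date_available>(?:immediately|now)|'
    . '(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?'
    . '|Aug(?:ust)?|Oct(?:ober)?|(?:Sept|Nov|Dec)(?:ember)?)'
    . '\s+?\d{1,2}(?:st|nd|rd|th)?)/i';

// The four inputs from the question that the original regex could not handle.
$lines = array(
    'Available for september 1st.',
    'Available in September 2nd',
    'available since september 3rd',
    'available at september 4th',
);

// Collect the named capture for each line, or null when nothing matches.
$dates = array();
foreach ($lines as $line) {
    $dates[] = preg_match($pattern, $line, $m) === 1 ? $m['date_available'] : null;
}

print_r($dates);
```

Each entry in $dates keeps the casing of the input, for example "september 1st" for the first line. Note that this pattern requires "Sept" for September, so a bare "Sep 1" would not match.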
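In response to @EthanB's earlier solution: it appears you are not capturing the ordinal suffix of the date (st, nd, rd, th). If that is the case and the suffix isn't needed, you can shorten the pattern by leaving it out altogether, since there is no point in trying to match anything after the date.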
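A small sketch of that trade-off, assuming the same month alternation as in the patterns above. The names $months, $prefix, $withSuffix and $withoutSuffix are made up for this illustration.

```php
<?php
// Shared pieces, so the two variants differ only in the ordinal suffix.
$months = '(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?'
    . '|Aug(?:ust)?|Oct(?:ober)?|(?:Sept|Nov|Dec)(?:ember)?)';
$prefix = '\bavailable\s+(?:(?:for|in|at|since)\s+)?';

// The capture group includes the optional ordinal suffix.
$withSuffix = '/' . $prefix . '(' . $months . '\s+\d{1,2}(?:st|nd|rd|th)?)/i';

// The pattern stops at the day number; the suffix is neither matched nor captured.
$withoutSuffix = '/' . $prefix . '(' . $months . '\s+\d{1,2})/i';

$line = 'available since september 3rd';
preg_match($withSuffix, $line, $a);
preg_match($withoutSuffix, $line, $b);

echo $a[1], "\n"; // september 3rd
echo $b[1], "\n"; // september 3
```

Which variant to use depends on whether the suffix is wanted in the captured value. The shorter one is enough if the result is going to be parsed as a date afterwards.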