Need to extract a date from a string using preg_match

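I have this string:

Fully furnished self contained 2 bedroom suite just 5 minute walk to UVIC is available September 1st.

Right now I am using preg_match to extract the date. Here is the regex: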

'/\bavailable\s(?P<date_available>[?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?|immediately]+[\s\d]+)[st|nd|rd|th]?/i'

Currently, this regex can extract the date from strings such as:

Available september 1st.
Available September 2nd
available september 3rd
available september 4th
available sept 1

Example output:

Array
(
    [0] => available September 1
    [date_available] => September 1
    [1] => September 1
)

But I can't find a way to extract the date when the string looks like this:

Available for september 1st.
Available in September 2nd
available since september 3rd
available at september 4th

Can anyone help me with this? Thanks.

Use a wildcard of A-Z, 2 to 5 letters (which matches "on" and similar words):

$regex = '/\bavailable[ ]*(?:[a-z]{2,5})?[ ]*' .
    '(?P<date_available>immediately|now|' .
    '(?:(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?' .
    '|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?' .
    '|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)' .
    '[ ]+[\d]+))' .
    //end <date_available>
    '(?:st|nd|rd|th)?/i';

Usage:

$lines = array(
    'Fully furnished self contained 2 bedroom suite just 5 minute walk to UVIC is available now.',
    'bedroom suite just 5 minute walk to UVIC is available on September 34.',
    'bedroom suite just 5 minute walk to somewhere is available on Apr 1.',
    );
foreach ($lines as $line) {
    echo $line, "\n<br>\n";
    if (preg_match($regex, $line, $matches) === 1) {
        print_r($matches['date_available']);
    } else {
        echo "Does not match.";
    }
    echo "\n<br>\n";
}

The following works for all of your examples, although I have not included your "named subpattern" because I don't know its exact syntax in PHP:

\bavailable\s+(?:(?:for|in|at|since)\s+)?((?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Sept(?:ember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\s+\d{1,2}(?:st|nd|rd|th)?)

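I actually couldn't get yours to work at all. It looks like you are trying to use character classes with square brackets [ ] where you need grouping and alternation with parentheses ( ).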
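To illustrate the difference, here is a minimal sketch (the variable names and sample strings are made up for the demonstration). A character class matches exactly one character from its set and treats `|` as a literal, while a group with alternation matches one whole alternative:

```php
<?php
// Character class: matches ONE character out of s, t, |, n, d, r, h.
$class = '/^[st|nd|rd|th]$/';
// Non-capturing group with alternation: matches "st", "nd", "rd" or "th".
$group = '/^(?:st|nd|rd|th)$/';

$samples = array('st', 'th', 's', '|', 'd');
foreach ($samples as $s) {
    // preg_match() returns 1 on a match and 0 otherwise.
    printf("%-2s class=%d group=%d\n", $s, preg_match($class, $s), preg_match($group, $s));
}
```

The two-letter suffixes only match the grouped version, whereas single characters such as `s` or even `|` match the character class.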

Based on your requirements, the following is probably the shortest I can get:

$pattern = '/\bavailable\s+(?:(?:for|in|at|since)\s+)?((?:immediately|now)|(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Oct(?:ober)?|(?:Sept|Nov|Dec)(?:ember)?)\s+?\d{1,2}(?:st|nd|rd|th)?)/i';

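This does not include a named subpattern, since the desired match will always be in $matches[1]. However, if you do want a named subpattern, you can always put one in: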

$pattern = '/\bavailable\s+(?:(?:for|in|at|since)\s+)?(?P<date_available>(?:immediately|now)|(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Oct(?:ober)?|(?:Sept|Nov|Dec)(?:ember)?)\s+?\d{1,2}(?:st|nd|rd|th)?)/i';
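As a quick check, the named-subpattern version can be run against the strings from the question. This is only a sketch: the `$lines` and `$found` variables are introduced here for illustration, and the last test string is an extra case added to exercise the "now" branch.

```php
<?php
// Named-subpattern version of the pattern, with backslashes written out.
$pattern = '/\bavailable\s+(?:(?:for|in|at|since)\s+)?(?P<date_available>(?:immediately|now)|(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Oct(?:ober)?|(?:Sept|Nov|Dec)(?:ember)?)\s+?\d{1,2}(?:st|nd|rd|th)?)/i';

// Strings from the question, plus one "now" case (an added example).
$lines = array(
    'Available for september 1st.',
    'Available in September 2nd',
    'available since september 3rd',
    'available at september 4th',
    'available sept 1',
    'is available now.',
);

// Collect the named capture for each line, or null when nothing matches.
$found = array();
foreach ($lines as $line) {
    if (preg_match($pattern, $line, $matches) === 1) {
        $found[] = $matches['date_available'];
    } else {
        $found[] = null;
    }
}
print_r($found);
```

Note that this pattern only accepts "Sept" or "September" for that month, so a bare "Sep 1" would not match unless the alternation is extended.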

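In response to @EthanB's earlier solution: you don't appear to capture the ordinal suffix (st, nd, rd, th) of the date. If that is intended and the suffix isn't needed, you can shorten the pattern by leaving it out, since there is no point in trying to match anything after the date.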
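A small sketch of that point, using a simplified pattern written for this illustration (the `$months` helper and its `Sep(?:t(?:ember)?)?` spelling are assumptions, not taken from either answer). When the suffix group sits outside the capture, dropping it leaves the captured date unchanged and only the full match differs:

```php
<?php
// Hypothetical month list used only for this comparison.
$months = 'Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?'
        . '|Aug(?:ust)?|Sep(?:t(?:ember)?)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?';

// Suffix matched after the capture, but not captured.
$withSuffix    = '/\bavailable\s+(?P<date_available>(?:' . $months . ')\s+\d{1,2})(?:st|nd|rd|th)?/i';
// Same pattern with the trailing suffix group removed.
$withoutSuffix = '/\bavailable\s+(?P<date_available>(?:' . $months . ')\s+\d{1,2})/i';

$line = 'available September 3rd';
preg_match($withSuffix, $line, $a);
preg_match($withoutSuffix, $line, $b);

// Both named captures hold the date without the ordinal suffix.
echo $a['date_available'], "\n";
echo $b['date_available'], "\n";
// Only the full match ($matches[0]) differs between the two patterns.
echo $a[0], "\n";
echo $b[0], "\n";
```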