SimplePie regex errors

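I'm running the 1.3 development version of SimplePie on examplep with PHP 5.3.

I can fetch the RSS feed and display it, but I get the following errors for every item fetched:

Warning: preg_match() [function.preg-match]: Compilation failed: nothing to repeat at offset 562 in C:\examplep\htdocs\simplepie.php on line 5877

Warning: preg_match() [function.preg-match]: Compilation failed: nothing to repeat at offset 509 in C:\examplep\htdocs\simplepie.php on line 5965

Warning: preg_match() [function.preg-match]: Compilation failed: nothing to repeat at offset 509 in C:\examplep\htdocs\simplepie.php on line 6031

The functions where the errors occur: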

/**
 * Parse RFC2822's date format
 *
 * @access protected
 * @return int Timestamp
 */
public function date_rfc2822($date)
{
    static $pcre;
    if (!$pcre)
    {
        $wsp = '[\x09\x20]';
        $fws = '(?:' . $wsp . '+|' . $wsp . '*(?:\x0D\x0A' . $wsp . '+)+)';
        $optional_fws = $fws . '?';
        $day_name = $this->day_pcre;
        $month = $this->month_pcre;
        $day = '([0-9]{1,2})';
        $hour = $minute = $second = '([0-9]{2})';
        $year = '([0-9]{2,4})';
        $num_zone = '([+\-])([0-9]{2})([0-9]{2})';
        $character_zone = '([A-Z]{1,5})';
        $zone = '(?:' . $num_zone . '|' . $character_zone . ')';
        $pcre = '/(?:' . $optional_fws . $day_name . $optional_fws . ',)?' . $optional_fws . $day . $fws . $month . $fws . $year . $fws . $hour . $optional_fws . ':' . $optional_fws . $minute . '(?:' . $optional_fws . ':' . $optional_fws . $second . ')?' . $fws . $zone . '/i';
    }
    if (preg_match($pcre, $this->remove_rfc2822_comments($date), $match))

/**
 * Parse RFC850's date format
 *
 * @access protected
 * @return int Timestamp
 */
public function date_rfc850($date)
{
    static $pcre;
    if (!$pcre)
    {
        $space = '[\x09\x20]+';
        $day_name = $this->day_pcre;
        $month = $this->month_pcre;
        $day = '([0-9]{1,2})';
        $year = $hour = $minute = $second = '([0-9]{2})';
        $zone = '([A-Z]{1,5})';
        $pcre = '/^' . $day_name . ',' . $space . $day . '-' . $month . '-' . $year . $space . $hour . ':' . $minute . ':' . $second . $space . $zone . '$/i';
    }
    if (preg_match($pcre, $date, $match))

/**
 * Parse C99's asctime()'s date format
 *
 * @access protected
 * @return int Timestamp
 */
public function date_asctime($date)
{
    static $pcre;
    if (!$pcre)
    {
        $space = '[\x09\x20]+';
        $wday_name = $this->day_pcre;
        $mon_name = $this->month_pcre;
        $day = '([0-9]{1,2})';
        $hour = $sec = $min = '([0-9]{2})';
        $year = '([0-9]{4})';
        $terminator = '\x0A?\x00?';
        $pcre = '/^' . $wday_name . $space . $mon_name . $space . $day . $space . $hour . ':' . $min . ':' . $sec . $space . $year . $terminator . '$/i';
    }
    if (preg_match($pcre, $date, $match))

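The lines the errors point to are the final if expression of each function (you can see the full code here).

I think there is a bad regex in $pcre in each of these functions.

Thanks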

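The fixed parts of these regexes look fine to me and should compile.
However, $this->day_pcre and $this->month_pcre might contain metacharacters that break the regex. It's best to check them.

I substituted "Mon" and "Oct" for them and ran the code on Ideone. It seems to work.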
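To illustrate how a metacharacter in an interpolated fragment can break an otherwise valid pattern, here is a minimal sketch. The function name `pattern_compiles` and both fragment values are hypothetical; they are not taken from SimplePie, and the real contents of $this->day_pcre may look quite different.

```php
<?php
// Hypothetical helper: returns true if PCRE can compile the pattern.
// preg_match() returns false when compilation fails; the @ operator only
// hides the "Compilation failed" warning so the boolean result is easy to read.
function pattern_compiles($pcre)
{
    return @preg_match($pcre, '') !== false;
}

// Hypothetical stand-ins for $this->day_pcre.
$good_day = '(mon|tue|wed)';
$bad_day  = '(mon|+tue|wed)';   // a quantifier with nothing in front of it

var_dump(pattern_compiles('/^' . $good_day . ',/i'));
var_dump(pattern_compiles('/^' . $bad_day . ',/i'));
```

The same check could be run against the patterns built from the real $this->day_pcre and $this->month_pcre values to see which fragment is at fault.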

By the way, you might want to change $fws

from $fws = '(?:' . $wsp . '+|' . $wsp . '*(?:\x0D\x0A' . $wsp . '+)+)'
to $fws = '(?:(?:(?:\x0D\x0A)?' . $wsp . ')+)'

since the two are equivalent and the second may be more efficient.
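A quick way to spot-check that claim is to anchor both forms and compare them on a few inputs. This is only a sketch: the helper name `fws_agree` and the sample strings are assumptions made for illustration, and agreement on a handful of samples is not a proof of equivalence.

```php
<?php
$wsp = '[\x09\x20]';
$fws_old = '(?:' . $wsp . '+|' . $wsp . '*(?:\x0D\x0A' . $wsp . '+)+)';
$fws_new = '(?:(?:(?:\x0D\x0A)?' . $wsp . ')+)';

// Hypothetical helper: true if both fragments accept or reject every sample
// when forced to match the whole string. The D modifier stops "$" from
// matching just before a trailing newline.
function fws_agree($old, $new, array $samples)
{
    foreach ($samples as $s) {
        $a = preg_match('/^' . $old . '$/D', $s);
        $b = preg_match('/^' . $new . '$/D', $s);
        if ($a !== $b) {
            return false;
        }
    }
    return true;
}

// Folded and unfolded whitespace, plus a few strings neither form should accept.
$samples = array(" ", "\t ", "\r\n ", " \r\n\t", "\r\n \r\n ", "", "\r\n", " \r\n", "x");
var_dump(fws_agree($fws_old, $fws_new, $samples));
```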

Inside each function, you should print out the $day, $month, and $pcre regex variables. How else can you expect to debug it?
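As a rough sketch of that kind of debugging, the snippet below dumps the assembled pattern and shows the text on either side of a given offset. The helper `pattern_context` is hypothetical, as is the broken fragment. It assumes a single-character delimiter and that the offset in the warning counts from the first character after it, which is worth verifying on your own setup.

```php
<?php
// Hypothetical helper: show the pattern text on both sides of an offset.
// Assumes a one-character opening delimiter such as '/', and that the offset
// is counted from the character right after that delimiter.
function pattern_context($pcre, $offset, $radius = 20)
{
    $body  = substr($pcre, 1);               // drop the opening delimiter
    $start = max(0, $offset - $radius);
    return substr($body, $start, $offset - $start)
        . ' >>> ' . substr($body, $offset, $radius);
}

// Deliberately broken fragment standing in for $this->day_pcre (an assumption).
$day_name = '(mon|+tue)';
$pcre = '/^' . $day_name . ',/i';

var_dump($day_name, $pcre);                  // see exactly what was built
// The offset 6 is picked by hand here; in practice use the one from the warning.
echo pattern_context($pcre, 6, 5), "\n";
```

In the real functions, the same call with the offsets from the warnings (562 and 509) would show which part of $pcre the compiler is complaining about.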

It might be something else, I don't know.

Here is what I got: http://ideone.com/zJ5vE

Code

<?php
date_asctime( "Mon Oct 21 11:21:31 2012\x0A" );
date_asctime( "Mon Oct 22 12:22:32 2012\x0A" );
date_asctime( "Mon Oct 23 13:23:33 2012\x0A" );
print("==================\n");
date_rfc2822( 'Mon, 21 Oct 2012 21:01 -1011' );
date_rfc2822( 'Mon, 22 Oct 2012 22:02 -1012' );
date_rfc2822( 'Mon, 23 Oct 2012 23:03 -1013' );

/**
 * Parse C99's asctime()'s date format
 *
 * @access protected
 * @return int Timestamp
 */
function date_asctime($date)
{
    static $pcre;
    if (!$pcre)
    {
        $space = '[\x09\x20]+';
        $wday_name = 'Mon';  //$this->day_pcre;
        $mon_name = 'Oct';   //$this->month_pcre;
        $day = '([0-9]{1,2})';
        $hour = $sec = $min = '([0-9]{2})';
        $year = '([0-9]{4})';
        $terminator = '\x0A?\x00?';
        $pcre = '/^' . $wday_name . $space . $mon_name . $space . $day . $space . $hour . ':' . $min . ':' . $sec . $space . $year . $terminator . '$/i';
    }
    if (preg_match($pcre, $date, $match))
    {
       print_r($match);
    }
}

/**
 * Parse RFC2822's date format
 *
 * @access protected
 * @return int Timestamp
 */
function date_rfc2822($date)
{
    static $pcre;
    if (!$pcre)
    {
        $wsp = '[\x09\x20]';
        // $fws = '(?:' . $wsp . '+|' . $wsp . '*(?:\x0D\x0A' . $wsp . '+)+)';
        $fws = '(?:(?:(?:\x0D\x0A)?' . $wsp . ')+)';
        $optional_fws = $fws . '?';
        $day_name = 'Mon';  //$this->day_pcre;
        $month = 'Oct';     //$this->month_pcre;
        $day = '([0-9]{1,2})';
        $hour = $minute = $second = '([0-9]{2})';
        $year = '([0-9]{2,4})';
        $num_zone = '([+\-])([0-9]{2})([0-9]{2})';
        $character_zone = '([A-Z]{1,5})';
        $zone = '(?:' . $num_zone . '|' . $character_zone . ')';
        $pcre = '/(?:' . $optional_fws . $day_name . $optional_fws . ',)?' . $optional_fws . $day . $fws . $month . $fws . $year . $fws . $hour . $optional_fws . ':' . $optional_fws . $minute . '(?:' . $optional_fws . ':' . $optional_fws . $second . ')?' . $fws . $zone . '/i';
    }
    // if (preg_match($pcre, $this->remove_rfc2822_comments($date), $match))
    if (preg_match($pcre, $date, $match))
    {
       print_r($match);
    }
} 
?>

Output

Array
(
    [0] => Mon Oct 21 11:21:31 2012
    [1] => 21
    [2] => 11
    [3] => 21
    [4] => 31
    [5] => 2012
)
Array
(
    [0] => Mon Oct 22 12:22:32 2012
    [1] => 22
    [2] => 12
    [3] => 22
    [4] => 32
    [5] => 2012
)
Array
(
    [0] => Mon Oct 23 13:23:33 2012
    [1] => 23
    [2] => 13
    [3] => 23
    [4] => 33
    [5] => 2012
)
==================
Array
(
    [0] => Mon, 21 Oct 2012 21:01 -1011
    [1] => 21
    [2] => 2012
    [3] => 21
    [4] => 01
    [5] => 
    [6] => -
    [7] => 10
    [8] => 11
)
Array
(
    [0] => Mon, 22 Oct 2012 22:02 -1012
    [1] => 22
    [2] => 2012
    [3] => 22
    [4] => 02
    [5] => 
    [6] => -
    [7] => 10
    [8] => 12
)
Array
(
    [0] => Mon, 23 Oct 2012 23:03 -1013
    [1] => 23
    [2] => 2012
    [3] => 23
    [4] => 03
    [5] => 
    [6] => -
    [7] => 10
    [8] => 13
)