preg_replace replacing incorrectly

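I got some help from a friend (he is on holiday at the moment), but I have a problem with a preg_replace search and replace. I don't know why, but it replaces the string incorrectly, which has a knock-on effect on the next string it should replace.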

This basically handles 'if' and 'else' queries inside a template class.

function if_statement($a, $b, $if, $type, $else = NULL){
    if($type == "1" && is_numeric($a) && is_numeric($b)){
        $statement = ($a === $b) ? $if : $else;
    } else if($type == "1"){
        $statement = ($a == $b) ? $if : $else;
    } else if($type == "2"){
        $statement = ($a != $b) ? $if : $else;
    }
    return stripslashes($statement);
}
$output = file_get_contents("template.tpl");
$replace = array(
  '#\<if:"\'(.*?)\' == \'(.*?)\'"\>(.*?)\<else\>(.*?)\<\/endif\>#sei',
  '#\<if:"\'(.*?)\' == \'(.*?)\'"\>(.*?)\<\/endif\>#sei'
);
$functions = array(
  "if_statement('\\1', '\\2', '\\3', '1', '\\4')",
  "if_statement('\\1', '\\2', '\\3', '1')"
);
$output = preg_replace($replace, $functions, $output);
echo $output;

Template:

<HTML>
    <head>
    <meta http-equiv="content-type" content="application/xhtml+xml; charset=UTF-8" />
    <title>Site Title</title>
    <link rel="stylesheet" type="text/css" media="screen" href="common.css" />
    <if:"'{ISADMIN}' == '1'">
        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    </endif>
</head>
<body>
    <if:"'{TODAY}' == 'Monday'">Today is Monday<else>Today is not Monday</endif>
    <if:"'1' == '2'">1 equals 2!<else>1 doesn't equal 2</endif>
</body>
</html>

The current output is below:

<HTML>
    <head>
    <meta http-equiv="content-type" content="application/xhtml+xml; charset=UTF-8" />
    <title>Site Title</title>
    <link rel="stylesheet" type="text/css" media="screen" href="common.css" />
        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    **</endif>**
</head>
<body>
    **<if:"'{TODAY}' == 'Monday'">**Today is Monday
    1 doesn't equal 2
</body>
</html>

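In the output above, the parts marked in bold (with asterisks) should not be there, and it isn't Monday today either. The admin-bar.css file is correctly included when an admin is logged in, but for some reason the </endif> tag is not picked up. In fact, it looks as if the match ran on to the <else> tag in the next statement instead... in other words, preg_replace matched the wrong thing! As a result, the second <if> statement is never found.

The {BRACKET} tags are being replaced correctly. I even put the data into the statements manually (just to check), so they aren't the problem...

I don't know why, but to me it looks as if preg_replace isn't finding the correct sequence to replace and act on. If anyone could lend a fresh pair of eyes or a hand, I'd appreciate it.

Thanks!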

The first <if> in your example has no <else> clause. So when the pattern <if:"'(.*?)' == '(.*?)'">(.*?)<else>(.*?)</endif> (in which <else> is not optional) is applied to it, it matches all of this:

    <if:"'{ISADMIN}' == '1'">
        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    </endif>
</head>
<body>
    <if:"'{TODAY}' == 'Monday'">Today is Monday<else>Today is not Monday</endif>

In that match, group $3 is:

        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    </endif>
</head>
<body>
    <if:"'{TODAY}' == 'Monday'">Today is Monday
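
The over-match can be reproduced in isolation. The following is a minimal sketch, assuming a cut-down copy of the template from the question; the e modifier is dropped so that the pattern is only matched and nothing is evaluated, and the variable names are illustrative only.

```php
<?php
// Cut-down copy of the question's template (assumption: only the two
// relevant <if> blocks are kept).
$template = <<<'TPL'
<head>
    <if:"'{ISADMIN}' == '1'">
        <link rel="stylesheet" href="admin-bar.css" type="text/css" media="all" />
    </endif>
</head>
<body>
    <if:"'{TODAY}' == 'Monday'">Today is Monday<else>Today is not Monday</endif>
</body>
TPL;

// The question's first pattern, minus the e modifier and the redundant escapes.
$pattern = '#<if:"\'(.*?)\' == \'(.*?)\'">(.*?)<else>(.*?)</endif>#si';

preg_match($pattern, $template, $m);

// Group 1 comes from the FIRST <if>, but group 4 comes from the SECOND one,
// because the lazy (.*?) in group 3 keeps growing until it finds an <else>.
echo "group 1: ", $m[1], "\n";
echo "group 4: ", $m[4], "\n";
echo "group 3 contains </endif>: ";
var_dump(strpos($m[3], '</endif>') !== false);
```

A lazy quantifier only prefers the shortest match; it does not stop the pattern from running past an </endif> when that is the only way to reach an <else>.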

You can avoid this by using lookahead assertions to forbid the regex from matching across an </endif>:

'%<if:\s*"\'([^\']*)\' == \'([^\']*)\'">((?:(?!<else>|</endif>).)*)<else>((?:(?!</endif>).)*)</endif>%si'

Or, in commented form (probably more useful the next time the programmer is "away on holiday"):

'%<if:\s*"\'     # Match <if:(optional space)"\'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 1
    \'\s==\s\'   # Match \' == \'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 2
    \'">         # Match \'">
    (            # Capture into group 3:
     (?:         # The following group...
      (?!        # only if we\'re not right before...
       <else>    # <else>
      |          # or
       </endif>  # </endif>
      )          # (End of lookahead assertion)
      .          # Match any character
     )*          # Repeat as necessary
    )            # End of capturing group 3
    <else>       # Match <else>
    (            # Same construction as above, group 4
     (?:
      (?!
       </endif>  # this time only looking for </endif>
      )
      .
     )*
    )
    </endif>     # and finally match </endif>
    %esix'

The second regex should be improved as well:

'%<if:\s*"\'     # Match <if:(optional space)"\'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 1
    \'\s==\s\'   # Match \' == \'
    ([^\']*)     # Match 0 or more non-quote characters, capture group 2
    \'">         # Match \'">
    (            # Capture into group 3:
     (?:
      (?!
       </endif>  # Any text until </endif>
      )
      .
     )*
    )
    </endif>     # and finally match </endif>
    %esix'

These regexes should also be faster, because they specify more precisely what can and cannot be matched, which avoids a lot of backtracking.
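
Note that the e modifier was deprecated in PHP 5.5 and removed in PHP 7, so on current PHP versions the replacement has to go through preg_replace_callback instead. The following is a minimal sketch of that approach using the two lookahead-based patterns; render_if is a hypothetical helper (not part of the original code) that mirrors the type "1" comparison from if_statement, and the sample template already has its placeholders filled in.

```php
<?php
// Hypothetical callback: mirrors the type "1" branch of if_statement().
// $m[1], $m[2] are the compared values, $m[3] the "if" body, and
// $m[4] the "else" body (absent for the pattern without <else>).
function render_if($m)
{
    $a = $m[1];
    $b = $m[2];
    $equal = (is_numeric($a) && is_numeric($b)) ? ($a === $b) : ($a == $b);
    return $equal ? $m[3] : (isset($m[4]) ? $m[4] : '');
}

$patterns = array(
    // <if ...> ... <else> ... </endif>
    '%<if:\s*"\'([^\']*)\' == \'([^\']*)\'">((?:(?!<else>|</endif>).)*)<else>((?:(?!</endif>).)*)</endif>%si',
    // <if ...> ... </endif>
    '%<if:\s*"\'([^\']*)\' == \'([^\']*)\'">((?:(?!</endif>).)*)</endif>%si',
);

// Sample input (assumption: {ISADMIN} and {TODAY} were already substituted).
$template = <<<'TPL'
<if:"'1' == '1'">[admin css]</endif>
<if:"'Tuesday' == 'Monday'">Today is Monday<else>Today is not Monday</endif>
<if:"'1' == '2'">1 equals 2!<else>1 doesn't equal 2</endif>
TPL;

$result = preg_replace_callback($patterns, 'render_if', $template);
echo $result, "\n";
// Prints:
// [admin css]
// Today is not Monday
// 1 doesn't equal 2
```

Because the callback receives the captured text directly rather than as evaluated code, the stripslashes call from the original function should no longer be needed.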