PHP regex preg_match_all() between [[ and ]]

I want to use preg_match_all() to extract the content between [[ and ]], but ignore [[[ and ]]]. For example, given this text:

$text = <<<TEXT
Some text going here
[[ 1. this is a text ]]
another text but multiple lines
[[ 2. this 
is a 
text ]]
This should be ignored, having 3 on the left
[[[ 3. this is a text ]]
This should be ignored, having 3 on the right
[[ 4. this is a text ]]]
This should be ignored, having 3 both on the left and right
[[[ 5. this is a text ]]]
This is the final sentence.
[[ 6. this is a text ]]
TEXT;
// Attempted pattern (does not give the desired result)
if (preg_match_all('/(?!<\[)(\[\[.*?\]\])(?!\[)/', $text, $tags, PREG_PATTERN_ORDER)) {
    $tags = $tags[0];
}
echo '<pre>';
print_r($tags);
echo '</pre>';

So only 1, 2 and 6 should be selected. However, the regex I tried above selects everything except 2, which is not what I expected.

You can use this pattern:

preg_match_all('~(?<!\[)\[\[(?!\[)([^]]*)]](?!])~', $text, $tags);

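Notes:
There is no need to specify PREG_PATTERN_ORDER, since it is the default for preg_match_all().
I added a capturing group for the content between the square brackets; you can remove it if you don't need it.
If square brackets are not allowed inside the tags, the pattern can be shortened to:

~(?<!\[)\[\[([^][]*)]](?!])~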
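
As a quick check, here is a minimal sketch that runs both patterns against a shortened version of the sample text from the question. The variable names ($sample, $withBrackets, $noBrackets, $contents) are only illustrative and not part of the original answer.

```php
<?php
// Shortened sample: only tags 1 and 6 have exactly two brackets on each side.
$sample = <<<TEXT
[[ 1. this is a text ]]
[[[ 3. this is a text ]]
[[ 4. this is a text ]]]
[[[ 5. this is a text ]]]
[[ 6. this is a text ]]
TEXT;

// Pattern that rejects a third bracket on either side.
preg_match_all('~(?<!\[)\[\[(?!\[)([^]]*)]](?!])~', $sample, $withBrackets);

// Shorter variant, usable when tags never contain square brackets.
preg_match_all('~(?<!\[)\[\[([^][]*)]](?!])~', $sample, $noBrackets);

// Group 1 holds the text between the brackets; trim() drops the padding spaces.
$contents = array_map('trim', $withBrackets[1]);
print_r($contents);
```

On this sample both calls should return the same two tags, 1 and 6; tags 3, 4 and 5 are skipped.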

Here is a regex that does the job:

((?<!\[)\[\[([^\[][^\]]*)\]\](?!\]))

Regex101

Breakdown

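  • Not preceded by [
  • [[
  • Any character except [
  • Any character except ], zero or more times
  • ]]
  • Not followed by a ]

This should be bulletproof, except that it requires at least one character between [[ and ]].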
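
The breakdown above can be written directly into the pattern. The sketch below is the same regex spread out with the x (extended) modifier so each step carries a comment. The ~ delimiters, the x modifier, and the $sample string are my own additions for illustration.

```php
<?php
// Same pattern as above, laid out with the x modifier so that whitespace
// is ignored and "#" starts a comment inside the pattern.
$pattern = '~
    (                # group 1: the whole tag
        (?<!\[)      # not preceded by [
        \[\[         # literal [[
        (            # group 2: the text between the brackets
            [^\[]    #   one character that is not [
            [^\]]*   #   any character except ], zero or more times
        )
        \]\]         # literal ]]
        (?!\])       # not followed by ]
    )
~x';

// Hypothetical one-line sample: one valid tag, two rejected tags, one empty tag.
$sample = '[[ ok ]] [[[ no ]] [[ no ]]] [[]]';

preg_match_all($pattern, $sample, $m);
print_r($m[2]); // text between the brackets of each accepted tag
```

Only the first tag is accepted here. The empty [[]] at the end is not matched, which is the "at least one character" caveat mentioned above.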

Try:

preg_match_all('/(\A|[^[])\[{2}[^[](?<content>[^]]+)[^]]\]{2}([^]]|\z)/s', ...)

http://regex101.com/r/jC2mM0

http://codepad.viper-7.com/bbs3oR

Array
(
    [0] => Array
        (
            [0] => 
[[ 1. this is a text ]]
            [1] => 
[[ 2. this 
is a 
text ]]
            [2] => 
[[ 6. this is a text ]]
        )
    [1] => Array
        (
            [0] => 1. this is a text
            [1] => 2. this 
is a 
text
            [2] => 6. this is a text
        )
    [2] => Array
        (
            [0] => 
            [1] => 
            [2] => 
        )
)
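
For reference, a minimal sketch of how the named group from the pattern in this answer could be read back. The $sample string is made up for illustration, and the key 'content' comes from the (?<content>...) group in the pattern.

```php
<?php
// Hypothetical sample: tags "a" and "d" are valid, "b" and "c" have a third bracket.
$sample = "[[ a ]]\n[[[ b ]]\n[[ c ]]]\n[[ d ]]";

$pattern = '/(\A|[^[])\[{2}[^[](?<content>[^]]+)[^]]\]{2}([^]]|\z)/s';

preg_match_all($pattern, $sample, $m);

// Named groups are exposed under their name as well as their numeric index.
print_r($m['content']);
```

Note that this pattern consumes one character on each side of a tag instead of using lookarounds, so two valid tags separated by only a single character (for example one newline) would not both be matched.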