Regex grab all text between brackets, and NOT in quotes

I'm trying to match all text between {brackets}, but not if it is inside quotes. For example:

$str = 'value that I {want}, vs value "I do {NOT} want" '

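My result should grab "want" but omit "NOT". I have searched Stack Overflow desperately for a regex that can do this, with no luck. I've seen answers that let me get the text between quotes, but none that get text that is outside quotes and inside brackets. Is this even possible?

If so, how would it be done?

So far, this is what I have: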

preg_match_all('/{([^}]*)}/', $str, $matches);

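Unfortunately, it grabs all the text inside brackets, including {NOT}.

Doing this in one go is quite tricky. I also wanted it to be compatible with nested brackets, so let's use a recursive pattern as well: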

("|').*?\1(*SKIP)(*FAIL)|\{(?:[^{}]|(?R))*\}

OK, let's explain this mysterious regex:

("|')                   # match either a single or a double quote and put it in group 1
.*?                     # match anything ungreedy until ...
\1                      # match what was matched in group 1
(*SKIP)(*FAIL)          # make it skip this match since it's a quoted set of characters
|                       # or
\{(?:[^{}]|(?R))*\}     # match a pair of brackets (even if they are nested)
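The annotated form above can be fed to PCRE almost as-is by using the `x` (extended) modifier, which ignores whitespace in the pattern and treats `#` as a comment marker. Here is a minimal sketch, assuming a nowdoc is used to hold the pattern so that no quote escaping is needed; the variable names and the extra `{a {b}}` test input are only for illustration:

```php
<?php
// The pattern from the explanation, written out with the x modifier.
// The comments must not contain the delimiter character (~).
$pattern = <<<'RE'
~
  (["'])            # group 1: an opening single or double quote
  .*?               # lazily consume characters ...
  \1                # ... up to the same kind of quote
  (*SKIP)(*FAIL)    # throw the quoted span away and do not retry inside it
|
  \{ (?: [^{}] | (?R) )* \}   # a balanced pair of braces, possibly nested
~x
RE;

$subject = 'value that I {want}, vs value "I do {NOT} want" and {a {b}}';
preg_match_all($pattern, $subject, $m);
print_r($m[0]); // expected: {want} and {a {b}}
```

Note that `(?R)` recurses into the whole pattern, so the quote branch is also active inside nested braces.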

Some PHP code:

$input = <<<INP
value that I {want}, vs value "I do {NOT} want".
Let's make it {nested {this {time}}}
And yes, it's even "{bullet-{proof}}" :)
INP;
preg_match_all('~("|\').*?\1(*SKIP)(*FAIL)|\{(?:[^{}]|(?R))*\}~', $input, $m);
print_r($m[0]);

Sample output:

Array
(
    [0] => {want}
    [1] => {nested {this {time}}}
)
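The matches above still include the surrounding braces, while the question asks for just "want". If nesting is not needed, one possible variation is to drop the recursion and add a capture group around the inner text. This is only a sketch under that assumption; the simplified pattern and variable names are mine, not part of the answer above:

```php
<?php
$str = 'value that I {want}, vs value "I do {NOT} want" ';

// Same quote-skipping trick, but the brace branch is non-recursive and
// captures the inner text in group 2 (group 1 is the quote character).
$pattern = '~(["\']).*?\1(*SKIP)(*FAIL)|\{([^{}]*)\}~';

preg_match_all($pattern, $str, $found);
print_r($found[2]); // expected: a single element, "want"
```

With nested input such as `{a {b}}`, this version would only capture the innermost `b`, so keep the recursive pattern if nesting matters.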

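Personally, I'd process this in two passes. The first strips out everything between double quotes, and the second pulls out the text you want.

Something like this perhaps: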

$str = 'value that I {want}, vs value "I do {NOT} want" ';
// Get rid of everything in between double quotes
$str = preg_replace("/\".*\"/U","",$str);
// Now I can safely grab any text between curly brackets
preg_match_all("/\{(.*)\}/U",$str,$matches);

Working example here: http://3v4l.org/SRnva
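For reuse, the two passes can be wrapped in a small function. The sketch below is an assumption-laden variant rather than the exact code above: `extractBraced` is a hypothetical name, and I swapped the ungreedy `.*` with the `U` flag for negated character classes, which behave the same on this input. Like the original, it handles double quotes only and does not deal with nested braces or escaped quotes.

```php
<?php
// Hypothetical helper: returns the text inside each {...} pair that is
// not located inside a double-quoted segment.
function extractBraced(string $str): array
{
    // Pass 1: remove every double-quoted segment, one pair at a time.
    $unquoted = preg_replace('/"[^"]*"/', '', $str);

    // Pass 2: capture the text inside each remaining pair of braces.
    preg_match_all('/\{([^{}]*)\}/', $unquoted, $matches);

    return $matches[1];
}

$result = extractBraced('value that I {want}, vs value "I do {NOT} want" and {this} "but {not} this"');
print_r($result); // expected: "want" and "this"
```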