Regular expression match, extracting only the wanted segments of a string

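I'm trying to extract three segments from a string. As I'm not particularly good with regular expressions, I suspect what I've done could be done better.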

I'd like to extract the three ANYTHING_HERE parts of the following string:

SOMETEXT: ANYTHING_HERE (Old=ANYTHING_HERE, New=ANYTHING_HERE)

Some examples could be:

ABC: Some_Field (Old=, New=123)

ABC: Some_Field (Old=ABCde, New=1234)

ABC: Some_Field (Old=Hello World, New=Bye Bye World)

So the first example above would return the following matches:

$matches[0] = 'Some_Field';
$matches[1] = '';
$matches[2] = '123';

So far I have the following code:

preg_match_all('/^([a-z]*\:(\s?)+)(.+)(\s?)+\(old=(.+)\,(\s?)+new=(.+)\)/i',$string,$matches);

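The issue with the above is that it returns a match for every separate segment of the string. I'm not sure how to use a regular expression to check that the string is in the correct format without capturing and storing each match, if that makes sense.

So my question, if it isn't already clear, is: how can I retrieve just the segments I want from the above string?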

You don't need preg_match_all. You can use this preg_match call:

$s = 'SOMETEXT: ANYTHING_HERE (Old=ANYTHING_HERE1, New=ANYTHING_HERE2)';
if (preg_match('/[^:]*:\s*(\w*)\s*\(Old=(\w*),\s*New=(\w*)/i', $s, $arr))
   print_r($arr);

Output:

Array
(
    [0] => SOMETEXT: ANYTHING_HERE (Old=ANYTHING_HERE1, New=ANYTHING_HERE2
    [1] => ANYTHING_HERE
    [2] => ANYTHING_HERE1
    [3] => ANYTHING_HERE2
)
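
One thing to watch: the word-character class in the pattern above only accepts letters, digits and underscores, so a value such as "Hello World" would not match. As a minimal sketch (my own variation, not part of the answer above), the two value groups can be widened to negated character classes and the pattern anchored with ^ and $, then checked against the three samples from the question:

```php
<?php
// Sample strings taken from the question.
$samples = [
    'ABC: Some_Field (Old=, New=123)',
    'ABC: Some_Field (Old=ABCde, New=1234)',
    'ABC: Some_Field (Old=Hello World, New=Bye Bye World)',
];

// Assumed variation: [^,]* and [^)]* allow spaces (and empty values),
// while ^ and $ require the whole string to have the expected format.
$pattern = '/^[^:]*:\s*(\w+)\s*\(Old=([^,]*),\s*New=([^)]*)\)$/i';

$results = [];
foreach ($samples as $s) {
    if (preg_match($pattern, $s, $arr)) {
        // Drop $arr[0] (the full match) and keep only the three segments.
        $results[] = array_slice($arr, 1);
    }
}

print_r($results);
```

The array_slice call is what gives the zero-based layout asked for in the question, with the field name at index 0.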

if(preg_match_all('/([a-z]*)\:\s*.+\(Old=(.+),\s*New=(.+)\)/i',$string,$matches)) {
    print_r($matches);
}

Example:

$string = 'ABC: Some_Field (Old=Hello World,New=Bye Bye World)';

will match:

Array
(
    [0] => Array
        (
            [0] => ABC: Some_Field (Old=Hello World,New=Bye Bye World)
        )
    [1] => Array
        (
            [0] => ABC
        )
    [2] => Array
        (
            [0] => Hello World
        )
    [3] => Array
        (
            [0] => Bye Bye World
        )
)
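
The nested arrays above are the default layout of preg_match_all, which groups results by capture group. If the input holds several records, it may be easier to group by match instead using the PREG_SET_ORDER flag. The following is only a sketch under my own assumptions: one record per line, and a pattern that captures the field name rather than the leading label.

```php
<?php
// Three records in one string, one per line (samples from the question).
$text = implode("\n", [
    'ABC: Some_Field (Old=, New=123)',
    'ABC: Some_Field (Old=ABCde, New=1234)',
    'ABC: Some_Field (Old=Hello World, New=Bye Bye World)',
]);

// The m flag lets ^ and $ match at line boundaries; \n is excluded from
// the character classes so a match cannot run across lines.
$pattern = '/^[^:\n]*:[ \t]*(\w+)[ \t]*\(Old=([^,\n]*),[ \t]*New=([^)\n]*)\)$/mi';

// With PREG_SET_ORDER, $matches[i] holds everything for the i-th record.
$count = preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[0] is the full line; $m[1], $m[2], $m[3] are the captured segments.
    printf("%s: '%s' -> '%s'\n", $m[1], $m[2], $m[3]);
}
```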

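The problem is that you're using more parentheses than you need, so you capture more segments of the input than you want.

For example, each (\s?)+ segment should just be \s*

The regex you're looking for is:

[^:]+:\s*(.+)\s*\(old=(.*)\s*,\s*new=(.*)\)

In PHP: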

preg_match_all('/[^:]+:\s*(.+)\s*\(old=(.*)\s*,\s*new=(.*)\)/i',$string,$matches);
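
To illustrate the point about parentheses: when a group is needed only for structure, PCRE's non-capturing form (?:...) groups without storing anything. The sketch below is my own illustration rather than the asker's exact pattern. It compares a pattern that captures every part with one that captures only the three wanted segments.

```php
<?php
$string = 'ABC: Some_Field (Old=Hello World, New=Bye Bye World)';

// Every (...) is a capturing group, so six groups are stored
// in addition to the full match.
preg_match(
    '/^([a-z]*:\s*)(.+?)(\s*)\(old=(.*?),(\s*)new=(.*?)\)$/i',
    $string,
    $capturing
);

// (?:...) still groups and still validates the format,
// but adds nothing to the result array.
preg_match(
    '/^(?:[a-z]*:\s*)(.+?)(?:\s*)\(old=(.*?),(?:\s*)new=(.*?)\)$/i',
    $string,
    $nonCapturing
);

echo count($capturing), "\n";    // 7: full match plus six groups
echo count($nonCapturing), "\n"; // 4: full match plus three groups
print_r($nonCapturing);
```

In this particular case the groups can simply be dropped, as in the regex above; (?:...) is mainly useful when a quantifier or alternation has to apply to several tokens at once.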

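A useful tool can be found here: http://www.myregextester.com/index.php

The tool has an "Explain" checkbox (along with a "PHP" checkbox and an "i" flag checkbox, both of which you'll want to select) that produces a full explanation of the regex. For posterity, I've included that explanation below as well: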

NODE                     EXPLANATION
----------------------------------------------------------------------
(?i-msx:                 group, but do not capture (case-insensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  [^:]+                    any character except: ':' (1 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  :                        ':'
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .+                       any character except \n (1 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  \(                       '('
----------------------------------------------------------------------
  old=                     'old='
----------------------------------------------------------------------
  (                        group and capture to \2:
----------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \2
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  ,                        ','
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  new=                     'new='
----------------------------------------------------------------------
  (                        group and capture to \3:
----------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \3
----------------------------------------------------------------------
  \)                       ')'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
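
One consequence of "matching the most amount possible" is worth seeing in practice: a greedy .+ gives back only as many characters as it must, so the first group keeps the space in front of the opening parenthesis. As a small sketch (the lazy variant is my own addition, not part of the explanation above), compare the greedy pattern with one using lazy quantifiers:

```php
<?php
$string = 'ABC: Some_Field (Old=Hello World, New=Bye Bye World)';

// Greedy .+ backs off only as far as it must, so it keeps the space before "(".
preg_match('/[^:]+:\s*(.+)\s*\(old=(.*)\s*,\s*new=(.*)\)/i', $string, $greedy);

// Lazy .+? and .*? stop as early as possible and leave the whitespace to \s*.
preg_match('/[^:]+:\s*(.+?)\s*\(old=(.*?)\s*,\s*new=(.*?)\)/i', $string, $lazy);

var_dump($greedy[1]); // string(11) "Some_Field "
var_dump($lazy[1]);   // string(10) "Some_Field"
```

Calling trim() on the captured values would be another way to get the same result with the greedy pattern.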

How about something simpler, like this ^_^

[:=]\s*([\w\s]*)

Live demo
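
For reference, here is a rough sketch of how that short pattern could be used from PHP. It assumes preg_match_all, since each segment is a separate match, and it adds a trim step of my own because the character class also accepts the space after the field name. Note that this approach does not check that the string has the expected overall format.

```php
<?php
$string = 'ABC: Some_Field (Old=Hello World, New=Bye Bye World)';

// Each match starts at ":" or "=" and captures the run of word
// characters and whitespace that follows it.
preg_match_all('/[:=]\s*([\w\s]*)/', $string, $matches);

// The field name is followed by a space before "(", which [\w\s]
// also accepts, so trim the captured values.
$segments = array_map('trim', $matches[1]);

print_r($segments);
```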

:\s*([^(\s]+)\s*\(Old=([^,]*),New=([^)]*)

Live demo

Also, let me know if you need an explanation.
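
In case it helps, here is a sketch of how the pattern above might be annotated using PCRE's x flag, which ignores whitespace in the pattern and allows # comments. One assumption to flag: I have added \s* after the comma so that the samples from the question, which have a space before New=, also match.

```php
<?php
// The x flag lets the pattern span several lines with # comments.
$pattern = '/
    :\s*            # the colon after the label, then optional whitespace
    ([^(\s]+)       # group 1: field name, up to the next whitespace or "("
    \s*\(Old=       # optional whitespace, then the literal "(Old="
    ([^,]*)         # group 2: old value, everything up to the comma
    ,\s*New=        # the comma, optional whitespace (assumed), then "New="
    ([^)]*)         # group 3: new value, everything up to the closing ")"
/x';

$string = 'ABC: Some_Field (Old=ABCde, New=1234)';
if (preg_match($pattern, $string, $m)) {
    // Skip $m[0] (the full match) and print the three segments.
    print_r(array_slice($m, 1));
}
```

The negated classes such as [^,]* are what keep each group from running past its own delimiter, so no lazy quantifiers are needed here.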