Splitting a string by multiple separators with preg_match in PHP

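The string consists of at most three parts: Writer, Director, and Producer. Let's call them "categories". Each category consists of two parts separated by a colon: Label : Names, where Label is one of the category names above and Names is a slash-separated list of names. For example: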

Writer : Jeffrey Schenck / Peter Sullivan / Director : Brian Trenchard-Smith / jack / Producer : smith

I want to split the string into its category names and their lists of names using the preg_match function. Here is what I have so far:

$pattern = '/Writer : (?P<Writer>[\s\S]+?)Director : (?P<Director>[\s\S]+?)Producer : (?P<Producer>[\s\S]+)/';
$sentence = 'Writer : Jeffrey Schenck / Peter Sullivan / Director : Brian Trenchard-Smith / jack / Producer : smith';
preg_match($pattern, $sentence, $matches);
foreach($matches as $cat => $match) {
  // Do more
  // echo "<b>" . $cat . "</b>" . $match . "<br />";
}

The script works well if the string contains exactly these three categories. If at least one category is missing, it fails.

One approach is to make each group optional with the ? quantifier:

$pattern = '/^' .
  '(?:Writer *: *(?P<Writer>[^:]+))?' .
  '(?:Director *: *(?P<Director>[^:]+))?' .
  '(?:Producer *: *(?P<Producer>[^:]+))?' .
  '$/';
preg_match($pattern, $sentence, $matches);

where (?:) creates a non-capturing group. Note that the output array is indexed both by numeric position and by name, for example:

Array
(
    [0] => Writer : Jeffrey Schenck / Peter Sullivan / Director : Brian Trenchard-Smith / jack / Producer : smith
    [Writer] => Jeffrey Schenck / Peter Sullivan / 
    [1] => Jeffrey Schenck / Peter Sullivan / 
    [Director] => Brian Trenchard-Smith / jack / 
    [2] => Brian Trenchard-Smith / jack / 
    [Producer] => smith
    [3] => smith
)
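
To check the case from the question where a category is missing, the same pattern can be run against a shorter string. The following is only a sketch: the helper name parseCredits, and the step that drops the numeric keys and trims each value, are assumptions added for illustration and are not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): applies the
// optional-group pattern and keeps only the named, non-empty captures.
function parseCredits(string $sentence): array
{
    $pattern = '/^' .
        '(?:Writer *: *(?P<Writer>[^:]+))?' .
        '(?:Director *: *(?P<Director>[^:]+))?' .
        '(?:Producer *: *(?P<Producer>[^:]+))?' .
        '$/';

    if (!preg_match($pattern, $sentence, $matches)) {
        return [];
    }

    $result = [];
    foreach ($matches as $key => $value) {
        // Skip numeric keys and groups that did not take part in the match
        // (PHP reports those as empty strings or leaves them out).
        if (is_string($key) && $value !== '') {
            // Strip surrounding spaces and the trailing slash separator.
            $result[$key] = trim($value, " /");
        }
    }
    return $result;
}

// Director is missing here; only Writer and Producer should be returned.
print_r(parseCredits('Writer : Jeffrey Schenck / Peter Sullivan / Producer : smith'));
```

With the Director part left out, the pattern still matches and the helper returns only the Writer and Producer entries.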

Another approach is to use preg_match_all with some additional processing:

$pattern = '/(?<=:)[^:]+/';
if (preg_match_all($pattern, $sentence, $matches)) {
  $keys = ['Writer', 'Director', 'Producer'];
  $a = [];
  for ($i = 0; $i < count($matches[0]); ++$i) {
    // The isset() checks are skipped for clarity's sake
    $a[$keys[$i]] = $matches[0][$i];
  }
  print_r($a);
}

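where (?<=:) is a positive lookbehind assertion for the : character. In this case the resulting array contains only the named keys. Note, however, that each value still begins with a space and, except for the last one, ends with the label of the next category, and that the keys are assigned by position, so the result is only correct when the categories appear in this order with none missing: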

Array
(
    [Writer] =>  Jeffrey Schenck / Peter Sullivan / Director 
    [Director] =>  Brian Trenchard-Smith / jack / Producer 
    [Producer] =>  smith
)
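
If the labels should be taken from the string itself rather than assigned by position, one possible variation is to capture each label together with its names. This is a sketch of a different pattern, not the method from the answer above; the function name splitCredits and the decision to return each category as an array of names are assumptions made for illustration.

```php
<?php
// Hypothetical helper (an assumption for illustration): returns an array
// mapping each label found in the string to its list of names.
function splitCredits(string $sentence): array
{
    // Capture a label, then lazily capture everything up to the next
    // label or the end of the string.
    $pattern = '/(?P<label>Writer|Director|Producer) *: *(?P<names>.*?)'
             . '(?=(?:Writer|Director|Producer) *:|$)/';

    $result = [];
    if (preg_match_all($pattern, $sentence, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            // Split the names on slashes and drop the empty trailing piece.
            $result[$m['label']] = preg_split(
                '~\s*/\s*~',
                trim($m['names']),
                -1,
                PREG_SPLIT_NO_EMPTY
            );
        }
    }
    return $result;
}

print_r(splitCredits('Writer : Jeffrey Schenck / Peter Sullivan / Producer : smith'));
```

Because each label is read from the match, a missing category simply does not appear in the result, and the order of the categories in the input does not matter. The sketch assumes that no name itself contains one of the labels followed by a colon.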