Match within string based on 1st chars of 3 consecutive (qualifying) words

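I have a blob of text, like this:

with an Angel of Victory, across from the Plaza Hotel, on Fifty-ninth Street. I had promised to alert the skeptical Harrison to the work’s virtues, but we found that it is now hidden in a huge beige box, for a restoration & site. She was thrilled.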

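I am trying to write a regex that lets me search (and replace) within this text, where a match is made on the first letters of three consecutive words (case-insensitive, and in the given order).

For example: say I have these 3 characters: v, a, f

Applying the magic regex to the sample would return Victory, across from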

Further examples:

  • s i h would match Street. I had
  • s w t would match She was thrilled
  • t w v would match the work’s virtues
  • a r s would match a restoration & site

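The fourth example may be too complicated, since it essentially requires ignoring words that do not start with an alphabetic character while still including them in any result.

Once the match is returned, I plan to use it to replace text in the larger sample.

I am also open to non-regex solutions.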

<?php
$content = 'with an Angel of Victory, across from the Plaza Hotel, on Fifty-ninth Street. I had promised to alert the skeptical Harrison to the work’s virtues, but we found that it is now hidden in a huge beige box, for a restoration & site. She was thrilled.';
$characters = ['v', 'a', 'f'];
$patterns = [];
foreach ($characters as $character) {
    // One capture group per letter: the letter followed by any non-whitespace characters.
    $patterns[] = sprintf('(%s[^\s]*)', preg_quote($character, '~'));
}
// Join the groups with a single whitespace character and anchor on word boundaries.
$regex = sprintf('~\b%s\b~i', implode('\s', $patterns));
preg_match($regex, $content, $matches);
print_r($matches);

I'm sure there are better ways of doing this. Here, you end up with an expression like

\b        # word boundary
(v[^\s]*) # a word starting with v, up to the next whitespace
\s        # a single whitespace character
(a[^\s]*) # a word starting with a, up to the next whitespace
\s        # a single whitespace character
(f[^\s]*) # a word starting with f, up to the next whitespace
\b        # word boundary

which should give you something like

Array
(
    [0] => Victory, across from
    [1] => Victory,
    [2] => across
    [3] => from
)
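
Since the question mentions replacing the matched text afterwards, here is a minimal sketch of how the same pattern could feed `preg_replace_callback`. The helper name `buildInitialsRegex`, the shortened sample text, and the square-bracket replacement are assumptions made for illustration only; they are not part of the original answer.

```php
<?php
// Hypothetical helper: builds the same kind of pattern as above from a list of initials.
function buildInitialsRegex(array $initials): string
{
    $parts = [];
    foreach ($initials as $initial) {
        // Each part matches one whitespace-delimited word starting with the given initial.
        $parts[] = sprintf('(%s[^\s]*)', preg_quote($initial, '~'));
    }
    return sprintf('~\b%s\b~i', implode('\s', $parts));
}

// Shortened sample text (an assumption, to keep the example small).
$content = 'an Angel of Victory, across from the Plaza Hotel.';
$regex = buildInitialsRegex(['v', 'a', 'f']);

// Wrap the match in square brackets; the limit of 1 replaces only the first match.
$result = preg_replace_callback(
    $regex,
    function (array $m): string {
        return '[' . $m[0] . ']';
    },
    $content,
    1
);

echo $result, PHP_EOL; // an Angel of [Victory, across from] the Plaza Hotel.
```

The callback receives the same array that `preg_match` fills in, so `$m[1]` to `$m[3]` are available if each word needs to be treated separately.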

Regex for the fourth scenario (I'll leave it to you to break down):

~\b(a[^\s]*)\s(&[a-z]+;\s*)?(r[^\s]*)\s(&[a-z]+;\s*)?(s[^\s]*)\s(&[a-z]+;\s*)?
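
As a possible generalization of the fourth scenario, the sketch below builds the pattern for any list of initials and lets whitespace-delimited tokens that do not start with a letter (a bare `&`, or an entity such as `&amp;`) sit between the qualifying words. This is a looser, swapped-in rule rather than the entity-only `&[a-z]+;` check used above, and the helper name `buildSkippingRegex` is hypothetical. Note that tokens starting with a digit would also be skipped under this rule.

```php
<?php
// Hypothetical helper (not from the answer above): like the per-letter pattern,
// but tokens that do not start with a letter may sit between qualifying words.
function buildSkippingRegex(array $initials): string
{
    $words = [];
    foreach ($initials as $initial) {
        // A qualifying word: the initial followed by any non-whitespace characters.
        $words[] = sprintf('(%s\S*)', preg_quote($initial, '~'));
    }
    // Separator: whitespace, then zero or more tokens that start with a
    // non-letter (such as "&" or "&amp;"), each followed by whitespace.
    $separator = '\s+(?:[^a-z\s]\S*\s+)*';
    return '~\b' . implode($separator, $words) . '\b~i';
}

// Shortened sample text (an assumption, to keep the example small).
$content = 'hidden in a huge beige box, for a restoration & site. She was thrilled.';
preg_match(buildSkippingRegex(['a', 'r', 's']), $content, $matches);
echo $matches[0], PHP_EOL; // a restoration & site
```

Because the skipped tokens are consumed by the separator rather than by a capture group, they still appear in `$matches[0]` while `$matches[1]` to `$matches[3]` hold only the qualifying words.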