Regex: A pattern that searches for a string except when it is the content between a set of characters

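I'm looking for a regular expression that matches every occurrence of a specific substring (any mix of letters, digits, spaces, and symbols) in a string, except when the occurrence lies between two given sets of characters.

For example: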

This is a string that contains multiple <span class="string">occurrences</span> of the word string.

I want to be able to retrieve the first and last occurrences of the word string, but not the second one, because it sits between <span and </span>.

This should do it.

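$string = 'This is a "string" that contains multiple <span class="string">occurrences</span> of the word string.';
$target = 'string';
// Left branch: match a whole <tag>...</tag> element, then discard it with (*SKIP)(*FAIL).
// Right branch: capture the target wherever it appears outside such an element.
preg_match_all('~<.+?>.*?</.+?>(*SKIP)(*FAIL)|(' . preg_quote($target, '~') . ')~', $string, $matches);
echo 'Found: ' . count($matches[1]) . ' occurrences of ' . $target . '.';

Output:

Found: 2 occurrences of string.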

http://www.rexegg.com/regex-best-trick.html

Demo (with explanation): https://regex101.com/r/yG2dS3/1

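I also changed the string you provided slightly (the first string is now "string", in quotes). An earlier version of my regex handled that case by accident when it should not have, so I added a quoted occurrence outside the element to test it.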

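I'm not sure why the first ? and the text after it are highlighted in black rather than red, but this example works for me. You can also see it working here: http://sandbox.onlinephpfunctions.com/code/9c97f4c257bc8cb09f4da14db34727d27bde0181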
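To make the idea easier to reuse, here is a minimal sketch that wraps the same pattern in a function. The name countOutsideTags is hypothetical and not part of the answer above. It relies on the fact that preg_match_all returns the number of full matches, and that anything rejected by (*SKIP)(*FAIL) never counts as a match.

```php
<?php
// Hypothetical helper (not from the original answer): counts occurrences of
// $target that are not inside a <tag>...</tag> element.
function countOutsideTags(string $subject, string $target): int
{
    // Left branch consumes a whole element; (*SKIP)(*FAIL) rejects it and
    // resumes scanning after it. Right branch captures the literal target.
    // preg_quote() gets the '~' delimiter so it is escaped as well.
    $pattern = '~<.+?>.*?</.+?>(*SKIP)(*FAIL)|(' . preg_quote($target, '~') . ')~';

    return (int) preg_match_all($pattern, $subject);
}

$text = 'This is a "string" that contains multiple <span class="string">occurrences</span> of the word string.';
echo countOutsideTags($text, 'string'), PHP_EOL;               // 2
echo countOutsideTags('a.b axb <i>a.b</i>', 'a.b'), PHP_EOL;   // 1, the dot is matched literally
```

One limitation to keep in mind: the left branch pairs any opening tag with the next closing tag, so an unclosed tag such as <br> makes it skip everything up to the next </...>, including any target text in between.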

You can try this regex:

(<string(\s|\S)*?<\/string>)|(<\/?(\s|\S)*?>)

It does a good job of detecting the word "string" inside HTML elements. Try it at http://regexr.com/ to see how it works.
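One possible way to use this pattern from PHP, as a sketch only: with its escapes written as \s, \S and \/, remove everything the pattern matches and count the target in what is left. The ~ delimiters, the variable names, and the preg_replace plus substr_count combination are my assumptions, not something the answer prescribes.

```php
<?php
// The pattern from the answer, wrapped in ~ delimiters (an assumption).
$tagPattern = '~(<string(\s|\S)*?<\/string>)|(<\/?(\s|\S)*?>)~';
$string = 'This is a string that contains multiple <span class="string">occurrences</span> of the word string.';

// Assumed usage: delete whatever the pattern matches, then count the
// target in the remaining text.
$withoutTags = preg_replace($tagPattern, '', $string);
$count = substr_count($withoutTags, 'string');

echo $withoutTags, PHP_EOL; // This is a string that contains multiple occurrences of the word string.
echo $count, PHP_EOL;       // 2
```

Note that on this input only the tags themselves are removed. The text between <span ...> and </span> is kept, so a target appearing there would still be counted.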

Edit:

If you want to do this for every HTML element (such as <script>, <div id="hello">, etc.), you can use this:

<(\/*?)(?!(em|p|br\s*\/|strong))\w+?.+?>
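To see which tags this pattern actually picks up, here is a small sketch that runs it over a sample fragment. The sample HTML and the ~ delimiters are my own assumptions. The negative lookahead rejects tags whose names begin with em, p, br / or strong, so those are left alone and every other opening or closing tag is matched.

```php
<?php
// Pattern from the edit above, with backslash escapes and ~ delimiters.
$pattern = '~<(\/*?)(?!(em|p|br\s*\/|strong))\w+?.+?>~';

// Made-up sample input for illustration.
$html = '<p>Keep <em>this</em>, but match <div id="hello">that</div> and <script>x</script>.</p>';

preg_match_all($pattern, $html, $matches);

// $matches[0] holds the full matches:
// <div id="hello">, </div>, <script>, </script>
print_r($matches[0]);
```

Because the lookahead has no word boundary, any tag name that merely starts with one of the excluded names (for example <pre>, which starts with p) is skipped as well. It may be worth adding \b after the alternation if that matters for your markup.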