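I have been thinking about this for quite a while but still haven't found a solution.
I'm working on a simple formatting scheme in which tags wrap a string in curly braces, with the tag name written directly before the opening brace. Tags should also be able to appear inside other braces.
The string: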
This is some random text, tag1{while this is inside a tag2{tag}}. This is some
other text tag2{also with a tag tag3{inside} of it}.
What I want now is the content of each of:
tag1{}
tag2{}
tag3{}
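I found others with a similar problem (finding matching brackets with a regular expression), but their questions focus on finding matching brackets inside other brackets, whereas mine is both that and finding multiple bracket groups in a longer text.
If the tags are always balanced, an expression like this gets the name and content of every tag, including nested ones.
\b(\w+)(?={((?:[^{}]+|{(?2)})*)})
Example: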
$str = "This is some random text, tag1{while this is inside a tag2{tag}}. This is some other text tag2{also with a tag tag3{inside} of it}.";
// Group 1 is the tag name; the lookahead captures the balanced content in group 2.
$re = '/\b(\w+)(?={((?:[^{}]+|{(?2)})*)})/';
preg_match_all($re, $str, $m);
echo "* Tag names:\n";
print_r($m[1]);
echo "* Tag content:\n";
print_r($m[2]);
Output:
* Tag names:
Array
(
[0] => tag1
[1] => tag2
[2] => tag2
[3] => tag3
)
* Tag content:
Array
(
[0] => while this is inside a tag2{tag}
[1] => tag
[2] => also with a tag tag3{inside} of it
[3] => inside
)
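If the names and contents are wanted as pairs rather than as two parallel arrays, the same pattern can be run with PREG_SET_ORDER. The following is a minimal sketch; the helper name list_tags and the 'name'/'content' keys are assumptions made for illustration and are not part of the answer above.

```php
<?php
// Hypothetical helper (not from the original answer): pairs each tag name
// with its content, using the lookahead-based pattern shown above.
function list_tags(string $str): array
{
    $re = '/\b(\w+)(?={((?:[^{}]+|{(?2)})*)})/';
    $pairs = [];
    if (preg_match_all($re, $str, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            // $m[1] is the tag name, $m[2] the balanced content from the lookahead.
            $pairs[] = ['name' => $m[1], 'content' => $m[2]];
        }
    }
    return $pairs;
}

print_r(list_tags('a{b c{d}}'));
```

Because the pattern requires a closing brace for every opening one, a tag whose braces are not balanced is skipped rather than reported with partial content.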
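I don't know whether there is a regex that can fetch all inner and outer tags in a single call, but you can take this regex from the linked question, /\{(([^\{\}]+)|(?R))*\}/, and iterate recursively over the results.
To make it clearer, I added your tag names and some named subpatterns to the regex: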
function search_tags($string, $recursion = 0) {
    $Results = array();
    // (?R) recurses into the whole pattern, so nested tags are consumed as part of the content.
    if (preg_match_all('/(?<tagname>[\w]+)\{(?<content>(([^\{\}]+)|(?R))*)\}/', $string, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            $Results[] = array('match' => $match[0], 'tagname' => $match['tagname'], 'content' => $match['content'], 'deepness' => $recursion);
            // Search the content again to collect the tags nested one level deeper.
            if ($InnerResults = search_tags($match['content'], $recursion+1)) {
                $Results = array_merge($Results, $InnerResults);
            }
        }
        return $Results;
    }
    return false;
}
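This returns an array of all matches, each holding the whole match, the tag name, the content between the braces, and a depth counter showing how many levels deep the match is nested inside other tags. I added another nesting level to your string for demonstration: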
$text = "This is some random text, tag1{while this is inside a tag2{tag}}. This is some other text tag3{also with a tag tag4{and another nested tag5{inside}} of it}.";
echo '<pre>'.print_r(search_tags($text), true).'</pre>';
The output is:
Array
(
[0] => Array
(
[match] => tag1{while this is inside a tag2{tag}}
[tagname] => tag1
[content] => while this is inside a tag2{tag}
[deepness] => 0
)
[1] => Array
(
[match] => tag2{tag}
[tagname] => tag2
[content] => tag
[deepness] => 1
)
[2] => Array
(
[match] => tag3{also with a tag tag4{and another nested tag5{inside}} of it}
[tagname] => tag3
[content] => also with a tag tag4{and another nested tag5{inside}} of it
[deepness] => 0
)
[3] => Array
(
[match] => tag4{and another nested tag5{inside}}
[tagname] => tag4
[content] => and another nested tag5{inside}
[deepness] => 1
)
[4] => Array
(
[match] => tag5{inside}
[tagname] => tag5
[content] => inside
[deepness] => 2
)
)
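The regular expression would be something like this:
tag[0-9]+\{[^\}]+
and you should replace the inner tags first.
I don't think there is any other way. You need to loop over each bracket.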
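One way to read "replace the inner tags first" is as a loop that repeatedly pulls out the innermost tags and swaps each for a placeholder, so that the enclosing tags become innermost on the next pass. The sketch below is only an illustration under several assumptions: the function name extract_inner_first and the #n# placeholder format are hypothetical, the tag name is generalized to \w+, the closing brace is added to the pattern, and the input text is assumed not to contain #n# sequences of its own.

```php
<?php
// Hypothetical helper: extracts innermost tags first and replaces each one
// with a "#n#" placeholder that points at its index in the result list.
function extract_inner_first(string $text): array
{
    $found = [];
    // Innermost tag: a word, then braces whose content has no further braces.
    $re = '/(\w+)\{([^{}]*)\}/';
    do {
        $text = preg_replace_callback($re, function (array $m) use (&$found) {
            $found[] = ['name' => $m[1], 'content' => $m[2]];
            return '#' . (count($found) - 1) . '#';
        }, $text, -1, $count);
    } while ($count > 0); // stop once a pass replaces nothing

    return $found;
}

print_r(extract_inner_first('t1{a t2{b}} t3{c}'));
```

With this approach the content of an outer tag contains placeholders rather than the original nested text, so the nested text has to be restored from the result list if it is needed verbatim.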
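// $input holds the text to scan.
$output = array();
$pos = 0;
// Find the next opening "tagN{", then count braces until the matching "}".
while (preg_match('/tag\d+\{/S', $input, $match, PREG_OFFSET_CAPTURE, $pos)) {
    $start = $match[0][1];
    // Resume the outer search just after this opening brace so nested tags are found too.
    $pos = $offset = $start + strlen($match[0][0]);
    $bracket = 1;
    while ($bracket !== 0 and preg_match('/\{|\}/S', $input, $found, PREG_OFFSET_CAPTURE, $offset)) {
        ($found[0][0] === '}') ? $bracket-- : $bracket++;
        $offset = $found[0][1] + 1;
    }
    // Store the whole tag, from its name through the matching closing brace.
    $output[] = substr($input, $start, $offset - $start);
}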
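The idea of looping over each bracket can also be written without any regular expression, as a single pass that keeps a stack of currently open braces. This is a different technique from the loop above, sketched here only as an illustration: the function name scan_tags and the result keys are assumptions, tag names are taken to be runs of letters, digits, and underscores directly before the brace, and unmatched closing braces are simply ignored.

```php
<?php
// Hypothetical helper: single pass over the string with a stack of open braces.
// Braces without a preceding name are tracked (so nesting stays consistent)
// but are not reported; they still count toward the depth value.
function scan_tags(string $input): array
{
    $stack = [];
    $tags = [];
    $len = strlen($input);
    for ($i = 0; $i < $len; $i++) {
        $ch = $input[$i];
        if ($ch === '{') {
            // Walk backwards to collect the name directly before the brace.
            $j = $i;
            while ($j > 0 && (ctype_alnum($input[$j - 1]) || $input[$j - 1] === '_')) {
                $j--;
            }
            $stack[] = ['name' => substr($input, $j, $i - $j), 'start' => $i + 1];
        } elseif ($ch === '}' && $stack) {
            $open = array_pop($stack);
            if ($open['name'] !== '') {
                $tags[] = [
                    'name'    => $open['name'],
                    'content' => substr($input, $open['start'], $i - $open['start']),
                    'depth'   => count($stack), // number of braces still open around this tag
                ];
            }
        }
    }
    return $tags;
}

print_r(scan_tags('t1{a t2{b}} t3{c}'));
```

Tags are reported in the order their closing braces are seen, so nested tags appear before the tags that enclose them.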