I am trying to get the data between these two divs:
--<div id="p_tab4" class="p_desc" style="display: block;">
--<div id="p_top_cats" class="p_top_cats">
I am using the regex below, but it returns nothing:
/<div id=\"p_tab4\" class=\"p_desc\" style=\"display: block;\">(.*?)<div id=\"p_top_cats\" class=\"p_top_cats\">/
How can I correct this regex?
Assuming your HTML is well-formed, like this:
<div id="p_tab4" class="p_desc" style="display: block;">...</div>
some stuff in between
<div id="p_top_cats" class="p_top_cats">
</div>
You can use DOMDocument and XPath:
$html = <<<'EOS'
<div id="p_tab4" class="p_desc" style="display: block;">...</div>
some stuff in between
<div id="p_top_cats" class="p_top_cats">
</div>
EOS;
$doc = new DOMDocument;
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$query = '//node()[preceding-sibling::div[@id="p_tab4"] and following-sibling::div[@id="p_top_cats"]]';
foreach ($xpath->query($query) as $node) {
    echo $node->textContent, PHP_EOL;
}
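The loop above prints only the text of each matched node. If the markup between the two divs is needed as well, one possible sketch is to serialize each node with DOMDocument::saveHTML(). The helper name htmlBetweenIds and the sample input are hypothetical and not part of the original answer, and the sketch assumes both elements are siblings and that the ids contain no double quotes.

```php
<?php
// Hypothetical helper (not from the original answer): returns the markup of
// every node that sits between two sibling elements identified by id.
function htmlBetweenIds(string $html, string $startId, string $endId): string
{
    $doc = new DOMDocument;
    // Collect libxml warnings internally instead of emitting them while parsing.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Assumption: the ids contain no double quotes, since they are pasted into the query.
    $query = sprintf(
        '//node()[preceding-sibling::*[@id="%s"] and following-sibling::*[@id="%s"]]',
        $startId,
        $endId
    );

    $out = '';
    foreach ($xpath->query($query) as $node) {
        // saveHTML($node) serializes that single node, tags included.
        $out .= $doc->saveHTML($node);
    }
    return $out;
}

// Made-up sample input with an element between the two divs.
$html = '<div id="p_tab4">...</div><p>some <b>stuff</b> in between</p><div id="p_top_cats"></div>';
$result = htmlBetweenIds($html, 'p_tab4', 'p_top_cats');
echo $result, PHP_EOL;
```

Matching on the id alone, rather than on the full attribute list, keeps the query working if the class or style attributes change.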
Your regex does not look wrong, but you need to turn on DOTALL mode with the s modifier, so that the dot in the regex also matches newlines.
~<div id=\"p_tab4\" class=\"p_desc\" style=\"display: block;\">(.*?)<div id=\"p_top_cats\" class=\"p_top_cats\">~s
Code:
$re = '~<div id=\"p_tab4\" class=\"p_desc\" style=\"display: block;\">(.*?)<div id=\"p_top_cats\" class=\"p_top_cats\">~s';
$str = "--<div id=\"p_tab4\" class=\"p_desc\" style=\"display: block;\">\n--<div id=\"p_top_cats\" class=\"p_top_cats\">";
// Only read the capture group if the pattern actually matched.
if (preg_match($re, $str, $matches)) {
    echo $matches[1];
}
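To see the effect of the s modifier in isolation, here is a small sketch that runs the same pattern with and without it. The sample string is made up for illustration and assumes the two divs are separated by line breaks, as they would be in a typical page source.

```php
<?php
// Hypothetical sample input: the two divs from the question, separated by line breaks.
$str = "<div id=\"p_tab4\" class=\"p_desc\" style=\"display: block;\">\n"
     . "some stuff in between\n"
     . "<div id=\"p_top_cats\" class=\"p_top_cats\">";

// The same pattern body is used twice; only the trailing modifier differs.
$body = '<div id="p_tab4" class="p_desc" style="display: block;">(.*?)'
      . '<div id="p_top_cats" class="p_top_cats">';
$withoutS = '~' . $body . '~';
$withS    = '~' . $body . '~s';

// Without s, "." cannot cross the newline, so the lazy group never reaches the second div.
$plainResult = preg_match($withoutS, $str, $plainMatches);

// With s, "." also matches "\n", so the group captures everything between the divs.
$dotallResult = preg_match($withS, $str, $dotallMatches);

var_dump($plainResult);   // int(0)
var_dump($dotallResult);  // int(1)
echo trim($dotallMatches[1]), PHP_EOL;  // some stuff in between
```

Inside a single-quoted PHP string the double quotes need no escaping, which is why this sketch drops the backslashes.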