Regex prices <p> tag block out of html with php

Keywords: regex, php, html, price, tag | Updated: 2023-09-27

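I am trying to scrape a block of prices from a web page, and I want to match everything between the opening and closing paragraph tags that wrap the price. The problem is that in the HTML source the markup is spread over several lines with a lot of whitespace. Here is a sample of the output: http://pastebin.com/hfeuHqTN

This is what I am trying: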

$pricesClass = '/<p class="price-wrap">\n(.*)/';
preg_match_all($pricesClass, $page, $pricesMatches);

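How can I match the whole paragraph with the price-wrap class, up to and including the closing paragraph tag?

At the moment it only matches the first two lines: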

<p class="price-wrap"><strong class="product-price" itemprop="price">

I want to match the whole thing, e.g.

 <p class="price-wrap"><strong class="product-price" itemprop="price"> £120</strong> was&nbsp;<del>£186.00</del></p>

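Just use a proper HTML parser such as DOMDocument, then preg_replace with \s+ to remove the "whitespace characters" (space, tab, line feed, carriage return, vertical tab, form feed; with the /u modifier, Unicode separators as well):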

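$dom = new DOMDocument();
// Fetch the page and parse it as HTML
$dom->loadHTML(file_get_contents("http://thesite.com"));
$xpath = new DOMXpath($dom);
// Select every <p> whose class attribute is exactly "price-wrap"
foreach ($xpath->query("//p[@class='price-wrap']") as $pText){
    // Strip every run of whitespace from the paragraph's text
    echo preg_replace('/\s+/', "", $pText->textContent);
}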
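
The snippet above needs network access and removes all whitespace, so the words run together. As a rough, self-contained variation, the sketch below parses an inline string instead and collapses each whitespace run to a single space. The sample markup, the extractPrices helper name, and the XML encoding hint (a common workaround so that loadHTML reads the input as UTF-8) are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical sample markup, modeled on the fragment quoted in the question.
$html = <<<HTML
<html><body>
<p class="price-wrap"><strong class="product-price" itemprop="price">
        £120</strong>
    was&nbsp;<del>£186.00</del>
</p>
</body></html>
HTML;

// Hypothetical helper: returns the normalized text of every price-wrap paragraph.
function extractPrices(string $html): array
{
    $dom = new DOMDocument();
    // Real-world pages are rarely valid; collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    // Assumption: the input is UTF-8. The prefix hints that encoding to the parser.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $prices = [];
    foreach ($xpath->query("//p[@class='price-wrap']") as $p) {
        // With /u, \s also covers the non-breaking space produced by &nbsp;.
        $prices[] = trim(preg_replace('/\s+/u', ' ', $p->textContent));
    }
    return $prices;
}

print_r(extractPrices($html));
```

Note that @class='price-wrap' only matches when the attribute holds exactly that value; a paragraph carrying additional classes would need a contains()-style test instead.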

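
If a regex is still preferred, the pattern in the question stops early because "." does not match a newline by default. A minimal sketch, assuming the markup looks like the sample in the question: the s modifier lets "." cross line breaks, and the lazy .*? stops at the first closing </p>. The sample string is made up for illustration, and the pattern is brittle (it breaks if the attribute order or quoting changes), which is why the parser approach is usually safer.

```php
<?php
// Hypothetical sample input, modeled on the fragment quoted in the question.
$page = <<<HTML
<div>
<p class="price-wrap"><strong class="product-price" itemprop="price">
        £120</strong>
    was&nbsp;<del>£186.00</del>
</p>
</div>
HTML;

// "~" is used as the delimiter so the slash in </p> needs no escaping.
// s  : "." also matches newlines
// .*?: lazy, so the match ends at the first </p>
$pattern = '~<p class="price-wrap">(.*?)</p>~s';
preg_match_all($pattern, $page, $matches);

// $matches[0] holds the full paragraphs, $matches[1] the inner HTML.
foreach ($matches[1] as $inner) {
    // Drop the inner tags, decode entities such as &nbsp;, then collapse whitespace.
    $text = html_entity_decode(strip_tags($inner), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    echo trim(preg_replace('/\s+/u', ' ', $text)), PHP_EOL;
}
```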