How to extract a single line from a block of HTML code


Updated: 2023-09-27

My content is as follows:

<meta property="og:type" content="article" />
<meta property="og:url" content="http://website/fox/" />
<meta property="og:site_name" content="The Fox" />
<meta property="og:image" content="http://images.Fox.com/2014/09/foxandforset.gif?w=209" />
<meta property="og:title" content="Fox goes to forest" />

I need to extract a single line, the one containing meta property="og:image", so the result should be:

<meta property="og:image" content="http://images.Fox.com/2014/09/foxandforset.gif?w=209" />

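Extracting a "line" of HTML, or more generally parsing HTML with regular expressions, is fragile: the match breaks as soon as the attributes are reordered or the tag is split across lines. A more robust approach is to use an HTML parser, such as the one provided by PHP's DOM extension.

Example: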

$html = <<<'HTML'
<meta property="og:type" content="article" />
<meta property="og:url" content="http://website/fox/" />
<meta property="og:site_name" content="The Fox" />
<meta property="og:image" content="http://images.Fox.com/2014/09/foxandforset.gif?w=209" />
<meta property="og:title" content="Fox goes to forest" />
HTML;
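// Parse the markup into a DOM tree.
$dom = new DOMDocument();
$dom->loadHTML($html);
// Select every <meta> element whose property attribute is "og:image".
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//meta[@property="og:image"]');
foreach ($nodes as $node) {
    // Serialize the matched node back to HTML.
    echo $dom->saveHTML($node);
}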

Output:

<meta property="og:image" content="http://images.Fox.com/2014/09/foxandforset.gif?w=209">

^<meta property="og:image".*$

Try this. Set the m (multiline) and g (global) flags. See the demo.

http://regex101.com/r/hQ1rP0/48
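
For reference, here is a minimal sketch of how the regex above might be applied in PHP. It is an illustration under a few assumptions rather than part of the original answer: the markup is held in a variable named $html (a name chosen here), the input uses LF line endings, and each tag sits on its own line. PHP has no g flag, so preg_match_all is used to collect every match, while the m modifier lets ^ and $ match at line boundaries.

```php
<?php
$html = <<<'HTML'
<meta property="og:type" content="article" />
<meta property="og:url" content="http://website/fox/" />
<meta property="og:site_name" content="The Fox" />
<meta property="og:image" content="http://images.Fox.com/2014/09/foxandforset.gif?w=209" />
<meta property="og:title" content="Fox goes to forest" />
HTML;

// The m modifier makes ^ and $ match at the start and end of each line.
// preg_match_all stands in for the g flag by returning all matches.
preg_match_all('/^<meta property="og:image".*$/m', $html, $matches);

// $matches[0] holds the full text of every matching line.
foreach ($matches[0] as $line) {
    echo $line, PHP_EOL;
}
```

Unlike the DOM approach, this returns the line exactly as written in the source, including the trailing " />", but it stops matching if the attribute order changes or the tag wraps onto a second line.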