PHP DOMDocument: How to parse custom XML/RSS tag names with colons?

I have an RSS feed to parse that looks something like this:

<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:x-wr="http://www.w3.org/2002/12/cal/prod/Apple_Comp_628d9d8459c556fa#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:x-example="http://www.example.com/rss/x-example" xmlns:x-microsoft="http://schemas.microsoft.com/x-microsoft" xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
    <channel>
        <item>
            <title>About Apples</title>
            <author>David K. Lowie</author>
            <description>Some description about apples</description>
            <xCal:description>This is the full description about apples</xCal:description>
        </item>
        <item>
            <title>About Oranges</title>
            <author>Marry L. Jones</author>
            <description>Some description about oranges</description>
            <xCal:description>This is the full description about oranges</xCal:description>
        </item>
    </channel>
</rss>

In PHP, I parse it as follows:

$rss = new DOMDocument();
$rss->load( "http://www.example.com/books.rss" );
foreach( $rss->getElementsByTagName("item") as $node ) {
    echo $node->getElementsByTagName("title")->item(0)->nodeValue;
    echo $node->getElementsByTagName("author")->item(0)->nodeValue;
    echo $node->getElementsByTagName("description")->item(0)->nodeValue;
    echo $node->getElementsByTagName("xCal:description")->item(0)->nodeValue;
}

I can read every node except xCal:description. (The two node names are nearly the same: description and xCal:description.)

  1. How do I parse (read) xCal:description?
  2. Is it failing because the node names are so similar, i.e. description and xCal:description?

(I cannot change the RSS source, as it is not under my control.)

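The colon separates a namespace prefix (xCal) from the local name (description), so the element has to be looked up by its namespace URI. Use getElementsByTagNameNS():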

$node->getElementsByTagNameNS("urn:ietf:params:xml:ns:xcal", "description")->item(0)->nodeValue
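
As a self-contained sketch of the same call: the inline XML below is a shortened stand-in for the feed (an assumption, so the example runs without the remote URL), and the names $xml, $xcalNs and $full are my own. The point is that the namespace URI from the xmlns:xCal declaration identifies the element, not the xCal prefix itself.

```php
<?php
// Shortened stand-in for the feed in the question (assumption: inline XML instead of the remote URL).
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
    <channel>
        <item>
            <title>About Apples</title>
            <description>Some description about apples</description>
            <xCal:description>This is the full description about apples</xCal:description>
        </item>
    </channel>
</rss>
XML;

$rss = new DOMDocument();
$rss->loadXML($xml);

// Must match the URI in the xmlns:xCal declaration exactly.
$xcalNs = 'urn:ietf:params:xml:ns:xcal';

$full = [];
foreach ($rss->getElementsByTagName('item') as $node) {
    $match = $node->getElementsByTagNameNS($xcalNs, 'description')->item(0);
    // item(0) returns null when no such element exists, so guard before reading nodeValue.
    $full[] = $match !== null ? $match->nodeValue : '';
}

echo implode("\n", $full), "\n";
```

The null check is worth keeping in real code, since not every item in a feed is guaranteed to carry the namespaced element.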

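While the namespace-aware variants of the DOM methods are the correct answer, you might want to take a look at XPath. It is a more comfortable way to fetch data from a DOM.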

For XPath expressions, you can register your own prefixes for the namespaces as needed.

$rss = new DOMDocument();
$rss->load("http://www.example.com/books.rss");
$xpath = new DOMXPath($rss);
// 'xc' is our own alias for the xCal namespace URI
$xpath->registerNamespace('xc', 'urn:ietf:params:xml:ns:xcal');
foreach($xpath->evaluate("//item") as $item) {
    echo $xpath->evaluate('string(title)', $item), "\n";
    echo $xpath->evaluate('string(author)', $item), "\n";
    echo $xpath->evaluate('string(description)', $item), "\n";
    echo $xpath->evaluate('string(xc:description)', $item), "\n";
}

Output:

About Apples
David K. Lowie
Some description about apples
This is the full description about apples
About Oranges
Marry L. Jones
Some description about oranges
This is the full description about oranges
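
The prefix passed to registerNamespace() is only an alias for the URI, so it does not have to match the xCal prefix used in the feed. Here is a minimal sketch of that, assuming a shortened inline copy of the feed in which the second item deliberately lacks the namespaced element; the prefix cal and the names $xml and $rows are my own choices.

```php
<?php
// Shortened stand-in for the feed (assumption); the second item has no xCal:description.
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
    <channel>
        <item>
            <title>About Apples</title>
            <description>Some description about apples</description>
            <xCal:description>This is the full description about apples</xCal:description>
        </item>
        <item>
            <title>About Oranges</title>
            <description>Some description about oranges</description>
        </item>
    </channel>
</rss>
XML;

$rss = new DOMDocument();
$rss->loadXML($xml);

$xpath = new DOMXPath($rss);
// 'cal' differs from the document's 'xCal' prefix; only the URI has to match.
$xpath->registerNamespace('cal', 'urn:ietf:params:xml:ns:xcal');

$rows = [];
foreach ($xpath->evaluate('//item') as $item) {
    $rows[] = [
        'title' => $xpath->evaluate('string(title)', $item),
        // An unprefixed name matches only elements in no namespace.
        'short' => $xpath->evaluate('string(description)', $item),
        // string() yields '' when the node set is empty, so a missing element is harmless.
        'full'  => $xpath->evaluate('string(cal:description)', $item),
    ];
}

print_r($rows);
```

Because string() returns an empty string for a missing node, this form avoids the null check that item(0) needs with the plain DOM methods.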