XML Parsing - Unable to retrieve value of node

I am trying to work out why I cannot get the value of a node in my XML file. I am using the following PHP code to parse the file:

<?php
error_reporting(E_ALL); 
ini_set( 'display_errors','1');
libxml_use_internal_errors(true);
libxml_clear_errors();
// create the reader object
$reader = new XMLReader();
// reader the XML file.
$reader->open('test.xml');
// start reading the XML File.
while($reader->read()) {
    // take action based on the kind of node returned
   switch($reader->nodeType) {
       // read more http://uk.php.net/manual/en/class.xmlreader.php#xmlreader.constants.element
       case (XMLREADER::ELEMENT):
              // get the name of the node.
              $node_name = $reader->name;
              // move the pointer to read the next item
              $reader->read();
              // action based on the $node_name
              if ($node_name == 'PartNumber') {
                $reader->read();
                $data['PartNumber'] = $reader->value;
                var_dump($data);
              };
           break;
       case (XMLREADER::END_ELEMENT):
            // do something based on when the element closes.
            break;
   }
}
?>

Here is a sample of my XML data:

<Items>
  <Item MaintenanceType="C">
    <HazardousMaterialCode>N</HazardousMaterialCode>
    <ItemLevelGTIN GTINQualifier="UP">090127000380</ItemLevelGTIN>
    <PartNumber>0-1848-1</PartNumber>
    <BrandAAIAID>BBVL</BrandAAIAID>
    <BrandLabel>Holley</BrandLabel>
    <PartTerminologyID>5904</PartTerminologyID>
    <Descriptions>
      <Description MaintenanceType="C" DescriptionCode="DES" LanguageCode="EN">Street Carburetor</Description>
      <Description MaintenanceType="C" DescriptionCode="SHO" LanguageCode="EN">Crb</Description>
    </Descriptions>
    <Prices>
      <Pricing MaintenanceType="C" PriceType="JBR">
        <PriceSheetNumber>L30779-13</PriceSheetNumber>
        <CurrencyCode>USD</CurrencyCode>
        <EffectiveDate>2013-01-01</EffectiveDate>
        <Price UOM="PE">462.4600</Price>
      </Pricing>
      <Pricing MaintenanceType="C" PriceType="RET">
        <PriceSheetNumber>L30779-13</PriceSheetNumber>
        <CurrencyCode>USD</CurrencyCode>
        <EffectiveDate>2013-01-01</EffectiveDate>
        <Price UOM="PE">380.5500</Price>
      </Pricing>
      <Pricing MaintenanceType="C" PriceType="WD1">
        <PriceSheetNumber>L30779-13</PriceSheetNumber>
        <CurrencyCode>USD</CurrencyCode>
        <EffectiveDate>2013-01-01</EffectiveDate>
        <Price UOM="PE">314.4700</Price>
      </Pricing>
    </Prices>
    <ExtendedInformation>
      <ExtendedProductInformation MaintenanceType="C" EXPICode="CTO" LanguageCode="EN">US</ExtendedProductInformation>
      <ExtendedProductInformation MaintenanceType="C" EXPICode="NPC" LanguageCode="EN">A</ExtendedProductInformation>
      <ExtendedProductInformation MaintenanceType="C" EXPICode="HTS" LanguageCode="EN">8409914000</ExtendedProductInformation>
      <ExtendedProductInformation MaintenanceType="C" EXPICode="NAF" LanguageCode="EN">B</ExtendedProductInformation>
    </ExtendedInformation>
    <ProductAttributes>
      <ProductAttribute MaintenanceType="C" AttributeID="SKU" LanguageCode="EN">BBVL0-1848-1</ProductAttribute>
      <ProductAttribute MaintenanceType="C" AttributeID="ModDate" LanguageCode="EN">2012-12-31</ProductAttribute>
    </ProductAttributes>
    <Packages>
      <Package MaintenanceType="C">
        <PackageLevelGTIN>00090127000380</PackageLevelGTIN>
        <PackageUOM>EA</PackageUOM>
        <QuantityofEaches>1</QuantityofEaches>
        <Dimensions UOM="IN">
          <Height>7.5000</Height>
          <Width>11.0000</Width>
          <Length>12.2500</Length>
        </Dimensions>
        <Weights UOM="PG">
          <Weight>13.500</Weight>
          <DimensionalWeight>6.09</DimensionalWeight>
        </Weights>
      </Package>
    </Packages>
  </Item>
</Items>

A var_dump of $data shows the following:

array(1) { ["PartNumber"]=> string(0) "" }

No errors are reported.

Can anyone point me in the direction of what I have missed?

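If you are set on using XMLReader (for example because you have very large XML files), I normally suggest a library named XMLReaderIterator, which lets you concentrate on parsing the data rather than on reading the XML. For your example, it only takes a little code: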

require('xmlreader-iterators.php'); // https://github.com/hakre/XMLReaderIterator/tree/master/build/include
$xmlFile = "xmlreader-17187636.xml";
$reader = new XMLReader();
$reader->open($xmlFile);
/* @var $partNumbers XMLReaderNode[] */
$partNumbers = new XMLElementIterator($reader, 'PartNumber');
foreach($partNumbers as $partNumber) {
    echo " * ",  $partNumber->readString(), "\n";
}

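This shows how to use XMLElementIterator to iterate over all elements named PartNumber and then read their string value. The output in this example is:

 * 0-1848-1

because there is only one PartNumber element in the XML.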

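The example also shows that the XMLReader object is still available, so you can do anything you like with it inside the foreach. There are other iterators as well that let you query elements and attributes, and even run shallow XPath queries.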

The last time I suggested the library was in:

  • Parse/Scan through a 17gb xml file

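If you do not want to use the whole library, you can find the relevant code in the XMLReaderNode::readString() method. It also shows how to obtain the value in a backwards-compatible way, which makes your code more interoperable; that is one of the benefits of the library. See also XMLReader::readString().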
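As a rough illustration of the XMLReader::readString() route without the library, the sketch below collects the text of every PartNumber element. The inline XML string and the $partNumbers variable are assumptions made for the example; with a real file you would call open() instead of XML().

```php
<?php
// Hypothetical inline sample, trimmed from the question's data.
$xml = '<Items><Item MaintenanceType="C">'
     . '<PartNumber>0-1848-1</PartNumber>'
     . '<BrandLabel>Holley</BrandLabel>'
     . '</Item></Items>';

$reader = new XMLReader();
$reader->XML($xml); // for a file: $reader->open('test.xml');

$partNumbers = array();
while ($reader->read()) {
    // React only to the opening tag of a PartNumber element.
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'PartNumber') {
        // readString() returns the text content of the current element,
        // so there is no need to step onto the text node by hand.
        $partNumbers[] = $reader->readString();
    }
}
$reader->close();

var_dump($partNumbers);
```

Because readString() does not require moving the cursor manually, the surrounding while loop keeps visiting nodes in the usual order.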

You forgot to define $data = array();

<?php
error_reporting(E_ALL); 
ini_set( 'display_errors','1');
libxml_use_internal_errors(true);
libxml_clear_errors();
$data = array(); //notice this???
// create the reader object
$reader = new XMLReader();
// reader the XML file.
$reader->open('test.xml');
// start reading the XML File.
while($reader->read()) {
    // take action based on the kind of node returned
   switch($reader->nodeType) {
       // read more http://uk.php.net/manual/en/class.xmlreader.php#xmlreader.constants.element
       case (XMLREADER::ELEMENT):
              // get the name of the node.
              $node_name = $reader->name;
              // move the pointer to read the next item
              $reader->read();
              // action based on the $node_name
              if ($node_name == 'PartNumber') {
                $reader->read();
                $data['PartNumber'] = $reader->value;
                var_dump($data);
              };
           break;
       case (XMLREADER::END_ELEMENT):
            // do something based on when the element closes.
            break;
   }
}
?>
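One more thing that may be worth checking: PHP creates the array implicitly on the first `$data['PartNumber'] = ...` assignment, so the empty string probably has another cause. In the listing above, read() is called twice after the PartNumber start tag. With the sample data, the first call lands on the text node and the second on the closing tag, whose value is empty. A minimal sketch of the single-step variant, using a hypothetical inline XML string in place of test.xml:

```php
<?php
// Hypothetical inline sample standing in for test.xml.
$xml = '<Items><Item><PartNumber>0-1848-1</PartNumber></Item></Items>';

$reader = new XMLReader();
$reader->XML($xml);

$data = array();
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'PartNumber') {
        // One read() moves from <PartNumber> onto its text node.
        $reader->read();
        if ($reader->nodeType === XMLReader::TEXT) {
            $data['PartNumber'] = $reader->value;
        }
        // A second read() at this point would land on </PartNumber>,
        // an END_ELEMENT node whose value is an empty string.
    }
}
$reader->close();

var_dump($data);
```

The nodeType check guards against an empty element, where the node after the start tag would not be a text node.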

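If the XML document is not huge, why not use SimpleXMLElement / simplexml_load_string(), or even DOMDocument->loadXML(), and run XPath queries against them to get whatever nodes you want? XMLReader is meant for very large files that should be read sequentially rather than loaded in full.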
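For the DOMDocument variant, a minimal sketch could look like the following. The inline XML in $x and the $values array are assumptions for the example, not part of the original code.

```php
<?php
// Hypothetical inline sample, trimmed from the question's data.
$x = '<Items><Item>'
   . '<PartNumber>0-1848-1</PartNumber>'
   . '<BrandLabel>Holley</BrandLabel>'
   . '</Item></Items>';

$doc = new DOMDocument();
$doc->loadXML($x);

// DOMXPath runs XPath queries against the loaded document.
$xpath = new DOMXPath($doc);

$values = array();
foreach ($xpath->query('//PartNumber') as $node) {
    // nodeValue holds the text content of the matched element.
    $values[] = $node->nodeValue;
}

var_dump($values);
```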

Retrieving the values with simplexml and xpath:

$xml = simplexml_load_string($x); // assume XML in $x
$pns = $xml->xpath("//PartNumber");

$pns now contains an array of all <PartNumber> elements, each one a SimpleXMLElement object.

To retrieve only the first <PartNumber>, do:

$pn = $xml->xpath("//PartNumber")[0]; // with PHP >= 5.4

See it working: http://3v4l.org/I7eKQ
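If plain strings are needed rather than SimpleXMLElement objects, each result can be cast. The sketch below is self-contained; the second part number and the $partNumbers / $first variables are made up for illustration.

```php
<?php
// Hypothetical sample: the second PartNumber value is invented for the example.
$x = '<Items>'
   . '<Item><PartNumber>0-1848-1</PartNumber></Item>'
   . '<Item><PartNumber>0-0000-2</PartNumber></Item>'
   . '</Items>';

$xml = simplexml_load_string($x);

// xpath() returns SimpleXMLElement objects; strval() casts each to its text.
$partNumbers = array_map('strval', $xml->xpath('//PartNumber'));

// Guard the first-element access in case nothing matched.
$first = isset($partNumbers[0]) ? $partNumbers[0] : null;

var_dump($partNumbers, $first);
```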