How to write an XPath query with a default namespace and an attribute condition

I have XML like this:

<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
  <Worksheet ss:Name="Name1">
    something
  </Worksheet>
  <Worksheet ss:Name="Name2">
    something else
  </Worksheet>
</Workbook>

I need a query that returns the Worksheet element whose ss:Name attribute is Name1. Because of the default namespace, I have to write the first condition like this:

//*[name()="Worksheet"]

But I don't know how to add the attribute condition...

------- UPDATE ------- Since I couldn't find a solution, here is the whole XML file (generated by Excel):

<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
  <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
    <Author>Don diego</Author>
    <LastAuthor>Don diego</LastAuthor>
    <Created>2013-04-18T07:20:33Z</Created>
    <LastSaved>2013-04-18T07:20:33Z</LastSaved>
    <Company>CEI</Company>
    <Version>14</Version>
  </DocumentProperties>
  <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
    <AllowPNG/>
  </OfficeDocumentSettings>
  <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
    <WindowHeight>7740</WindowHeight>
    <WindowWidth>13395</WindowWidth>
    <WindowTopX>360</WindowTopX>
    <WindowTopY>30</WindowTopY>
    <ProtectStructure>False</ProtectStructure>
    <ProtectWindows>False</ProtectWindows>
  </ExcelWorkbook>
  <Styles>
    <Style ss:ID="Default" ss:Name="Normal">
      <Alignment ss:Vertical="Bottom"/>
      <Borders/>
      <Font ss:FontName="Calibri" x:CharSet="238" x:Family="Swiss" ss:Size="11" ss:Color="#000000"/>
      <Interior/>
      <NumberFormat/>
      <Protection/>
    </Style>
  </Styles>
  <Worksheet ss:Name="Sheet1">
    <Table ss:ExpandedColumnCount="1" ss:ExpandedRowCount="1" x:FullColumns="1" x:FullRows="1" ss:DefaultRowHeight="15"/>
    <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
      <PageSetup>
        <Header x:Margin="0.3"/>
        <Footer x:Margin="0.3"/>
        <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75"/>
      </PageSetup>
      <Selected/>
      <Panes>
        <Pane>
          <Number>3</Number>
          <ActiveCol>1</ActiveCol>
        </Pane>
      </Panes>
      <ProtectObjects/>
      <ProtectScenarios/>
    </WorksheetOptions>
  </Worksheet>
  <Worksheet ss:Name="Sheet2">
    <Table ss:ExpandedColumnCount="1" ss:ExpandedRowCount="1" x:FullColumns="1" x:FullRows="1" ss:DefaultRowHeight="15"/>
    <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
      <PageSetup>
        <Header x:Margin="0.3"/>
        <Footer x:Margin="0.3"/>
        <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75"/>
      </PageSetup>
      <Selected/>
      <Panes>
        <Pane>
          <Number>3</Number>
          <ActiveCol>1</ActiveCol>
        </Pane>
      </Panes>
      <ProtectObjects/>
      <ProtectScenarios/>
    </WorksheetOptions>
  </Worksheet>
  <Worksheet ss:Name="Sheet3">
    <Table ss:ExpandedColumnCount="1" ss:ExpandedRowCount="1" x:FullColumns="1" x:FullRows="1" ss:DefaultRowHeight="15"/>
    <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
      <PageSetup>
        <Header x:Margin="0.3"/>
        <Footer x:Margin="0.3"/>
        <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75"/>
      </PageSetup>
      <Selected/>
      <Panes>
        <Pane>
          <Number>3</Number>
          <ActiveCol>1</ActiveCol>
        </Pane>
      </Panes>
      <ProtectObjects/>
      <ProtectScenarios/>
    </WorksheetOptions>
  </Worksheet>
</Workbook>

I want to use XPath to get the Worksheet element whose ss:Name attribute is "Sheet1". Here is what I have:

$uri = $this->doc->getDocNamespaces()['']; // $this->doc is a SimpleXMLElement object
$this->doc->registerXPathNamespace('default', $uri); //'urn:schemas-microsoft-com:office:spreadsheet'
$current_worksheet = $this->doc->xpath('/*/default:Worksheet[@ss:Name = "Sheet1"]');
die(var_dump($current_worksheet));//empty array :(

Currently $current_worksheet is an empty array :( It looks like the default namespace is the same as the ss namespace (same URN)?

/*/ss:Worksheet[@ss:Name = "Name1"]

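You have two options. I'll start with the one I consider more correct. It makes use of namespaces. For it to work, you need to register each namespace prefix with its URI. There are two prefixes here, both bound to the same URI: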

Prefix: default
URI   : urn:schemas-microsoft-com:office:spreadsheet
Prefix: ss
URI   : urn:schemas-microsoft-com:office:spreadsheet

Then you can query:

/*/default:Worksheet[@ss:Name = "Name1"]

The second variant performs the same XPath query but ignores namespaces and compares local names only. It relies on local-name() and is more complicated:

/*/*[local-name()="Worksheet"][@*[local-name()="Name" and . = "Name1"]]
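
As a rough illustration of the local-name() variant, the following PHP sketch runs that query with SimpleXML against a trimmed copy of the question's document. The sample XML and the variable names ($string, $count, $text) are assumptions made for this example. No prefix is registered, because the expression never uses one.

```php
<?php
// Trimmed sample modeled on the question's XML (an assumption for this sketch).
$string = <<<XML
<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Name1">something</Worksheet>
  <Worksheet ss:Name="Name2">something else</Worksheet>
</Workbook>
XML;

$xml = simplexml_load_string($string);

// No registerXPathNamespace() call is needed: only local names are compared.
$result = $xml->xpath('/*/*[local-name()="Worksheet"][@*[local-name()="Name" and . = "Name1"]]');

$count = count($result);
$text  = trim((string) $result[0]);
echo $count, ' match: ', $text, "\n";
```

Because only local names are compared, this expression would also match a Worksheet element or a Name attribute from any other namespace.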

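As you can see, the first variant is preferable because it is more readable. It is also more precise, because it identifies each element by its namespace-qualified name rather than by local name alone.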

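Here is a short example of how to register an XML namespace prefix so that it can be used with xpath. This is necessary because the default namespace is non-empty: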

$xml = simplexml_load_string($string);
$uri = $xml->getDocNamespaces()[''];
$xml->registerXPathNamespace('default', $uri);
$result = $xml->xpath('/*/default:Worksheet[@ss:Name = "Name1"]');
echo trim($result[0]), "\n"; # something

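Worth remembering: just like every element, every attribute can have its own namespace. An attribute does not automatically take on its element's namespace, and an unprefixed attribute is not placed in the document's default namespace either.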
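
To make the point about attribute namespaces concrete, here is a small sketch. The Plain attribute is hypothetical and added only for contrast. The sketch assumes SimpleXML resolves the ss prefix because the document declares it, which the example above also relies on.

```php
<?php
// Hypothetical document: one prefixed attribute (ss:Name), one unprefixed (Plain).
$string = <<<XML
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Name1" Plain="x"/>
</Workbook>
XML;

$xml = simplexml_load_string($string);

// ss:Name lives in the spreadsheet namespace.
$prefixedAttr = count($xml->xpath('/*/ss:Worksheet[@ss:Name]'));
// Plain has no prefix, so it is in no namespace at all.
$plainAttr = count($xml->xpath('/*/ss:Worksheet[@Plain]'));
// Plain is NOT in the spreadsheet namespace, although its element is.
$plainInSs = count($xml->xpath('/*/ss:Worksheet[@ss:Plain]'));

echo $prefixedAttr, ' ', $plainAttr, ' ', $plainInSs, "\n";
```

The element matches as ss:Worksheet because the default namespace and the ss prefix share one URI, while the unprefixed attribute only matches without a prefix.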

Select the "Worksheet" element whose ss:Name attribute has the value "Name1".

//Worksheet[@ss:Name='Name1']
/x:Workbook/x:Worksheet[@ss:Name='Name1']

In the calling application, bind the namespace prefixes "x" and "ss" to the appropriate namespace URIs, using whatever API you use to run XPath.
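
In PHP, one way to bind the prefixes explicitly is DOMXPath::registerNamespace(). The sketch below assumes both x and ss are bound to the spreadsheet URI, as the second expression implies, and uses a trimmed sample document. Passing false as the third argument to query() is a choice made here so that only the explicitly registered bindings apply.

```php
<?php
// Trimmed sample modeled on the question's XML (an assumption for this sketch).
$string = <<<XML
<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Name1">something</Worksheet>
  <Worksheet ss:Name="Name2">something else</Worksheet>
</Workbook>
XML;

$doc = new DOMDocument();
$doc->loadXML($string);

$xpath = new DOMXPath($doc);
$uri   = 'urn:schemas-microsoft-com:office:spreadsheet';

// Bind both prefixes used in the expression to the same namespace URI.
$xpath->registerNamespace('x', $uri);
$xpath->registerNamespace('ss', $uri);

// Third argument false: do not auto-register the document's own prefixes.
$nodes = $xpath->query("/x:Workbook/x:Worksheet[@ss:Name='Name1']", null, false);

$count = $nodes->length;
$text  = trim($nodes->item(0)->textContent);
echo $count, ' match: ', $text, "\n";
```

In the full Excel file the document itself declares x for urn:schemas-microsoft-com:office:excel, so choosing a prefix the document does not already use may avoid confusion.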

OK, I found out why hakre's xpath didn't work for me. I don't know why, but this code

$xml = <<<XML
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
</Workbook>
XML;
$el = new SimpleXmlElement($xml);
$child = $el->addChild('Worksheet');
$child->addAttribute('xmlns:ss:Name', 'Sheet1');
$result = $el->xpath("ss:Worksheet[@ss:Name='Sheet1']");

doesn't work. I had to create a new SimpleXMLElement to make it work, like this:

$xml = <<<XML
<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
</Workbook>
XML;
$el = new SimpleXmlElement($xml);
$child = $el->addChild('Worksheet');
$child->addAttribute('xmlns:ss:Name', 'Sheet1');
$el = new SimpleXMLElement($el->asXML()); // re-parse the document to refresh the SimpleXMLElement
$result = $el->xpath("ss:Worksheet[@ss:Name='Sheet1']"); // now it works like a charm
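
A possible explanation, not confirmed in this thread: addAttribute('xmlns:ss:Name', ...) may create an attribute that is not bound to the ss namespace in memory, and re-parsing the serialized XML fixes that. Under that assumption, passing the namespace URI as the third argument of addChild() and addAttribute() should let the XPath match without the refresh. A minimal sketch, with a trimmed document and with $uri and $count as names chosen for illustration:

```php
<?php
$uri = 'urn:schemas-microsoft-com:office:spreadsheet';

// Trimmed version of the empty workbook used above (an assumption for this sketch).
$xml = <<<XML
<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
</Workbook>
XML;

$el = new SimpleXMLElement($xml);

// Give the new element and attribute an explicit namespace URI.
$child = $el->addChild('Worksheet', null, $uri);
$child->addAttribute('ss:Name', 'Sheet1', $uri);

// Query the in-memory tree directly, without re-parsing.
$result = $el->xpath("ss:Worksheet[@ss:Name='Sheet1']");
$count  = count($result);
echo $count, "\n";
```

This sketch only checks the in-memory query. How the attribute is serialized by asXML() is not verified here.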

Thanks for the help.