Here is the HTML:
<article class="module_article featured">
<a title="Exclusive: Strictly's Vincent Simone welcomes baby boy" href="h/mother-and-baby/2013091914634/vincent-simone-baby-boy-born/"><h1 class="article_title">Exclusive: Strictly's Vincent Simone welcomes baby boy</h1></a> <a href="/healthandbeauty/mother-and-baby/2013091914634/vincent-simone-baby-boy-born/">
<img src="/imagenes/portadas/1-40-vincent-s.jpg">
</a>
<a href="/healthandbeauty/mother-and-baby/2013091914634/vincent-simone-baby-boy-born/">
<img src="/imagenes/portadas/1-40-vincent-s.jpg">
</a>
<p>HELLO! Online can exclusively reveal that Strictly Come Dancing professional Vincent...</p>
</article>
<article class="module_article featured">
<a title="Exclusive: Strictly's Vincent Simone welcomes baby boy" href="h/mother-and-baby/2013091914634/vincent-simone-baby-boy-born/"><h1 class="article_title">Exclusive: Strictly's Vincent Simone welcomes baby boy</h1></a> <a href="/healthandbeauty/mother-and-baby/2013091914634/vincent-simone-baby-boy-born/">
<img src="/imagenes/portadas/1-40-vincent-s.jpg">
</a>
<a href="/healthandbeauty/mother-and-baby/2013091914634/vincent-simone-baby-boy-born/">
<img src="/imagenes/portadas/1-40-vincent-s.jpg">
</a>
<p>HELLO! Online can exclusively reveal that Strictly Come Dancing professional Vincent...</p>
</article>
Here is my XPath:
$articleLinks = $finder->query('article[contains(@class,"module_article")]//@href');
As you can see, it matches two hrefs. I only need the first one.
Use this XPath expression:
(/article[contains(@class,"module_article")]//@href)[1]
Output:
h/mother-and-baby/2013091914634/vincent-simone-baby-boy-born/
Update (based on the latest edit):
/article[contains(@class,"module_article")]/a[1]/@href
Demo example:
<foo>
<a href='#1'>1</a>
<bar>
<a href='#2'>2</a>
</bar>
</foo>
<foo>
<a href='#3'>3</a>
<baz> <a href='#4'>4</a> </baz>
</foo>
XPath:
/foo/a[1]/@href
Output:
#1
#3
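For what it's worth, the demo can be reproduced in PHP roughly like this. It is only a sketch under a few assumptions: the `<root>` wrapper is hypothetical (added so the two `<foo>` elements form one well-formed document, which is why the paths start with `/root`), and `hrefs()` is just an illustrative helper name. It shows that `a[1]` counts per parent, while wrapping the whole path in parentheses before `[1]` picks a single node from the entire result.

```php
<?php
// Hypothetical <root> wrapper so the demo markup is a single well-formed document.
$xml = <<<XML
<root>
<foo>
<a href="#1">1</a>
<bar><a href="#2">2</a></bar>
</foo>
<foo>
<a href="#3">3</a>
<baz><a href="#4">4</a></baz>
</foo>
</root>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$finder = new DOMXPath($doc);

// Illustrative helper: run an XPath query and collect the attribute values.
function hrefs(DOMXPath $finder, string $expr): array
{
    $values = [];
    foreach ($finder->query($expr) as $attr) {
        $values[] = $attr->nodeValue;
    }
    return $values;
}

// a[1] is evaluated once per <foo>: the first <a> child of each one.
$perParent = hrefs($finder, '/root/foo/a[1]/@href');   // ['#1', '#3']

// (...)[1] takes the first node of the whole node set, in document order.
$global = hrefs($finder, '(/root/foo//@href)[1]');     // ['#1']

print_r($perParent);
print_r($global);
```

So the parenthesized form fits when you want exactly one href overall, and the `a[1]` form fits when you want one per container element.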
To retrieve the href of the first <a> in each article:
$finder->query('article[contains(@class,"module_article")]/a[1]/@href')
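A minimal end-to-end sketch of that query, in case it helps. Several things here are assumptions rather than part of the question: the markup is a trimmed stand-in with hypothetical hrefs, the document is loaded with `DOMDocument::loadHTML`, and the expression starts with `//` because `loadHTML` wraps a fragment in `<html><body>`, so `article` is not directly under the document node in this setup.

```php
<?php
// Trimmed stand-in for the question's markup; the hrefs are hypothetical.
$html = <<<HTML
<article class="module_article featured">
<a href="/first/title-link/"><h1 class="article_title">First</h1></a>
<a href="/first/image-link/"><img src="/a.jpg"></a>
<p>Teaser...</p>
</article>
<article class="module_article featured">
<a href="/second/title-link/"><h1 class="article_title">Second</h1></a>
<a href="/second/image-link/"><img src="/b.jpg"></a>
<p>Teaser...</p>
</article>
HTML;

// libxml's HTML parser may report HTML5 tags such as <article> as errors;
// collect them internally instead of emitting warnings.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$finder = new DOMXPath($doc);

// One href per article: the first <a> child of each matching <article>.
$firstLinks = [];
$query = '//article[contains(@class,"module_article")]/a[1]/@href';
foreach ($finder->query($query) as $attr) {
    $firstLinks[] = $attr->nodeValue;
}

print_r($firstLinks);
```

If the query in your own code runs against a context node that already contains the articles, the leading `//` may not be needed.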