Web scraping: get content from XML

Keywords: get, XML, web scraping | Updated: 2023-09-27

How can I get content from an XML page using PHP? The content looks like this:

 <entry>
   <title>News</title>
     <link rel="alternate" href="http://www.website.com/detail/2688327:BlogPost:1569917"/>
     <id>tag:www.website.com,2012-04-25:2688327:BlogPost:1569917</id>
     <updated>2012-04-25T08:30:00.000Z</updated>
     <author>
     <name>Username</name>
     <uri>http://www.website.com/profile/username</uri>
     </author>
      <summary type="html">
      Hi this is the latest news
      </summary>
</entry>
 <entry>
   <title>News2</title>
     <link rel="alternate" href="http://www.website.com/detail/2688327:BlogPost:1569917"/>
     <id>tag:www.website.com,2012-04-25:2688327:BlogPost:1569917</id>
     <updated>2012-04-25T08:30:00.000Z</updated>
     <author>
     <name>Username2</name>
     <uri>http://www.website.com/profile/username</uri>
     </author>
      <summary type="html">
      Hi this is the latest news
      </summary>
</entry>
 <entry>
   <title>News3</title>
     <link rel="alternate" href="http://www.website.com/detail/2688327:BlogPost:1569917"/>
     <id>tag:www.website.com,2012-04-25:2688327:BlogPost:1569917</id>
     <updated>2012-04-25T08:30:00.000Z</updated>
     <author>
     <name>Username3</name>
     <uri>http://www.website.com/profile/username</uri>
     </author>
      <summary type="html">
      Hi this is the latest news
      </summary>
</entry>
 <entry>
   <title>News4</title>
     <link rel="alternate" href="http://www.website.com/detail/2688327:BlogPost:1569917"/>
     <id>tag:www.website.com,2012-04-25:2688327:BlogPost:1569917</id>
     <updated>2012-04-25T08:30:00.000Z</updated>
     <author>
     <name>Username4</name>
     <uri>http://www.website.com/profile/username</uri>
     </author>
      <summary type="html">
      Hi this is the latest news
      </summary>
</entry>

Using PHP, how can I get an array containing the title, the blog link (<link rel="alternate" href="http://www.website.com/detail/2688327:BlogPost:1569917"/>), the author details such as name and uri (the profile link), and the summary?

Take a look at SimpleXML and XPath:
http://php.net/manual/en/book.simplexml.php

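    $file = 'url or file name';
    // Pass the variable itself: '$file' in single quotes is the literal string "$file".
    $xml = simplexml_load_file($file);
    if ($xml === false) {
        die('Could not load XML');
    }
    // "//entry" matches every <entry>; "/entry" only matches a root element named entry.
    $list = $xml->xpath('//entry');
    print $list[0]->id;
    #var_dump($list);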
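
To get the array the question asks for (title, link, author name and uri, summary), each matched <entry> can be read field by field and cast to a string. The following is a minimal sketch, not a definitive implementation. It assumes the entries sit under a single root element (called <feed> here, which is hypothetical because the snippet in the question shows no root) and that no default XML namespace is declared. The function name extractEntries and the array keys are illustrative.

```php
<?php
// Sample data: two entries from the question, wrapped in an assumed <feed> root.
$xmlString = <<<'XML'
<feed>
 <entry>
   <title>News</title>
   <link rel="alternate" href="http://www.website.com/detail/2688327:BlogPost:1569917"/>
   <author>
     <name>Username</name>
     <uri>http://www.website.com/profile/username</uri>
   </author>
   <summary type="html">
   Hi this is the latest news
   </summary>
 </entry>
 <entry>
   <title>News2</title>
   <link rel="alternate" href="http://www.website.com/detail/2688327:BlogPost:1569917"/>
   <author>
     <name>Username2</name>
     <uri>http://www.website.com/profile/username</uri>
   </author>
   <summary type="html">
   Hi this is the latest news
   </summary>
 </entry>
</feed>
XML;

// Hypothetical helper: returns one associative array per <entry>.
function extractEntries(string $xmlString): array
{
    $xml = simplexml_load_string($xmlString);
    if ($xml === false) {
        throw new RuntimeException('Could not parse XML');
    }

    $entries = [];
    foreach ($xml->xpath('//entry') as $entry) {
        $entries[] = [
            'title'       => (string) $entry->title,
            // Reads href from the first <link> child of the entry.
            'link'        => (string) $entry->link['href'],
            'author_name' => (string) $entry->author->name,
            'author_uri'  => (string) $entry->author->uri,
            // The summary text is surrounded by whitespace in the sample.
            'summary'     => trim((string) $entry->summary),
        ];
    }
    return $entries;
}

$entries = extractEntries($xmlString);
print_r($entries);
```

The casts to string matter because SimpleXML returns SimpleXMLElement objects rather than plain strings. If the real feed declares a default namespace, as Atom feeds usually do, a plain //entry will likely match nothing; in that case register a prefix with registerXPathNamespace() and use it in the query, for example //a:entry.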