Copy links, titles within <a> tags and date values from an external website using Zend Framework

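Is it possible, using a Zend request, to extract the links, the strings inside the tags and the date values from an external website, copy all of them into an array and return it?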

Take the website http://bills.ru/ as an example: extract the rows of the table "события на долговом рынке", and store all the data in an array with the following structure:

id

date

title

url

Or can anyone at least give some good examples of working with a Zend request?

I would suggest using something like Goutte, which lets you filter the returned HTML.

If you don't want to use an additional library, you can also use Zend\Dom to query the HTML from your request.
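As a rough sketch of the Zend\Dom approach, the snippet below runs a CSS selector over an HTML string with Zend\Dom\Query. It assumes zend-dom is installed through Composer, and the inline markup is a made-up stand-in for the fetched page, not the real structure of bills.ru.

```php
<?php
// Assumption: zendframework/zend-dom was installed with Composer.
require 'vendor/autoload.php';

use Zend\Dom\Query;

// Hypothetical markup standing in for the body of the HTTP response.
$html = '<table><tr><td class="news">01.02</td>'
      . '<td><a class="news" href="/news/1">First item</a></td></tr></table>';

$dom = new Query($html);

// execute() takes a CSS selector and returns an iterable list of DOM elements.
$links = $dom->execute('td a.news');

foreach ($links as $link) {
    echo $link->textContent, ' => ', $link->getAttribute('href'), PHP_EOL;
}
```

In a real script, $html would come from the response body of the request instead of a literal string.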

Here is the code I managed to write for this task; it works.

<?php
use Zend\Http\Client;
use Zend\Dom\Query;
/**
 * Extracts date values, titles and links from the block "события на долговом рынке",
 * stores them in an array and prints it.
 *
 * Uses Zend\Http\Client to connect to the website and Zend\Dom\Query's CSS selectors
 * to retrieve the date values, links and titles within the block "события на долговом рынке".
 * Three private methods return the value of each type, and one public method retrieves
 * and prints the combined data.
 *
 * @var client Zend\Http\Client object; connects to the target website set with setUri()
 * @var response response received from the requested website
 * @var dom Zend\Dom\Query object used to query the HTML fetched by Zend\Http\Client
 * @var results Zend\Dom\NodeList object returned by execute()
 * @var result current node in the foreach loop; used to retrieve the title and url from the a tag
 * @var results_date same as @var results, but for date values
 * @var result_date same as @var result, but for date values
 * @var dateArray array where date values are stored
 * @var valuesArray array where data is stored and then printed
 * @var html stores the body of the response obtained through @var client
 */
class BILLS 
{
public $client;
public $response;
public $dom;
public $results;
public $result;
public $results_date;
public $result_date;
public $dateArray;
public $valuesArray;
public $html;
/**
 * On construction, creates a Zend\Http\Client object and sets its URI.
 * The request is then sent and the response body is stored in $html for further use.
 * @see client, response, html
 */
function __construct ()
{
    $this->client = new \Zend\Http\Client();
    $this->client->setUri('http://bills.ru');
    $this->client->send();
    $this->response = $this->client->getResponse();
    $this->html = $this->response->getBody();
}
/**
 * Returns date values within object 
 * @see result_date
 */ 
private function _date()
{  
    return $this->result_date->textContent;
}
/**
 * Returns text content within object 
 * @see result
 */ 
private function _title()
{
    return $this->result->textContent;
}
/**
 * Returns url within object 
 * @see result
 */ 
private function _url()
{
    return $this->result->getAttribute('href');
}
/**
 * If the response status is 200, creates a new Query object and searches for a tags with class news.
 * A foreach loop then stores the data found in an array and prints it to the screen. Uses three private
 * methods to return the value of each type that is stored in the array and printed.
 *
 * @see dom, results_date, dateArray, results, valuesArray, _date(), _url(), _title()
 *
 */
public function printTask()
{
    $iteration = 0;
    $iterationData = 0;
    if ($this->response->getStatusCode() == 200)
    {
        $this->dom = new Query($this->html);
        // Collect the first five date cells.
        $this->dateArray = array();
        $this->results_date = $this->dom->execute('table tr td.news');
        foreach ($this->results_date as $this->result_date)
        {
            if ($iterationData < 5)
            {
                $this->dateArray[$iterationData] = $this->_date();
                $iterationData++;
            }
        }
        // Pair each of the first five links with the date at the same position.
        $this->results = $this->dom->execute('table tr td a.news');
        foreach ($this->results as $this->result)
        {
            if ($iteration < 5)
            {
                $this->valuesArray = array(
                    'id' => $iteration + 1,
                    'date' => isset($this->dateArray[$iteration]) ? $this->dateArray[$iteration] : null,
                    'title' => $this->_title(),
                    'url' => "http://bills.ru" . $this->_url()
                );
                echo '<pre>';
                print_r($this->valuesArray);
                echo '</pre>';
                $iteration++;
            }
        }
    }
}
}
$object = new BILLS;
$object->printTask();
?>
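The question asked for the data to be returned as an array, while the class above only prints each row. A minimal sketch of the returning variant follows. It swaps Zend\Dom\Query for PHP's built-in DOMDocument and DOMXPath so that it runs without the framework. The function name extractNews and the sample markup are hypothetical, and the td.news and a.news class names are assumed from the selectors used above.

```php
<?php
/**
 * Builds rows of id, date, title and url from an HTML string.
 * Assumption: dates sit in td.news cells and links are a.news inside a td,
 * mirroring the selectors in the listing above.
 */
function extractNews(string $html, string $baseUrl, int $limit = 5): array
{
    $doc = new DOMDocument();
    // Keep parser warnings about imperfect real-world HTML out of the output.
    libxml_use_internal_errors(true);
    // The XML prolog is a common trick to make loadHTML() treat the input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // XPath test equivalent to the CSS class selector ".news".
    $isNews = "contains(concat(' ', normalize-space(@class), ' '), ' news ')";
    $dates = $xpath->query("//td[" . $isNews . "]");
    $links = $xpath->query("//td//a[" . $isNews . "]");

    $rows = [];
    $count = min($limit, $links->length);
    for ($i = 0; $i < $count; $i++) {
        $link = $links->item($i);
        $date = $dates->item($i); // null when there are fewer dates than links
        $rows[] = [
            'id'    => $i + 1,
            'date'  => $date ? trim($date->textContent) : null,
            'title' => trim($link->textContent),
            'url'   => $baseUrl . $link->getAttribute('href'),
        ];
    }
    return $rows;
}

// Usage with made-up markup; a real call would pass the fetched response body.
$sample = '<table><tr><td class="news">01.02.2015</td>'
        . '<td><a class="news" href="/news/1">Sample title</a></td></tr></table>';
print_r(extractNews($sample, 'http://bills.ru'));
```

Returning the rows keeps fetching, parsing and output separate, so the same function can feed a template, a JSON response or a database insert.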