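I'm looking at scraping data from a table on a website and displaying it in a clean table using PHP.
An example of the site is below; you'll notice the flight data table. Any ideas on how to get PHP to loop over the data and put it into a table?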
Sample data
Yes, I'd suggest using XPath:
<h1>This is scraping flight radar:</h1>
<?php
$url = "https://www.flightradar24.com/data/flights/southwest-airlines-wn-swa";
$html = file_get_contents($url);
// Suppress libxml warnings caused by imperfect real-world HTML.
libxml_use_internal_errors(true);
$doc = new \DOMDocument();
if ($html !== false && $doc->loadHTML($html))
{
    // Build the output table in a separate document.
    $result = new \DOMDocument();
    $result->formatOutput = true;
    $table = $result->appendChild($result->createElement("table"));
    $thead = $table->appendChild($result->createElement("thead"));
    $tbody = $table->appendChild($result->createElement("tbody"));
    $xpath = new \DOMXPath($doc);
    // Copy the header cells, skipping the first column.
    $newRow = $thead->appendChild($result->createElement("tr"));
    foreach ($xpath->query("//table[@id='tbl-datatable']/thead/tr/th[position()>1]") as $header)
    {
        $newRow->appendChild($result->createElement("th", trim($header->nodeValue)));
    }
    // Copy each body row, keeping only columns 2 through 6.
    foreach ($xpath->query("//table[@id='tbl-datatable']/tbody/tr") as $row)
    {
        $newRow = $tbody->appendChild($result->createElement("tr"));
        // The leading "./" makes the query relative to the current row.
        foreach ($xpath->query("./td[position()>1 and position()<7]", $row) as $cell)
        {
            $newRow->appendChild($result->createElement("td", trim($cell->nodeValue)));
        }
    }
    echo $result->saveXML($result->documentElement);
}
?>
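If you want to try the XPath part without hitting the live site, here is a minimal sketch that runs the same kind of row and cell queries against an inline HTML string. The helper name `extractRows` and the sample markup are made up for illustration, and the table id is assumed to be a trusted value, since it is interpolated directly into the XPath expression.

```php
<?php
// Hypothetical helper: returns each body row of the table as an array of cell strings.
// Assumes $tableId is a trusted value (it is placed directly into the XPath expression).
function extractRows(string $html, string $tableId): array
{
    libxml_use_internal_errors(true); // tolerate imperfect markup
    $doc = new \DOMDocument();
    if (!$doc->loadHTML($html)) {
        return [];
    }
    $xpath = new \DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query("//table[@id='$tableId']/tbody/tr") as $row) {
        $cells = [];
        // The leading "./" evaluates the query relative to the current row;
        // position()>1 skips the first column, as in the answer above.
        foreach ($xpath->query("./td[position()>1]", $row) as $cell) {
            $cells[] = trim($cell->nodeValue);
        }
        $rows[] = $cells;
    }
    return $rows;
}

// Made-up sample markup standing in for the remote page.
$sample = <<<HTML
<table id="tbl-datatable">
  <thead><tr><th></th><th>Flight</th><th>From</th></tr></thead>
  <tbody>
    <tr><td>x</td><td>WN1</td><td>Dallas</td></tr>
    <tr><td>x</td><td>WN2</td><td>Denver</td></tr>
  </tbody>
</table>
HTML;

$rows = extractRows($sample, 'tbl-datatable');
foreach ($rows as $cells) {
    echo implode(' | ', $cells), "\n";
}
```

Getting the rows back as plain arrays first keeps the scraping separate from the rendering, so you can build the output table however you like afterwards.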
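As with any scraping job, keep in mind that you may be violating their terms of service, especially if you republish the content. That said, https://github.com/FriendsOfPHP/Goutte is well suited to scraping tasks like this.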
<?php
// Load Composer's autoloader (Goutte is installed via Composer).
require 'vendor/autoload.php';
use Goutte\Client;
$data_url = 'https://www.flightradar24.com/data/flights/southwest-airlines-wn-swa';
$client = new Client();
$crawler = $client->request('GET', $data_url);
// Print the inner HTML of every element matching the table's id.
$crawler->filter('#tbl-datatable')->each(function ($node) {
    print $node->html()."\n";
});