Data scraping from multiple links

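I have PHP code that retrieves data from elements with the class name "level0 nav-1 active parent". Is there a way to supply an array of links and loop over it, using a slightly different class name for each link, instead of repeating the same code for 10 similar links?

Something like:

First link (https://www.postme.com.my/men-1.html): use class "level0 nav-1 active parent"
Second link (https://www.postme.com.my/women.html): use class "level0 nav-2 active parent"
Third link (https://www.postme.com.my/children.html): use class "level0 nav-3 active parent"

Notice the incrementing nav-#?

Here is the PHP code: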

<?php
header('Content-Type: text/html; charset=utf-8');

$grep = new DOMDocument();
@$grep->loadHTMLFile("https://www.postme.com.my/men-1.html");
$finder = new DOMXPath($grep);
$classCat = "level0 nav-1 active parent";
$nodesCat = $finder->query("//*[contains(@class, '$classCat')]");
foreach ($nodesCat as $node) {
    $span = $node->childNodes;
    $replace = str_replace("Items 1-12 of", "", $span->item(1)->nodeValue);
    echo $replace . " : ";
}

// Check another link using the class name "level0 nav-2 active parent"
// (same code repeated)
@$grep->loadHTMLFile("https://www.postme.com.my/women.html");
$finder = new DOMXPath($grep);
$classCat = "level0 nav-2 active parent";
$nodesCat = $finder->query("//*[contains(@class, '$classCat')]");
foreach ($nodesCat as $node) {
    $span = $node->childNodes;
    $replace = $span->item(1)->nodeValue;
    echo $replace . " : ";
}

// Check another link with the class name "level0 nav-3 active parent".
// Notice the incrementing nav-#?
// I don't want to make the code long just because each link uses a slightly
// different class name to refer to the data.
?>

Thanks

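What I would do is start from the parent of those links (the <li> elements), which is <ul id="nav">, and extract the values from there. That way the per-link class name no longer matters. Example: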

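$dom = new DOMDocument();
@$dom->loadHTMLFile('https://www.postme.com.my/men-1.html');
$xpath = new DOMXPath($dom);
// Select every <li> directly under <ul id="nav">, whatever its class name.
$categories = $xpath->query('//ul[@id="nav"]/li');
foreach ($categories as $category) {
    // Read the first <a><span> inside the current <li>, if there is one.
    $span = $xpath->query('./a/span', $category)->item(0);
    if ($span !== null) {
        echo $span->nodeValue . '<br/>';
    }
}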
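
If you do want to keep the class-based lookup from the question, a loop over an array of links could look roughly like the sketch below. It is only an illustration under a few assumptions: extractLabels() is a hypothetical helper, not something from the original code; the label is assumed to sit in ./a/span as in the example above; and the inline HTML strings are made-up stand-ins for the real pages so that the snippet runs without network access. The class name is built from the position of the link, since only the nav-# part changes.

```php
<?php
// Hypothetical helper (not from the original post): parse one page's HTML and
// return the text of the first <a><span> under each element whose class
// attribute contains $class.
function extractLabels(string $html, string $class): array
{
    $dom = new DOMDocument();
    // Collect parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $labels = [];
    foreach ($xpath->query("//*[contains(@class, '$class')]") as $node) {
        $span = $xpath->query('./a/span', $node)->item(0);
        if ($span !== null) {
            $labels[] = trim($span->nodeValue);
        }
    }
    return $labels;
}

// Made-up stand-in markup (an assumption) keyed by the URLs from the question.
// For the real site, fetch the HTML of each URL instead of using these strings.
$pages = [
    'https://www.postme.com.my/men-1.html' =>
        '<ul id="nav"><li class="level0 nav-1 active parent"><a href="#"><span>Men</span></a></li></ul>',
    'https://www.postme.com.my/women.html' =>
        '<ul id="nav"><li class="level0 nav-2 active parent"><a href="#"><span>Women</span></a></li></ul>',
    'https://www.postme.com.my/children.html' =>
        '<ul id="nav"><li class="level0 nav-3 active parent"><a href="#"><span>Children</span></a></li></ul>',
];

$results = [];
$index = 0;
foreach ($pages as $url => $html) {
    $index++;
    // Only the number in the class name changes from link to link.
    $class = sprintf('level0 nav-%d active parent', $index);
    $results[$url] = extractLabels($html, $class);
    echo $url . ' : ' . implode(', ', $results[$url]) . PHP_EOL;
}
```

If the number does not follow the order of the links, an array that maps each URL to its class name explicitly would work the same way; the loop body stays unchanged apart from how $class is obtained.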