DOMXPath help needed

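I wrote some code (index.php) that searches for a specific document name (for example SZ-1000) and collects all the links inside `div class="box"`. A document name can cover one or two documents, each with its own ID (e.g. 26904, 26905). That part works like a charm: I get the IDs back.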

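But I also want to list, as links, all the attachments found on the pages those links point to. The tricky part is that there are no div elements there, only table markup (`table`, `td`) to locate the attachments.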

  • Site >
    • Document name (SZ-1000) >
      • Document ID >
        • Attachment links

I think the problem is in this part:

$xpath->query('//table[@id="content clear-block"]');

Catchable fatal error: Object of class DOMNodeList could not be converted to string in C:\AppServ\www\test\import.php on line 35
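As far as I understand it, this error is raised whenever a DOMNodeList is used where a string is expected, for example with `echo`. A minimal sketch of iterating the list instead, assuming a small inline fragment (the markup and variable names are illustrative, not taken from the real site):

```php
<?php
// Hypothetical markup standing in for the real page.
$html = '<div class="box"><a href="/node/1">one</a><a href="/node/2">two</a></div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// query() returns a DOMNodeList, which cannot be echoed directly.
$nodes = $xpath->query('//div[@class="box"]//a');

$hrefs = array();
foreach ($nodes as $node) {
    // Each item is a DOMElement; read the attribute as a string.
    $hrefs[] = $node->getAttribute('href');
}

echo implode("\n", $hrefs), "\n";
```

Only the strings pulled out of each node are echoed, never the list object itself.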

Result of var_dump($articles); in import.php:

object(DOMNodeList)#3 (0) { }
object(DOMNodeList)#3 (0) { }
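An empty DOMNodeList like this means the expression matched nothing. One possible cause, and this is only an assumption since the page markup is not shown here, is that `content clear-block` is a list of class names rather than an id (an id cannot contain a space). A sketch of matching on a class token and checking `length`, using hypothetical markup:

```php
<?php
// Hypothetical markup; the real page structure is not shown in this post.
$html = '<table class="content clear-block"><tr><td><a href="/files/a.pdf">a</a></td></tr></table>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Matching on @id finds nothing, because this table has no such id.
$byId = $xpath->query('//table[@id="content clear-block"]');

// Match a single class token regardless of the other classes present.
$byClass = $xpath->query(
    '//table[contains(concat(" ", normalize-space(@class), " "), " clear-block ")]'
);

echo $byId->length, "\n";
echo $byClass->length, "\n";
```

Checking `->length` is more telling than `var_dump()` here, because it shows directly whether the expression matched anything.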

My code:

index.php

$site = 'http://192.168.0.1:81/?q=search/node/SZ-1000';
$html = file_get_contents($site);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$articles = $xpath->query('//div[@class="box"]');
$links = array();
foreach ($articles as $container) {
    $arr = $container->getElementsByTagName("a");
    foreach ($arr as $item) {
        $href = $item->getAttribute("href");
        $links[] = $href;
    }
}
foreach ($links as $link) {
    // Take the last path segment of the link, i.e. the document ID.
    // explode() replaces split(), which was removed in PHP 7.
    $parts = explode('/', $link);
    $text = end($parts);
    echo $text."<br>";
    $wr_out = file_get_contents("http://127.0.0.1/test/import.php?value=".$text);
    echo $wr_out;
}

import.php

$value = $_GET['value'];
$site = 'http://192.168.0.1:81/?q=node/'.$value;
$html = file_get_contents($site);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$articles = $xpath->query('//table[@id="content clear-block"]');
$links = array();
foreach ($articles as $container) {
    $arr = $container->getElementsByTagName("a");
    foreach ($arr as $item) {
        $href = $item->getAttribute("href");
        $links[] = $href;
        echo $href;
    }
}

Thanks for your replies!

Edit:

"Catchable fatal error: Object of class DOMNodeList could not be converted to string in C:\AppServ\www\test\import.php on line 35"

Fixed the error with:

echo $wr_out->tagName;

OK, I finally got it working. Here is the solution, for anyone who finds this later. Use UTF-8 character encoding.
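On the UTF-8 point: as far as I know, DOMDocument::loadHTML() can misread UTF-8 text when the markup does not declare its encoding. One common workaround, shown here as a sketch with a hypothetical fragment, is to prepend a charset declaration before loading:

```php
<?php
// Hypothetical fragment containing non-ASCII characters.
$html = '<p>Árvíztűrő tükörfúrógép</p>';

// Declare the encoding up front so the parser reads the input as UTF-8.
$meta = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';

$doc = new DOMDocument();
$doc->loadHTML($meta . $html);

$text = $doc->getElementsByTagName('p')->item(0)->textContent;
echo $text, "\n";
```

If the fetched page already carries a correct charset declaration, this step should not be needed.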

index.php

    <?php

    // Get the search term from an external source (I use it with a Google spreadsheet)

    $get = $_GET['get'];
    $site = 'http://192.168.0.1:81/?q=search/node/'.$get;
    $html = file_get_contents($site);
    //libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXpath($doc);
    $articles = $xpath->query('//div[@class="box"]');
    // A DOMNodeList object is never empty(), so check its length instead.
    if ($articles->length > 0) {
        $links = array();
        foreach ($articles as $container) {
            $arr = $container->getElementsByTagName("a");
            foreach ($arr as $item) {
                $href = $item->getAttribute("href");
                $links[] = $href;
            }
        }
        $wr_out = "";
        foreach ($links as $value) {
            // explode() replaces split(), which was removed in PHP 7.
            $parts = explode('/', $value);
            $text = end($parts);
            $wr_out .= file_get_contents("http://127.0.0.1/projekt/search/import.php?value=".$text);
        }
        if (empty($wr_out)) {
            echo "There is no document with that ID";
        } else {
            echo $wr_out;
        }
    } else {
        echo "There is no document with that ID";
    }
    ?>
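A note on the commented-out `libxml_use_internal_errors(true)` line: real pages often contain markup that makes loadHTML() emit warnings. A sketch, using a hypothetical fragment with a non-standard tag, of collecting parser messages instead of printing them:

```php
<?php
// Hypothetical markup with a non-standard tag that the parser may complain about.
$html = '<div class="box"><foo>bar</foo><a href="/node/1">one</a></div>';

// Collect parser messages internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

// Read whatever was collected, then restore the previous setting.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

$links = $doc->getElementsByTagName('a');
echo count($errors), " parser message(s), ", $links->length, " link(s)\n";
```

How many messages are collected depends on the libxml version, but the links are still reachable either way.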

import.php

    <?php
    $value = $_GET['value'];
    $site = 'http://192.168.0.1:81/?q=node/'.$value;
    $html = file_get_contents($site);

    //libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);

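    $elements = $doc->getElementsByTagName('tbody');
    $table = $elements->item(0);
    // Guard against pages without a <tbody>, where item(0) returns null.
    $rows = $table ? $table->childNodes : array();
        foreach ($rows as $node) {
          // Skip text nodes between rows, which have no tagName.
          if($node instanceof DOMElement && $node->tagName == "tr"){
            $a = $node->firstChild->firstChild;
             foreach ($a->attributes as $attr) {
                if($attr->nodeName == "href"){
                    $value = $attr->nodeValue;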
                    ?>
                        <!doctype html>
                        <head>
                            <title>Search</title>
                          <meta charset="UTF-8">
                        </head>
                        <body>
                            <table align="center">
                                <tr>
                                    <td></td>
                                    <td class="style-1">
                                    <br><h3>
                                    <?=htmlspecialchars($value, ENT_QUOTES)?> | <a href='<?=htmlspecialchars($value, ENT_QUOTES)?>'>LINK</a></h3><hr>
                                    </td>
                                </tr>
                            </table>
                        </body><?php
                }
            }
         }
    }?>
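For comparison, the manual walk over `tbody`, `tr` and `firstChild` above could probably be replaced by a single XPath expression. This is only a sketch, under the assumption that each attachment link is an `a` inside the first cell of a row; the markup and URLs below are hypothetical:

```php
<?php
// Hypothetical attachment table; the real page may differ.
$html = '<table><tbody>'
      . '<tr><td><a href="http://example.com/files/a.pdf">a.pdf</a></td><td>10 KB</td></tr>'
      . '<tr><td><a href="http://example.com/files/b.pdf">b.pdf</a></td><td>20 KB</td></tr>'
      . '</tbody></table>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Select the href attribute nodes of links in the first cell of each row.
$hrefs = array();
foreach ($xpath->query('//tbody/tr/td[1]/a/@href') as $attr) {
    $hrefs[] = $attr->nodeValue;
}

foreach ($hrefs as $href) {
    // Escape before printing into HTML.
    $safe = htmlspecialchars($href, ENT_QUOTES, 'UTF-8');
    echo "<a href='{$safe}'>{$safe}</a><br>\n";
}
```

Selecting the attribute nodes directly avoids relying on `firstChild`, which can be a whitespace text node when the source HTML is indented.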