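I wrote some code that searches for a specific document name (for example SZ-1000) and collects all the links inside div class="box" (index.php). A document name can match one or two documents, each with its own ID (2690426905). It works like a charm: I get the IDs back.
But I also want to list every attachment on those linked pages as a link. The catch is that there is no div element there, only a table or dd that marks where the attachments are.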
- Site >
- Document name (SZ-1000) >
- Document ID >
- Attachment links
- Document ID >
- Document name (SZ-1000) >
I think the problem is in this part:
$xpath->query('//table[@id="content clear-block"]');
Catchable fatal error: Object of class DOMNodeList could not be converted to string in C:\AppServ\www\test\import.php on line 35
Result of var_dump($articles) in import.php:
object(DOMNodeList)#3 (0) { }
object(DOMNodeList)#3 (0) { }
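An empty DOMNodeList means the expression matched nothing. A value like "content clear-block" looks like a list of class names rather than an id, and an attribute test with = compares the whole string exactly. The sketch below assumes (this is a guess, not something confirmed by the real page) that the attachments sit inside an element carrying class="content clear-block"; the markup is a made-up stand-in. It matches on a single class token instead of an exact id.

```php
<?php
// Hypothetical markup standing in for the real node page.
$html = '<html><body><div class="content clear-block"><table><tbody>'
      . '<tr><td><a href="/files/a.pdf">a.pdf</a></td></tr>'
      . '<tr><td><a href="/files/b.pdf">b.pdf</a></td></tr>'
      . '</tbody></table></div></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// The query from the question: no table has this id, so the list is empty.
$byId = $xpath->query('//table[@id="content clear-block"]');

// Match any element whose class list contains the token "clear-block",
// then collect every link with an href below it.
$query = '//*[contains(concat(" ", normalize-space(@class), " "), " clear-block ")]//a[@href]';
$hrefs = [];
foreach ($xpath->query($query) as $a) {
    $hrefs[] = $a->getAttribute('href');
}

echo "matches by id: ", $byId->length, "\n";
echo implode("\n", $hrefs), "\n";
```

Viewing the page source and checking which attribute really holds that value is the quickest way to confirm which form of the query applies.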
My code:
index.php
$site = 'http://192.168.0.1:81/?q=search/node/SZ-1000';
$html = file_get_contents($site);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$articles = $xpath->query('//div[@class="box"]');
$links = array();
foreach($articles as $container) {
$arr = $container->getElementsByTagName("a");
foreach($arr as $item) {
$href = $item->getAttribute("href");
$links[] = $href;
}
}
foreach($links as $link){
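// Take the last path segment of the link as the document ID.
// ($value was undefined here; the loop variable is $link.)
$parts = explode('/', $link);
$text = end($parts);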
echo $text."<br>";
$wr_out = file_get_contents("http://127.0.0.1/test/import.php?value=".$text);
echo $wr_out;
}
import.php
$value = $_GET['value'];
$site = 'http://192.168.0.1:81/?q=node/'.$value;
$html = file_get_contents($site);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$articles = $xpath->query('//table[@id="content clear-block"]');
$links = array();
foreach($articles as $container) {
$arr = $container->getElementsByTagName("a");
foreach($arr as $item) {
$href = $item->getAttribute("href");
$links[] = $href;
echo $href;
}
}
Thanks for your replies!
Edit:
"Catchable fatal error: Object of class DOMNodeList could not be converted to string in C:\AppServ\www\test\import.php on line 35"
That error is fixed with:
echo $wr_out->tagName;
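The message means a DOMNodeList was used where a string is expected, for example directly in echo. A node list has to be iterated, and each node's own properties are what get printed. A minimal sketch with made-up markup:

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML('<html><body><p><a href="/node/1">one</a> <a href="/node/2">two</a></p></body></html>');

// getElementsByTagName() returns a DOMNodeList, not a string.
$anchors = $doc->getElementsByTagName('a');

// echo $anchors;  // this is what triggers "could not be converted to string"

// Iterate the list and read string properties from each element.
$lines = [];
foreach ($anchors as $a) {
    $lines[] = $a->tagName . ' ' . $a->getAttribute('href') . ' ' . $a->textContent;
}
echo implode("\n", $lines), "\n";
```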
Okay, I finally got it working. Here is the solution for anyone who finds this later. It uses UTF-8 character encoding.
index.php
<?php
// Get a variable from an external source | I use this with Google Spreadsheets
$get = $_GET['get'];
$site = 'http://192.168.0.1:81/?q=search/node/'.$get;
$html = file_get_contents($site);
//libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$articles = $xpath->query('//div[@class="box"]');
if($articles !== false && $articles->length > 0){ // empty() is never true for an object, so check the length
$links = array();
foreach($articles as $container) {
$arr = $container->getElementsByTagName("a");
foreach($arr as $item) {
$href = $item->getAttribute("href");
$links[] = $href;
}
}
$wr_out = "";
foreach($links as $value){
$parts = explode('/', $value); // split() was removed in PHP 7.0
$text = end($parts);
$wr_out.=file_get_contents("http://127.0.0.1/projekt/search/import.php?value=".$text);
}
if(empty($wr_out))
echo "There is no document with that ID";
else
echo $wr_out;
}
else
echo "There is no document with that ID";
?>
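Pulling the ID out of each link amounts to taking the last path segment. Since split() was removed in PHP 7.0 and end() takes its argument by reference, explode() into a variable is the portable form. In the sketch below, lastSegment() is a hypothetical helper name and 12345 is a placeholder ID; http_build_query() takes care of URL-encoding the value.

```php
<?php
// Hypothetical helper: returns the last path segment of a link.
function lastSegment(string $href): string
{
    // rtrim() guards against a trailing slash producing an empty segment.
    $parts = explode('/', rtrim($href, '/'));
    return end($parts);
}

// Placeholder link; real links come from the div class="box" results.
$id = lastSegment('http://192.168.0.1:81/?q=node/12345');

// Build the import.php URL with the value properly encoded.
$url = 'http://127.0.0.1/projekt/search/import.php?' . http_build_query(['value' => $id]);
echo $url, "\n";
```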
import.php
<?php
$value = $_GET['value'];
$site = 'http://192.168.0.1:81/?q=node/'.$value;
$html = file_get_contents($site);
//libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
$elements = $doc->getElementsByTagName('tbody');
$table = $elements->item(0);
$rows = $table->childNodes;
foreach ($rows as $node) {
if($node->tagName == "tr"){
$a = $node->firstChild->firstChild;
foreach ($a->attributes as $attr) {
if($attr->nodeName == "href"){
$value = $attr->nodeValue;
?>
<!doctype html>
<head>
<title>Search</title>
<meta charset="UTF-8">
</head>
<body>
<table align="center">
<tr>
<td></td>
<td class="style-1">
<br><h3>
<?=$value?> | <a href='<?=$value?>'>LINK</a></h3><hr>
</td>
</tr>
</table>
</body><?php
}
}
}
}?>
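The row walk above can also be written as one XPath expression, which avoids the firstChild chain breaking on whitespace text nodes. This is only a sketch under assumptions: collectAttachmentLinks() is a hypothetical helper, the sample table is invented, and it assumes the link sits in the first cell of each row. It gathers the links first and prints a single table, escaping each URL before output.

```php
<?php
// Hypothetical helper: returns the href of every link in the first cell of each row.
function collectAttachmentLinks(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // keep parser warnings about messy HTML quiet
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $links = [];
    foreach ($xpath->query('//tbody/tr/td[1]//a/@href') as $attr) {
        $links[] = $attr->nodeValue;
    }
    return $links;
}

// Stand-in markup; the real page would come from file_get_contents().
$sample = '<table><tbody>'
        . '<tr><td><a href="http://example.com/a.pdf">a.pdf</a></td><td>10 KB</td></tr>'
        . '<tr><td><a href="http://example.com/b.pdf?x=1&amp;y=2">b.pdf</a></td><td>12 KB</td></tr>'
        . '</tbody></table>';

$links = collectAttachmentLinks($sample);

// Render one table for all links, escaping each URL for HTML output.
$rows = '';
foreach ($links as $href) {
    $safe = htmlspecialchars($href, ENT_QUOTES, 'UTF-8');
    $rows .= "<tr><td>{$safe} | <a href=\"{$safe}\">LINK</a></td></tr>\n";
}
echo "<table>\n{$rows}</table>\n";
```

Collecting first and rendering once also keeps the doctype and head from being repeated for every link.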