I'm using Goutte (which uses Guzzle) to scrape content, and my script ends with an error even though the request runs inside a try/catch:
Error: Client error: `GET http://example.com/C42C9CA3` resulted in a `403 Forbidden` response:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"htt (truncated...)
Here is my code:
use Goutte\Client;

$HTTPconfig = [
    "curl" => [
        CURLOPT_TIMEOUT => 60,
        CURLOPT_CONNECTTIMEOUT => 60,
        CURLOPT_SSL_VERIFYPEER => false,
    ],
    ['http_errors' => false]
];
$HTTPclient = new \Goutte\Client;
$HTTPclient->setClient(new \GuzzleHttp\Client($HTTPconfig));
$HTTPclient->setHeader('user-agent', 'Mozilla/5.0 (Windows NT 6.2; rv:20.0) Gecko/20121202 Firefox/20.0');
try {
    $crawler = $HTTPclient->request('GET', $url);
    $doc = $crawler->html();
} catch (Exception $e) {
    write($e->getMessage());
    continue;
}
Try with:
} catch (\Exception $e) {
instead of:
} catch (Exception $e) {
Edit: if you are on PHP 7, you can also try catching Throwable, again with the leading backslash, like this:
} catch (\Throwable $e) {
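The reason the leading backslash matters can be seen in a minimal sketch that needs no libraries. It assumes the script lives in a namespace (App\Scraper is a made-up name), and fetchPage() is a hypothetical stand-in for the Goutte request that simply throws a global RuntimeException. Inside a namespace, an unqualified Exception in a catch clause resolves to App\Scraper\Exception; PHP does not complain that this class is missing, the clause just never matches.

```php
<?php
// Hypothetical namespace; any namespaced file behaves the same way.
namespace App\Scraper;

// Stand-in for the Goutte/Guzzle call: throws a global-namespace exception.
function fetchPage(): string
{
    throw new \RuntimeException('403 Forbidden');
}

// Unqualified "Exception" resolves to App\Scraper\Exception, which does not
// exist, so the inner catch never matches and the outer handler runs instead.
function catchUnqualified(): string
{
    try {
        try {
            return fetchPage();
        } catch (Exception $e) {
            return 'inner catch';
        }
    } catch (\Throwable $e) {
        return 'escaped inner catch';
    }
}

// Fully qualified \Exception refers to the global class and matches.
function catchQualified(): string
{
    try {
        return fetchPage();
    } catch (\Exception $e) {
        return 'caught: ' . $e->getMessage();
    }
}

echo catchUnqualified(), PHP_EOL; // escaped inner catch
echo catchQualified(), PHP_EOL;   // caught: 403 Forbidden
```

If the script is not in a namespace, both spellings refer to the same class, so this would not explain the uncaught error in that case.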
Hope this helps.
Remove the ['http_errors' => false] option. It defaults to true, which makes Guzzle throw exceptions for 4xx/5xx response codes.
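One more detail about the config in the question, shown here as a plain-PHP sketch rather than a definitive fix: the http_errors entry is wrapped in its own array, so it ends up as element 0 of the config instead of a key named http_errors, and Guzzle does not read it from there. The variable names below are illustrative and the cURL options are left empty for brevity. If the goal is to get the 403 response back without an exception, the key would presumably need to sit at the top level.

```php
<?php
// Config shaped like the one in the question (cURL options omitted here).
$nested = [
    'curl' => [],
    ['http_errors' => false], // becomes element 0, not a named option
];

// The same option as a top-level key, where Guzzle looks for it.
$topLevel = [
    'curl' => [],
    'http_errors' => false,
];

var_dump(array_key_exists('http_errors', $nested));   // bool(false)
var_dump(array_key_exists('http_errors', $topLevel)); // bool(true)
```

With the nested form, the client keeps the default of true, which would be consistent with the 403 still surfacing as a Client error exception.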