My problem is that accented characters do not show up in the output of print_r().
Here is my code:
<?php
include('./lib/simple_html_dom.php');
error_reporting(E_ALL);
if (isset($_GET['q'])){
$q = $_GET['q'];
$keyword=urlencode($q);
$url="https://www.google.com/search?q=$keyword";
$html=file_get_html($url);
$results=$html->find('li.g');
$G_tot = sizeof($results)-1;
for($g=0;$g<=$G_tot;$g++){
$results=$html->find('li.g',$g);
$array_ttl_google[]=$results->find('h3.r',0)->plaintext;
$array_desc_google[]=$results->find('span.st',0)->plaintext;
$array_href_google[]=$results->find('cite',0)->plaintext;
}
print_r($array_desc_google);
}
?>
Here is the result of print_r():
Array ( [0] => �t� m (plural �t�s)...
What do you think the solution is?
There are 3 basic things you can do:
- Set the page encoding to UTF-8 by adding this at the very top of the page, before any output:
header('Content-Type: text/html; charset=utf-8');
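- Make sure your code file is saved as UTF-8 (without BOM).
- Add a function that converts the parsed string to UTF-8 (in case other sites use a different encoding).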
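To see why the third point matters, here is a small sketch of what is probably happening. It assumes the scraped text arrived as ISO-8859-1 (an assumption on my part; the actual source encoding may differ), in which case "é" is a single byte that is not valid UTF-8 and gets rendered as "�":

```php
<?php
// "été" encoded as ISO-8859-1: each "é" is the single byte 0xE9.
// (ISO-8859-1 is an assumed source encoding for this illustration.)
$latin1 = "\xE9t\xE9";

// That byte sequence is not valid UTF-8, so a UTF-8 page shows "�t�".
var_dump(mb_check_encoding($latin1, 'UTF-8'));

// Converting from the real source encoding yields valid UTF-8.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
var_dump(mb_check_encoding($utf8, 'UTF-8'));

// In UTF-8 each "é" becomes the two bytes 0xC3 0xA9.
var_dump($utf8 === "\xC3\xA9t\xC3\xA9");
```

The first check should report the string as invalid UTF-8, and the last two should pass once the conversion is done.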
Your code should look like this (tested and working; I tried it with English and Hebrew results):
<?php
header('Content-Type: text/html; charset=utf-8');
include('simple_html_dom.php');
error_reporting(0);
if (isset($_GET['q'])){
$q = $_GET['q'];
$keyword=urlencode($q);
$url="https://www.google.com/search?q=$keyword";
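//Fetch the raw markup as a string so it can be converted before parsing:
$raw=file_get_contents($url);
//Make sure we received UTF-8:
$encoding = mb_detect_encoding($raw);
if ($encoding && strtoupper($encoding) != "UTF-8")
$raw = @iconv($encoding, "utf-8//TRANSLIT//IGNORE", $raw);
//Parse the converted string (str_get_html ships with simple_html_dom):
$html=str_get_html($raw);
//Proceed with your code: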
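$array_ttl_google = $array_desc_google = $array_href_google = array();
foreach($html->find('li.g') as $result){
//find(..., 0) returns null when nothing matches, so guard each lookup:
$title = $result->find('h3.r',0);
$desc = $result->find('span.st',0);
$cite = $result->find('cite',0);
$array_ttl_google[] = $title ? $title->plaintext : '';
$array_desc_google[] = $desc ? $desc->plaintext : '';
$array_href_google[] = $cite ? $cite->plaintext : '';
}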
print_r($array_desc_google);
} else {
echo "You forgot to set the 'q' variable in your url.";
}
?>
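If you would rather convert each scraped string than the whole page, the conversion can be pulled out into a small helper. This is only a minimal sketch: the name to_utf8 and the $fallback parameter are made up for illustration (they are not part of simple_html_dom), and ISO-8859-1 is an assumed source encoding that you may need to change:

```php
<?php
// Hypothetical helper (not part of simple_html_dom): returns $text as UTF-8.
// $fallback is an ASSUMED source encoding, used only when $text is not
// already valid UTF-8.
function to_utf8($text, $fallback = 'ISO-8859-1')
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($text, 'UTF-8', $fallback);
}

// Usage: normalise every scraped description before printing.
// The first sample holds ISO-8859-1 bytes, the second is plain ASCII.
$descriptions = array("\xE9t\xE9 m (plural \xE9t\xE9s)", "plain ascii");
print_r(array_map('to_utf8', $descriptions));
```

Checking validity first means strings that are already UTF-8 pass through unchanged, so the helper should be safe to apply to every element of $array_desc_google.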