Result from URL request returning weird characters instead of accents

My problem is that accented characters are not displayed in the output of print_r().

Here is my code:

<?php
include('./lib/simple_html_dom.php');
error_reporting(E_ALL);
if (isset($_GET['q'])){
$q = $_GET['q'];
$keyword=urlencode($q);
$url="https://www.google.com/search?q=$keyword";
$html=file_get_html($url);
$results=$html->find('li.g');
$G_tot = sizeof($results)-1;
for($g=0;$g<=$G_tot;$g++){
$results=$html->find('li.g',$g);
$array_ttl_google[]=$results->find('h3.r',0)->plaintext;
$array_desc_google[]=$results->find('span.st',0)->plaintext;
$array_href_google[]=$results->find('cite',0)->plaintext;
}
print_r($array_desc_google);
}
?>

Here is the result of print_r():

Array ( [0] => �t� m (plural �t�s)...

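What do you think the solution is?

There are 3 basic things you can do:

  1. Set the page encoding to UTF-8 by adding this at the top of the page: header('Content-Type: text/html; charset=utf-8');
  2. Make sure your code file is saved as UTF-8 (without BOM).
  3. Add a function that converts the parsed string to UTF-8 (in case other sites use a different encoding).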
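
As a rough illustration of point 3, here is a minimal sketch of such a conversion function. The name to_utf8() and the ISO-8859-1 fallback are assumptions made for this example, not something taken from the question; it also assumes the mbstring extension is enabled. If the remote site declares its charset, pass that instead of the fallback.

```php
<?php
// Hypothetical helper (name and default are assumptions for illustration):
// return $text as UTF-8, converting only when it is not already valid UTF-8.
function to_utf8($text, $fallback = 'ISO-8859-1')
{
    // Valid UTF-8 (which includes plain ASCII) is returned unchanged.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    // Otherwise treat the bytes as the fallback encoding and convert them.
    return mb_convert_encoding($text, 'UTF-8', $fallback);
}

// "\xE9t\xE9" is the word "été" encoded as ISO-8859-1.
echo to_utf8("\xE9t\xE9"), "\n";
```

Checking validity first avoids double-encoding text that is already UTF-8, which is a common cause of output such as "Ã©" in place of "é".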

Your code should look like this (tested and working fine; I tried it with English and Hebrew results):

 <?php
 header('Content-Type: text/html; charset=utf-8');
 include('simple_html_dom.php');
 error_reporting(0);
 if (isset($_GET['q'])){
     $q = $_GET['q'];
     $keyword=urlencode($q);
     $url="https://www.google.com/search?q=$keyword";
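     //Fetch the raw markup as a string so it can be converted before parsing:
     $raw=file_get_contents($url);
     //Make sure we received UTF-8 (the candidate list is an assumption, adjust it to the sites you fetch):
     $encoding = mb_detect_encoding($raw, array('UTF-8', 'ISO-8859-1'), true);
     if ($encoding && strtoupper($encoding) != "UTF-8")
        $raw = iconv($encoding, "UTF-8//TRANSLIT", $raw);
     //Parse the converted string, so $html is still a simple_html_dom object:
     $html=str_get_html($raw);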
     //Proceed with your code:
     $results=$html->find('li.g');
     $G_tot = sizeof($results)-1;
     for($g=0;$g<=$G_tot;$g++){
         $results=$html->find('li.g',$g);
         $array_ttl_google[]= $results->find('h3.r',0)->plaintext;
         $array_desc_google[]= $results->find('span.st',0)->plaintext;
         $array_href_google[] = $results->find('cite',0)->plaintext;
      }
      print_r($array_desc_google);
 } else {
    echo "You forgot to set the 'q' variable in your url.";
 } 
 ?>