PHP strings have the same encoding (UTF-8) and look identical in the browser, but are not equal

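I need to compare two strings: one is the result of a cURL call to a remote URL (served with charset UTF-8), and the other is hard-coded in my script (also UTF-8). The strings look identical, but when I compare them with strcmp(), the result is -44. I tried trimming both of them, but the result is the same.

I double-checked their encoding with mb_detect_encoding(), and both appear to be UTF-8 (as I expected).

I also ran preg_match('!! !u', $string), which seems to be an accurate way to check whether a string is valid UTF-8. The result is 1 for both, so they both are.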
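As a cross-check on the two tests above, here is a minimal sketch using mb_check_encoding(), which validates the bytes directly rather than guessing an encoding. The helper name isValidUtf8 is hypothetical, and the sample strings are made up for illustration. It also shows why passing this check says nothing about equality: a string containing an invisible character is still valid UTF-8.

```php
<?php
// Hypothetical helper: true when $s is a well-formed UTF-8 byte sequence.
// Requires the mbstring extension (already used elsewhere in this question).
function isValidUtf8(string $s): bool
{
    return mb_check_encoding($s, 'UTF-8');
}

var_dump(isValidUtf8("caf\xC3\xA9"));   // "café" encoded as UTF-8
var_dump(isValidUtf8("caf\xE9"));       // lone Latin-1 byte, not valid UTF-8
var_dump(isValidUtf8("a\xE2\x80\x8Bb")); // contains U+200B (zero-width space), still valid UTF-8
```

Two strings can both pass this check and still differ byte for byte, so validity alone cannot explain a non-zero strcmp() result.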

bin2hex of string 1, which displays as https://graph.facebook.com/v2.3/?id=http://www.topito.com/top-images-monde-chats-connards-de-felins:

68747470733 a2f2f67726170682e66616365626f6f6b2e636f6d2f76322e332f3f69643d3c6c696e‌6 b3e687474703a2f2f7777772e746f7069746f2e636f6d2f746f702d696d616765732d6d6f6e64652‌d63686174732d636f6e6e617264732d64652d66656c696e733c2f6c696e6b3e

bin2hex of string 2, which equals https://graph.facebook.com/v2.3/?id=http://www.topito.com/top-images-monde-chats-connards-de-felins:

68747470733 a2f2f67726170682e66616365626f6f6b2e636f6d2f76322e332f3f69643d68747470‌3 a2f2f7777772e746f7069746f2e636f6d2f746f702d696d616765732d6d6f6e64652d63686174732‌d636f6e6e617264732d64652d66656c696e73
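One way to read the two dumps is to locate the first byte where they diverge. The sketch below uses a hypothetical helper, firstByteDifference, on prefixes of the two values as they appear to decode from the hex above: right after `3f69643d` ("?id="), the first dump continues with `3c6c696e6b3e`, which is "<link>", while the second continues directly with "http". A browser hides that tag, which would explain why the strings look the same on screen.

```php
<?php
// Hypothetical helper: offset of the first differing byte, or -1 if the strings are identical.
function firstByteDifference(string $a, string $b): int
{
    $limit = min(strlen($a), strlen($b));
    for ($i = 0; $i < $limit; $i++) {
        if ($a[$i] !== $b[$i]) {
            return $i;
        }
    }
    // No mismatch in the shared prefix: the strings are equal, or one is a prefix of the other.
    return strlen($a) === strlen($b) ? -1 : $limit;
}

// Prefixes decoded by hand from the two hex dumps (an assumption; verify against your own data).
$string1 = 'https://graph.facebook.com/v2.3/?id=<link>http://www.topito.com/';
$string2 = 'https://graph.facebook.com/v2.3/?id=http://www.topito.com/';

$offset = firstByteDifference($string1, $string2);
printf(
    "first difference at byte %d: 0x%s vs 0x%s\n",
    $offset,
    bin2hex($string1[$offset]),
    bin2hex($string2[$offset])
);
```

The differing bytes are 0x3c ("<", 60) and 0x68 ("h", 104), and 60 - 104 = -44, which matches the strcmp() result reported in the question. Older PHP versions return the byte difference from strcmp(); newer ones return only -1, 0, or 1.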

How can I make them equal? I tried converting both to UTF-8 (from UTF-8 ^^) using mb_convert_encoding(), but they are still not equal.

Thanks

EDIT: I extract my string (which is a URL) using cURL from this feed: h**p://www.topito.com/feed

My cURL function is:

  $header[] = "Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
  $header[] = "Cache-Control: max-age=0";
  $header[] = "Connection: keep-alive";
  $header[] = "Keep-Alive: timeout=5, max=100";
  $header[] = "Accept-Charset: utf-8;q=0.7,*;q=0.7"; // Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
  $header[] = "Accept-Language: en-us,en;q=0.5";
  $header[] = ""; 
  $curl = curl_init ();
  curl_setopt($curl, CURLOPT_URL, $url);
  curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
  curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
  curl_setopt($curl, CURLOPT_USERAGENT, $useragent);
  curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
  curl_setopt($curl, CURLOPT_REFERER, "http://www.google.fr");
  curl_setopt($curl, CURLOPT_HEADER, 0);
  curl_setopt($curl, CURLINFO_HEADER_OUT, 1);
  curl_setopt($curl, CURLOPT_COOKIEFILE, getcwd().'/cookies.txt');
  curl_setopt($curl, CURLOPT_COOKIEJAR, getcwd().'/cookies.txt');
  curl_setopt($curl, CURLOPT_CUSTOMREQUEST, 'GET');
  curl_setopt($curl, CURLOPT_TIMEOUT, 30);
  curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
  $html = curl_exec($curl);
  curl_close ( $curl );
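Because the value is pulled out of a feed, markup from the feed can survive in the extracted string without being visible in a browser; the hex dump of string 1 appears to contain a `<link>...</link>` wrapper. A minimal sketch of normalizing such a value before comparing it, assuming the wrapper is the only difference (normalizeFeedValue is a hypothetical name, and the raw value is reconstructed from the hex dump):

```php
<?php
// Hypothetical helper: drop any markup and surrounding whitespace from an extracted feed value.
function normalizeFeedValue(string $raw): string
{
    return trim(strip_tags($raw));
}

// Value as it appears to decode from the first hex dump (without the Graph API prefix).
$raw      = '<link>http://www.topito.com/top-images-monde-chats-connards-de-felins</link>';
$expected = 'http://www.topito.com/top-images-monde-chats-connards-de-felins';

var_dump($raw === $expected);                     // raw value still carries the tags
var_dump(normalizeFeedValue($raw) === $expected); // compared after stripping them
```

Parsing the feed with an XML parser and reading the element's text would avoid the problem at the source; strip_tags() is only the smallest change to the comparison step.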

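If you look at the HTML source of what you copy-pasted here, the two are not the same. The second string contains an extra entity, &#8203; (a zero-width space), which is invisible when rendered; check the second one.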

http://pasteboard.co/wBd2Ea4.png
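If invisible characters such as the zero-width space (U+200B) do end up in a string, one possible way to remove them before comparing is a Unicode-aware preg_replace(). This is a sketch, not a definitive fix: the helper name stripZeroWidth, the set of characters it removes, and the example URL are all assumptions.

```php
<?php
// Hypothetical helper: remove common zero-width characters from a UTF-8 string.
function stripZeroWidth(string $s): string
{
    // U+200B zero-width space, U+200C zero-width non-joiner,
    // U+200D zero-width joiner, U+FEFF byte-order mark / zero-width no-break space
    $clean = preg_replace('/[\x{200B}-\x{200D}\x{FEFF}]/u', '', $s);

    // preg_replace() returns null on failure (for example, invalid UTF-8 input); keep the original then.
    return $clean ?? $s;
}

$fetched  = "http://example.com/\u{200B}page"; // looks identical on screen
$expected = "http://example.com/page";

var_dump($fetched === $expected);                 // compared as-is
var_dump(stripZeroWidth($fetched) === $expected); // compared after stripping
```

Comparing bin2hex() output for both strings before and after the cleanup is a quick way to confirm which bytes were removed.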