Character encoding puzzle with PHP/MS Access

Note: this is MS Access 2000, and this PHP file is invoked through an AJAX call…

At the top of this PHP file, I put

ini_set('default_charset', 'utf-8');

The $token used below comes from the following lines:

$search_string = $_GET[ 'search_string' ];
$search_tokens = explode( " ", $search_string );
$token = $search_tokens[ 0 ];

When the "token" contains no French accented characters, this works fine:

$sql="SELECT * FROM tblFrEng WHERE French = '$token'";
echo "=== SQL is $sql<br>";
$sth = $dbh->prepare( $sql );
$sth->execute();

But when the token is a French word with accented characters such as é, the SQL still looks fine (like this):

=== SQL is SELECT * FROM tblFrEng WHERE French = 'référé'

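Unfortunately the query returns 0 rows… even though there is a record it should return… so it seems to me that character encoding is probably the problem.

NB I also tried utf8_encode, but, as has been pointed out, that makes no sense here and only garbles the SQL string…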
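A quick way to see why the comparison can fail is to look at the bytes. This is only a sketch under two assumptions: that the AJAX request delivers the token as UTF-8, and that the Access 2000 database stores its text as Windows-1252. The variable names are illustrative.

```php
<?php
// 'référé' spelled out as UTF-8 bytes, so the result does not depend on
// how this script file itself is saved.
$utf8 = "r\xC3\xA9f\xC3\xA9r\xC3\xA9";

// The same word in Windows-1252 (assumed here to be what Access 2000 stores).
$win = mb_convert_encoding($utf8, 'Windows-1252', 'UTF-8');

echo bin2hex($utf8), "\n"; // 72c3a966c3a972c3a9
echo bin2hex($win), "\n";  // 72e966e972e9
var_dump($utf8 === $win);  // bool(false)
```

The echoed SQL looks right in a UTF-8 browser page, but each é is two bytes in UTF-8 and one byte in Windows-1252, so a byte-for-byte match against the stored value would not succeed.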

Here is the PHP code that I got working. I had to use mb_convert_encoding(), which is part of PHP's "Multibyte String" ("mbstring") extension.

Code:

<?php
// NB: save this PHP script as an ANSI text file, not a UTF-8 encoded file
$dbh = new PDO(
        'odbc:Driver={Microsoft Access Driver (*.mdb)};' .
        'Dbq=C:\\Users\\Public\\acc2000.mdb;' .
        'Uid=Admin;Pwd=;');
// this is our test case
$city = 'Montréal';
echo '$city: ' . $city . "\r\n";
// this is the UTF-8 "token" we'd get from the AJAX call...
$token = utf8_encode($city);
echo '$token: ' . $token . "\r\n";
// ...and here we convert to the Access_2000 character set
$win_token = mb_convert_encoding($token, 'Windows-1252', 'UTF-8');
echo '$win_token: ' . $win_token . "\r\n";
$sth = $dbh->prepare('SELECT * FROM Cities WHERE City = ?');
$sth->execute(array($win_token));
$rst = $sth->fetchAll();
echo '$rst: ';
print_r($rst);

Result when run from the Windows command line:

C:\__tmp>\php\php odbcTest.php
$city: MontrΘal
$token: Montréal
$win_token: MontrΘal
$rst: Array
(
    [0] => Array
        (
            [City] => MontrΘal
            [0] => MontrΘal
        )
)

Note that the command-line output is itself somewhat garbled, but at least the SQL query returned a result....
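
Applied to the query from the question, the same idea might look roughly like the sketch below. The helper names toAccess(), fromAccess() and findFrench() are hypothetical, not from the original posts. The sketch assumes the token arrives as UTF-8 and the database holds Windows-1252, and it swaps the interpolated SQL string for a bound parameter as in the test script above.

```php
<?php
// Hypothetical helpers: convert between the UTF-8 used by the AJAX/browser
// side and the Windows-1252 assumed for the Access 2000 database.
function toAccess(string $utf8): string
{
    return mb_convert_encoding($utf8, 'Windows-1252', 'UTF-8');
}

function fromAccess(string $win1252): string
{
    return mb_convert_encoding($win1252, 'UTF-8', 'Windows-1252');
}

// Look up one token; $dbh is a PDO ODBC connection opened as in the script above.
function findFrench(PDO $dbh, string $utf8Token): array
{
    $sth = $dbh->prepare('SELECT * FROM tblFrEng WHERE French = ?');
    $sth->execute([toAccess($utf8Token)]);
    $rows = $sth->fetchAll(PDO::FETCH_ASSOC);

    // Convert string columns back to UTF-8 before echoing them to the browser.
    foreach ($rows as &$row) {
        foreach ($row as $col => $val) {
            if (is_string($val)) {
                $row[$col] = fromAccess($val);
            }
        }
    }
    unset($row);

    return $rows;
}

// Round-trip check that needs no database: 'référé' as UTF-8 bytes.
$token = "r\xC3\xA9f\xC3\xA9r\xC3\xA9";
var_dump(fromAccess(toAccess($token)) === $token); // bool(true)
```

In the AJAX handler this would be called as findFrench($dbh, $token), with $token taken from $_GET as in the question. Converting the rows back matters because the page is served with default_charset set to utf-8.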