How to improve and speed up this PHP script that uses a nested loop?

I wrote a PHP script that joins two tables to display information in the following format:

Jane Doe
Phone: 082 980 9514
Home Loan Applications (Active) - 17/07/2013
Credit Report (Free Report) (Unsubscribed) 12/06/2013

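As you can see, the first and last name appear on the first line of the output, followed by the phone number and then the mailing lists the person is subscribed to. Next to each list is its status: "Active" means the user has not unsubscribed, and if the user has opted out of the mailing list, the unsubscribe date is shown instead. After that comes the date on which the subscriber signed up for that mailing list.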

Below is a link to the sample tables:

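It works as expected, but it takes forever to complete, because there are about 76,000 records in one table and about 100,000 in the other. I would like suggestions on how to optimize the code to speed up the script.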

Here is the code I have written so far:

$resultarray = array();
$rs4 = mysqli_query($con1,"SELECT interspire_customfield.subscriberid, interspire_customfield.fname, interspire_customfield.lname, interspire_customfield.phone, emailaddress, subscribedate, unsubscribed, interspire_customfield.listid, listname FROM `interspire_subscriber` INNER JOIN `interspire_customfield` ON interspire_subscriber.subscriberid = interspire_customfield.subscriberid GROUP BY emailaddress");
while($row4 = mysqli_fetch_array($rs4)) {
        $resultarray[] = $row4['subscriberid'].",".$row4['fname'].",".$row4['lname'].",".$row4['phone'].",".$row4['emailaddress'];
}
foreach ($resultarray as $arrlist) {
    $arr = explode(',', $arrlist);
    $sid = $arr[0];
    $frstname = $arr[1];
    $lstname = $arr[2];
    $pnum = $arr[3];
    $emailadd = $arr[4];
    echo $frstname." ".$lstname."<br />";
    echo "Phone: ".$pnum."<br />";
    $rs5 = mysqli_query($con1,"SELECT interspire_customfield.subscriberid, subscribedate, unsubscribed, interspire_customfield.listid, listname FROM interspire_subscriber INNER JOIN interspire_customfield ON interspire_subscriber.subscriberid = interspire_customfield.subscriberid WHERE interspire_subscriber.emailaddress = '$emailadd' GROUP BY interspire_subscriber.listid");
    if (!$rs5) {
        printf("Error: %s\n", mysqli_error($con1));
        exit();
    }
    while($row5 = mysqli_fetch_array($rs5)) {
        $listname = $row5['listname'];
        $subdate = $row5['subscribedate'];
        $unsub = $row5['unsubscribed'];
        if($unsub == "0"){
            $stat = "Active";
        }else{
            $stat = date('d/m/Y', $unsub);
        }
        $subdt = date('d/m/Y', $subdate);
        echo "* $listname ($stat) - $subdt <br />";
    }
        echo "<br />";
}

If both tables have indexes (if not, create an index on subscriberid in both tables), try using a query without an explicit JOIN:

SELECT 
ic.subscriberid, 
ic.fname, 
ic.lname, 
ic.phone, 
ic.listid, 
emailaddress, 
subscribedate, 
unsubscribed, 
listname 
FROM `interspire_subscriber` isub, `interspire_customfield` ic
WHERE isub.subscriberid = ic.subscriberid
GROUP BY emailaddress

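Also, use a buffer to store the strings before printing. You can collect the string values in an array and combine them with join(), or you can simply use ob_start.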
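As a rough illustration of the buffering advice, the sketch below collects the pieces of one record in an array, joins them once with implode() (join() is an alias of it), and captures all echoed output with ob_start() so the page is sent in a single write. The helper name renderSubscriber and the sample row are made up for the example and are not part of the original script.

```php
<?php
// Hypothetical helper: builds the HTML for one subscriber.
// The pieces are collected in an array and joined once instead of being echoed one by one.
function renderSubscriber(array $subscriber): string
{
    $lines = [];
    $lines[] = htmlspecialchars($subscriber['fname'] . ' ' . $subscriber['lname']);
    $lines[] = 'Phone: ' . htmlspecialchars($subscriber['phone']);
    return implode('<br />', $lines) . '<br />';
}

// Sample data, made up for illustration only.
$subscribers = [
    ['fname' => 'Jane', 'lname' => 'Doe', 'phone' => '082 980 9514'],
];

ob_start();                 // start capturing everything that is echoed
foreach ($subscribers as $subscriber) {
    echo renderSubscriber($subscriber);
}
$html = ob_get_clean();     // fetch the captured output and stop buffering

echo $html;                 // send the whole result in one write
```

Buffering only reduces the cost of many small writes; it does not remove the per-subscriber queries, which are likely the larger cost in the original script.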

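Your expected output requires a single name and phone number to be determined for each email address. However, because interspire_subscriber associates multiple subscriberid values with a single emailaddress, the join to interspire_customfield (which holds the name and phone data) is not 1:1, so one email address may be associated with several different names and phone numbers.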

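Either the output must be grouped by subscriberid (which may not give the desired result of collating each user's list subscriptions), or the schema must be changed. Any such change must involve a one-time determination of which values in interspire_customfield should be associated with each emailaddress. I would then suggest keying interspire_customfield on emailaddress (and removing listid from that table, and subscriberid from one of the two tables, depending on whether it should be associated with the subscriber or with the subscription).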

With those changes in place, you could do the following:

$dbh = new PDO("mysql:dbname=$dbname;charset=utf8", $username, $password);
$qry = $dbh->query('
  SELECT   emailaddress,
           c.fname, c.lname, c.phone,
           s.subscribedate, s.unsubscribed, s.listname
  FROM     interspire_subscriber  AS s
      JOIN interspire_customfield AS c USING (emailaddress)
  ORDER BY emailaddress, s.listid
');
if ($qry) {
  echo '<ol>';
  $row = $qry->fetch();
  while ($row) {
    $current_email = $row['emailaddress'];
    echo '<li>',
           htmlentities($row['fname']),' ',htmlentities($row['lname']),'<br/>',
           'Phone: ',htmlentities($row['phone']),
           '<ul>';
    do {
      $unsub = $row['unsubscribed'];
      echo '<li>',htmlentities($row['listname']),
           ' (',$unsub ? date('d/m/Y', $unsub) : 'Active',')',
           ' - ',date('d/m/Y', $row['subscribedate']),
           '</li>';
    } while ($row = $qry->fetch() and $row['emailaddress'] == $current_email);
    echo   '</ul>',
         '</li>';
  }
  echo '</ol>';
}
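To try the single-pass idea without a database, here is a minimal sketch that groups already-fetched rows by email address in an associative array. This is a slightly different technique from the do/while loop above (it does not depend on the rows being sorted), and the function name groupByEmail, the sample rows, and the fixed UTC timezone are assumptions made for the example.

```php
<?php
// Assumption: a fixed timezone so that formatted dates are predictable.
date_default_timezone_set('UTC');

// Hypothetical helper: collates rows from a single joined query by email address.
function groupByEmail(array $rows): array
{
    $groups = [];
    foreach ($rows as $row) {
        $email = $row['emailaddress'];
        if (!isset($groups[$email])) {
            // First row for this address: record the name and phone once.
            $groups[$email] = [
                'name'  => $row['fname'] . ' ' . $row['lname'],
                'phone' => $row['phone'],
                'lists' => [],
            ];
        }
        // unsubscribed is 0 while the subscription is active, else a Unix timestamp.
        $unsub  = (int) $row['unsubscribed'];
        $status = $unsub ? date('d/m/Y', $unsub) : 'Active';
        $groups[$email]['lists'][] = sprintf(
            '%s (%s) - %s',
            $row['listname'],
            $status,
            date('d/m/Y', (int) $row['subscribedate'])
        );
    }
    return $groups;
}

// Sample rows, made up for illustration; real rows would come from the query.
$rows = [
    ['emailaddress' => 'jane@example.com', 'fname' => 'Jane', 'lname' => 'Doe',
     'phone' => '082 980 9514', 'listname' => 'Home Loan Applications',
     'subscribedate' => 1374062400, 'unsubscribed' => 0],
    ['emailaddress' => 'jane@example.com', 'fname' => 'Jane', 'lname' => 'Doe',
     'phone' => '082 980 9514', 'listname' => 'Credit Report (Free Report)',
     'subscribedate' => 1371038400, 'unsubscribed' => 1374062400],
];

$groups = groupByEmail($rows);
foreach ($groups as $group) {
    echo htmlspecialchars($group['name']), "<br />\n";
    echo 'Phone: ', htmlspecialchars($group['phone']), "<br />\n";
    foreach ($group['lists'] as $line) {
        echo '* ', htmlspecialchars($line), "<br />\n";
    }
}
```

Holding every group in memory is convenient for roughly 100,000 rows, but the streaming loop above avoids that cost if memory becomes a concern.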

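As for speed, you should make sure that a composite index is defined on (emailaddress, listid) in the interspire_subscriber table, and that a UNIQUE index is defined on (emailaddress) in the interspire_customfield table.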
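The two indexes could be created along the following lines. This is only a sketch: the index names idx_email_list and uniq_email and both helper functions are hypothetical, and the statements assume the revised schema in which interspire_customfield has an emailaddress column with no duplicate values.

```php
<?php
// Hypothetical helper: returns the DDL for the two suggested indexes.
// The index names are made up; table and column names follow the revised schema.
function indexStatements(): array
{
    return [
        // Composite index supporting ORDER BY emailaddress, listid.
        'ALTER TABLE interspire_subscriber ADD INDEX idx_email_list (emailaddress, listid)',
        // UNIQUE index: fails if emailaddress still contains duplicates.
        'ALTER TABLE interspire_customfield ADD UNIQUE INDEX uniq_email (emailaddress)',
    ];
}

// Hypothetical helper: runs each statement on an existing PDO connection.
function createIndexes(PDO $dbh): void
{
    foreach (indexStatements() as $sql) {
        $dbh->exec($sql);
    }
}
```

With a connection such as the $dbh from the earlier snippet, calling createIndexes($dbh) once would be enough; the statements do not need to run on every request.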

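While we're at it, you may want to consider whether interspire_subscriber.listname is needed at all (it appears to be denormalized, since it should be associated with listid in some other table). You may also want to consider storing temporal values in MySQL's native temporal data types rather than as integers.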