MYSQL/PHP - Return average of rows while value doesn't change?

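The title of this post doesn't make much sense, so I'll do my best to explain.

I have a very large dataset (1000 rows) in a table. The data in this table relates to GPS tracking of vehicles. When a vehicle is stationary (speed = 0), the latitude and longitude can still change quite a lot over a 12-hour period.

My current SELECT query looks like this: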

$query = "SELECT UUID, UNITID, Truncate(LONGITUDE,6) AS LONGITUDE, Truncate(LATITUDE,6) AS LATITUDE, SPEED, TRACKINGTIME FROM trackpoint_au WHERE SPEED > -1 Order By UnitID, TRACKINGTIME";

The query result is returned as an XML page via PHP, like so:

header("Content-type: text/xml");
// Start XML file, echo parent node
echo '<markers>';
// Iterate through the rows, printing XML nodes for each
while ($row = @mysql_fetch_assoc($result)){ 
// ADD TO XML DOCUMENT NODE
echo '<marker ';
echo 'unitid="' . $row['UNITID'] . '" ';
echo 'lat="' . $row['LATITUDE'] . '" ';
echo 'lng="' . $row['LONGITUDE'] . '" ';
echo 'spd="' . $row['SPEED'] . '" ';
echo 'time="' . $row['TRACKINGTIME'] . '" ';
echo '/>';
}
// End XML file
echo '</markers>';

The output looks like this:

<marker unitid="7711010426" lat="-32.080402" lng="115.854890" spd="0" time="2011-11-30 06:15:00" />
<marker unitid="7711010426" lat="-32.080376" lng="115.854880" spd="0" time="2011-11-30 06:16:00" />
<marker unitid="7711010426" lat="-32.080364" lng="115.854880" spd="0" time="2011-11-30 06:17:00" />
<marker unitid="7711010426" lat="-32.080330" lng="115.854836" spd="0" time="2011-11-30 06:18:00" />
<marker unitid="7711010426" lat="-32.080326" lng="115.854860" spd="1.85" time="2011-11-30 06:20:00" />
<marker unitid="7711010426" lat="-32.080265" lng="115.854890" spd="0" time="2011-11-30 06:21:00" /> 
<marker unitid="7711010426" lat="-32.080276" lng="115.854920" spd="0" time="2011-11-30 06:22:00" /> 
<marker unitid="7711010426" lat="-32.080315" lng="115.854900" spd="0" time="2011-11-30 06:23:00" /> 
<marker unitid="7711010426" lat="-32.080296" lng="115.854866" spd="0" time="2011-11-30 06:24:00" />

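My question is this: how can I use PHP or MySQL to return the average latitude/longitude of consecutive rows with spd=0?

My resulting data should look like this: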

<marker unitid="7711010426" lat="-32.080367" lng="115.8548715" spd="0" time="2011-11-30 06:18:00" />
<marker unitid="7711010426" lat="-32.080326" lng="115.854860" spd="1.85" time="2011-11-30 06:20:00" />
<marker unitid="7711010426" lat="-32.080288" lng="115.854894" spd="0" time="2011-11-30 06:24:00" />

Note that each 'average' row carries the LAST timestamp of the rows that were averaged.

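I tried using GROUP BY SPEED. However, this doesn't do what I need, because it groups together all records that have the same speed, not just the runs of rows with a value of 0.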
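
To make the problem concrete, here is a minimal sketch. The speeds are the ones from the output above; bucketing them by value in a PHP array is only a stand-in for what GROUP BY SPEED does, not the actual query.

```php
<?php
// Speeds from the sample output above, in TRACKINGTIME order.
$speeds = array(0, 0, 0, 0, 1.85, 0, 0, 0, 0);

// Emulate GROUP BY SPEED: one bucket per distinct speed value.
$buckets = array();
foreach ($speeds as $speed) {
    $key = (string) $speed;
    if (!isset($buckets[$key])) {
        $buckets[$key] = 0;
    }
    $buckets[$key]++;
}

// Both stationary runs end up in the same bucket, so they can no longer
// be averaged separately.
print_r($buckets);
```

The two stationary runs (before and after the 1.85 row) collapse into a single bucket of 8 rows, whereas I need them as two separate averaged rows.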

EDIT

Grouping by UUID as macek suggested doesn't help, because the UUID is unique for every row.

<marker time="2011-11-30 06:15:00" spd="0" lng="115.854890" lat="-32.080402" unitid="7711010426" uuid="c6d50454-aa5b-4069-8756-72c787923173"/>
<marker time="2011-11-30 06:16:00" spd="0" lng="115.854880" lat="-32.080376" unitid="7711010426" uuid="be6f9052-ab00-430a-8cec-6abf5051cad1"/>

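After posting the question and reading some of the answers below, I managed to put together this PHP code. It loops through all rows and checks the speed; if the speed is 0, it checks the following rows (until speed <> 0) and calculates the average lat/lng of those points.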

// $num is assumed to hold the row count, e.g. mysql_num_rows($result)
for ($i = 0; $i < $num; $i++) {
    mysql_data_seek($result, $i);
    $row = mysql_fetch_assoc($result);
    if ($row['SPEED'] == 0) {
        //echo $i . ' spd: '.$row['SPEED'] . '<br />';
        $spd0 = true;
        $counter = 1;
        $lat = $row['LATITUDE'];
        $lng = $row['LONGITUDE'];
        $time = $row['TRACKINGTIME'];
        $i++;
        while (($spd0 == true) && ($i < $num)) {
            //echo ' + ' . $i;
            mysql_data_seek($result, $i);
            $row2 = mysql_fetch_assoc($result);
            if (($row2['UNITID'] == $row['UNITID']) && ($row2['SPEED'] == 0)) {
                $counter++;
                $lat = $lat + $row2['LATITUDE'];
                $lng = $lng + $row2['LONGITUDE'];
                // the averaged row carries the LAST timestamp of the run
                $time = $row2['TRACKINGTIME'];
                //echo $i . ' spd: '.$row2['SPEED'] . '<br />';
                $i++;
            } else {
                $spd0 = false;
                // step back so the for loop's $i++ lands on this row again
                $i--;
            }
        }
        $lat = $lat / $counter;
        $lng = $lng / $counter;
        // ADD TO XML DOCUMENT NODE
        echo '<marker ';
        echo 'uuid="' . $row['UUID'] . '" ';
        echo 'unitid="' . $row['UNITID'] . '" ';
        echo 'lat="' . $lat . '" ';
        echo 'lng="' . $lng . '" ';
        echo 'spd="' . $row['SPEED'] . '" ';
        echo 'time="' . $time . '" ';
        echo '/>';
    } else {
        //echo $i;
        // ADD TO XML DOCUMENT NODE
        echo '<marker ';
        echo 'uuid="' . $row['UUID'] . '" ';
        echo 'unitid="' . $row['UNITID'] . '" ';
        echo 'lat="' . $row['LATITUDE'] . '" ';
        echo 'lng="' . $row['LONGITUDE'] . '" ';
        echo 'spd="' . $row['SPEED'] . '" ';
        echo 'time="' . $row['TRACKINGTIME'] . '" ';
        echo '/>';
    }
}

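If anyone has a more elegant way of checking the following rows, please post it, as I'm always looking for ways to improve my code.

Thanks!

Apart from the normal use of GROUP BY and AVG(), you might be interested in Quassnoi's answer to my question here:

GROUP BY for continuous rows in SQL

He posted a very nice solution that also performs well with many rows.

Think of the speed as a state: you want to aggregate all consecutive rows that share the same speed over a period of time.

Here is my attempt at rewriting your query using this method: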

SELECT 
        UNITID,
        /* we aggregate multiple rows, maybe you want to know which ones..
           this one is optional */
        CAST(GROUP_CONCAT(UUID SEPARATOR ', ') AS CHAR) AS UUIDS, 
        /* is group field in the inner subquery, we can just use it 
           in our select without an aggregate function */
        SPEED, 
        /* the lowest timestamp is the time when your unit stopped moving,
           i.e. the first row with speed=0 .. use MAX(TRACKINGTIME) instead
           if you want the last timestamp of the run, as in the question */
        MIN(TRACKINGTIME) AS TRACKINGTIME, 
        /* we calc the average on latitude here */
        TRUNCATE(AVG(LATITUDE),6) AS LATITUDE, 
        /* same for longitude */
        TRUNCATE(AVG(LONGITUDE),6) AS LONGITUDE, 
        /* maybe you want to know how many rows with speed 0 
           are grouped together */
        COUNT(UUID) AS AGGREGATE_COUNT 
FROM    (
        SELECT
                /* this increases the counter variable @r each time
                   the state has changed.. when the previous row belongs
                   to the same unit, its speed was "0" and the speed is
                   "0" in the current row, the counter is not increased.
                   -- gn is a virtual field we will use for GROUPing.
                   @prev_speed and @prev_unit remember the previous row
                   for the comparison in @r
                */
                @r := @r + (@prev_unit != UNITID 
                              OR @prev_speed != 0 
                              OR SPEED != 0) AS gn,  
                @prev_speed := SPEED AS a_speed,
                @prev_unit := UNITID AS a_unit,
                tp.*
        FROM    (
                SELECT  @r := 0,
                        @prev_speed := 1,
                        @prev_unit := ''
                ) vars,
                trackpoint_au tp
        ORDER BY
                UNITID, TRACKINGTIME
        ) q
GROUP BY
        gn
ORDER BY
        UNITID

Test data:

CREATE TABLE `trackpoint_au` (
 `uuid` int(11) NOT NULL AUTO_INCREMENT,
 `latitude` decimal(10,0) NOT NULL,
 `longitude` decimal(10,0) NOT NULL,
 `speed` int(11) NOT NULL,
 `unitid` int(11) NOT NULL,
 `trackingtime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
 PRIMARY KEY (`uuid`)
) ENGINE=MyISAM;
INSERT INTO trackpoint_au (unitid, speed, latitude, longitude, trackingtime) VALUES
(1, 0, 10, 10, NOW()),
(1, 0, 20, 20, NOW()),
(1, 1, 10, 10, NOW()),
(1, 0, 10, 10, NOW()),
(1, 0, 30, 30, NOW()),
(2, 0, 10, 10, NOW()),
(2, 0, 20, 20, NOW()),
(3, 1, 10, 10, NOW()),
(4, 0, 10, 10, NOW()),
(4, 0, 20, 20, NOW()),
(4, 1, 30, 30, NOW()),
(4, 0, 60, 60, NOW()),
(4, 0, 60, 60, NOW());

Result:

+--------+--------+-------+---------------------+-----------+-----------+-----------------+
| UNITID | UUIDS  | SPEED | TRACKINGTIME        | LATITUDE  | LONGITUDE | AGGREGATE_COUNT |
+--------+--------+-------+---------------------+-----------+-----------+-----------------+
|      1 | 2, 1   |     0 | 2011-12-05 09:34:13 | 15.000000 | 15.000000 |               2 |
|      1 | 3      |     1 | 2011-12-05 09:34:13 | 10.000000 | 10.000000 |               1 |
|      1 | 4, 5   |     0 | 2011-12-05 09:34:13 | 20.000000 | 20.000000 |               2 |
|      2 | 6, 7   |     0 | 2011-12-05 09:34:13 | 15.000000 | 15.000000 |               2 |
|      3 | 8      |     1 | 2011-12-05 09:34:13 | 10.000000 | 10.000000 |               1 |
|      4 | 9, 10  |     0 | 2011-12-05 09:34:13 | 15.000000 | 15.000000 |               2 |
|      4 | 11     |     1 | 2011-12-05 09:34:13 | 30.000000 | 30.000000 |               1 |
|      4 | 12, 13 |     0 | 2011-12-05 09:34:13 | 60.000000 | 60.000000 |               2 |
+--------+--------+-------+---------------------+-----------+-----------+-----------------+
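
The group counter from the query above can also be mirrored in plain PHP, which may make the logic easier to follow. This is only a sketch: the rows are the (unitid, speed, latitude) values from the test data, and the variable names $gn, $prevUnit and $prevSpeed are made up here to play the roles of @r, @prev_unit and @prev_speed.

```php
<?php
// (unitid, speed, latitude) taken from the test data, in insertion order.
$rows = array(
    array(1, 0, 10), array(1, 0, 20), array(1, 1, 10), array(1, 0, 10), array(1, 0, 30),
    array(2, 0, 10), array(2, 0, 20),
    array(3, 1, 10),
    array(4, 0, 10), array(4, 0, 20), array(4, 1, 30), array(4, 0, 60), array(4, 0, 60),
);

$gn = 0;            // plays the role of @r
$prevUnit = null;   // plays the role of @prev_unit
$prevSpeed = 1;     // plays the role of @prev_speed
$groups = array();  // group number => list of latitudes

foreach ($rows as $row) {
    list($unit, $speed, $lat) = $row;
    // Start a new group unless this row continues a stationary run
    // of the same unit.
    if ($prevUnit !== $unit || $prevSpeed != 0 || $speed != 0) {
        $gn++;
    }
    $groups[$gn][] = $lat;
    $prevUnit = $unit;
    $prevSpeed = $speed;
}

foreach ($groups as $n => $lats) {
    printf("group %d: count=%d avg_lat=%.6f\n", $n, count($lats), array_sum($lats) / count($lats));
}
```

The eight groups it produces line up with the eight rows of the result table: same counts and same average latitudes.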
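
// Running totals for the current run of stopped rows (the "list")
$n = 0; $latSum = 0; $lngSum = 0; $last = null;

function echo_marker($row, $lat, $lng) {
    echo '<marker unitid="' . $row['UNITID'] . '" lat="' . $lat . '" lng="' . $lng . '" ';
    echo 'spd="' . $row['SPEED'] . '" time="' . $row['TRACKINGTIME'] . '" />';
}

do {
    $row = @mysql_fetch_assoc($result);
    // A run ends at a moving row, at a different unit, or at the end of the result set
    if ($n > 0 && (!$row || $row['SPEED'] != 0 || $row['UNITID'] != $last['UNITID'])) {
        // list average, stamped with the last row of the run
        echo_marker($last, $latSum / $n, $lngSum / $n);
        $n = 0; $latSum = 0; $lngSum = 0; // clear list
    }
    if ($row && $row['SPEED'] != 0) {
        echo_marker($row, $row['LATITUDE'], $row['LONGITUDE']);
    } elseif ($row) {
        // put data to a list
        $n++;
        $latSum += $row['LATITUDE'];
        $lngSum += $row['LONGITUDE'];
        $last = $row;
    }
} while ($row);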

This should get you on the right track:

-- get average lat/lng for each unitid where speed is 0
select uuid, unitid, avg(latitude), avg(longitude)
from trackpoint_au
where speed=0
group by uuid, unitid

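When using GROUP BY, any selected field that is not wrapped in an aggregate function should be added to the GROUP BY clause.

EDIT

Added uuid to the SELECT and GROUP BY.

I would rather suggest something different. If the lat and long don't change significantly while speed = 0, why try to compute an average at all? For those values you could truncate the last digit, or round it to the nearest 5 or 10.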

This won't reduce the number of rows. However, you should be able to get consistent values for lat and long.
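
As a rough sketch of the rounding idea: the helper name snap_coordinate and the step of 0.0005 degrees (roughly 55 m of latitude) are assumptions picked for illustration, and the latitudes are stationary rows from the question's sample output.

```php
<?php
// Round a coordinate to the nearest multiple of $step (in degrees).
function snap_coordinate($value, $step)
{
    return round($value / $step) * $step;
}

// Latitudes of stationary rows from the question's sample output.
$lats = array(-32.080402, -32.080376, -32.080364, -32.080330, -32.080265);

foreach ($lats as $lat) {
    // 0.0005 is an arbitrary step chosen for illustration.
    echo number_format(snap_coordinate($lat, 0.0005), 4), "\n";
}
```

With this step all five sample latitudes snap to the same value. A finer step keeps more precision but lets more of the drift through, so the right step depends on how noisy the stationary readings are.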