Running an intensive batch process in PHP, and avoiding memory exhaustion

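I have a couple thousand records (stored in a table in a MySQL database) that I need to batch process. Every record contains a large JSON document. In some cases the JSON is over 1MB (yes, my database is well over 1GB).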

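I have a function that grabs a record, decodes the JSON, changes some of the data, re-encodes the PHP array back to JSON, and saves it back to the database. Pretty simple. For what it's worth, this is within the context of a CakePHP app.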

Given an array of IDs, I'm attempting to do something like this (very simple mock code):

foreach ($ids as $id) {
    $this->Model->id = $id;
    $data = $this->Model->read();
    $newData = processData($data);
    $this->Model->save($newData);
}

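The problem is that PHP runs out of memory very quickly. When running a foreach like this, it is as if PHP moves from one record to the next without releasing the memory used by the preceding operations.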

Is there a way to run the loop so that memory is freed before moving on to the next iteration, so that I can actually process this amount of data?
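One way to check whether memory really is being held across iterations is to log the usage after each record. The following is only a minimal sketch: `processOne()` and the payload size are hypothetical stand-ins for the read/process/save cycle, not CakePHP code. If the numbers stay roughly flat for a plain function like this but climb inside the real loop, the retained memory probably comes from something surrounding the loop rather than from the loop itself.

```php
<?php
// Hypothetical stand-in for one read/process/save cycle (not CakePHP code).
function processOne(int $id): int
{
    // Build and decode a large JSON payload, mimicking one big record.
    $json = json_encode(['id' => $id, 'blob' => str_repeat('x', 200000)]);
    $data = json_decode($json, true);
    return strlen($data['blob']);
}

$ids = range(1, 5);
$usage = [];
foreach ($ids as $id) {
    processOne($id);
    // Record the memory still held once this iteration has finished.
    $usage[$id] = memory_get_usage();
    printf(
        "id %d: %.2f MB in use, %.2f MB peak\n",
        $id,
        $usage[$id] / 1048576,
        memory_get_peak_usage() / 1048576
    );
}
```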

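Edit: Adding more code. This function takes my JSON, converts it to a PHP array, does some manipulation (namely, reconfiguring data based on what is present in another array), and replaces values in the original array. The JSON is many layers deep, hence the extremely long foreach loops.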

function processData($theData) {
    // NOTE: $newFolderList is not defined in this excerpt; it has to be in scope
    // here (for example, passed in as an argument) for the isset() checks to match.
    $toConvert = json_decode($theData['Program']['data'], true); // true => associative arrays
    foreach($toConvert['cycles'] as $cycle => $val) {
        foreach($toConvert['cycles'][$cycle]['days'] as $day => $val) {
            // The original listed this sections loop twice, nested in itself; once is enough.
            foreach($toConvert['cycles'][$cycle]['days'][$day]['sections'] as $section => $val) {
                foreach($toConvert['cycles'][$cycle]['days'][$day]['sections'][$section]['exercises'] as $exercise => $val) {
                    // Replace the selected folder with its ID from the new folder list.
                    if (isset($toConvert['cycles'][$cycle]['days'][$day]['sections'][$section]['exercises'][$exercise]['selectedFolder'])) {
                        $folderName = $toConvert['cycles'][$cycle]['days'][$day]['sections'][$section]['exercises'][$exercise]['selectedFolder']['folderName'];
                        if ( isset($newFolderList['Folders'][$folderName]) ) {
                            $toConvert['cycles'][$cycle]['days'][$day]['sections'][$section]['exercises'][$exercise]['selectedFolder'] = $newFolderList['Folders'][$folderName]['id'];
                        }
                    }
                    // Replace the selected file with its ID, looked up by file name.
                    if (isset($toConvert['cycles'][$cycle]['days'][$day]['sections'][$section]['exercises'][$exercise]['selectedFile'])) {
                        $fileName = basename($toConvert['cycles'][$cycle]['days'][$day]['sections'][$section]['exercises'][$exercise]['selectedFile']['fileURL']);
                        if ( isset($newFolderList['Exercises'][$fileName]) ) {
                            $toConvert['cycles'][$cycle]['days'][$day]['sections'][$section]['exercises'][$exercise]['selectedFile'] = $newFolderList['Exercises'][$fileName]['id'];
                        }
                    }
                }
            }
        }
    }
    return $toConvert;
}

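Model->read() essentially just tells Cake to pull a record from the db and return it in an array. There is plenty happening behind the scenes; someone more knowledgeable would have to explain that.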

The first step I would take is to make sure everything is passed by reference.

For example:

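foreach ($ids as $id) {
    // $data is the record loaded for $id (loading omitted here).
    processData($data);
}

// The & makes $d an alias of the caller's variable instead of a copy.
function processData(&$d) {}

http://php.net/manual/en/language.references.pass.php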
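To make the by-reference idea concrete, here is a minimal sketch that rewrites the nested array in place rather than building and returning a modified copy. The function name `remapFolders()`, the `$folderIds` lookup, and the sample JSON are all hypothetical and chosen only for illustration; they mirror the `cycles` / `days` / `sections` / `exercises` layout from the question.

```php
<?php
// Hypothetical helper: replaces each exercise's selectedFolder with an ID, in place.
// $folderIds is an assumed lookup of folder name => id (not from the original post).
function remapFolders(array &$program, array $folderIds): void
{
    foreach ($program['cycles'] as &$cycle) {
        foreach ($cycle['days'] as &$day) {
            foreach ($day['sections'] as &$section) {
                foreach ($section['exercises'] as &$exercise) {
                    $name = $exercise['selectedFolder']['folderName'] ?? null;
                    if ($name !== null && isset($folderIds[$name])) {
                        $exercise['selectedFolder'] = $folderIds[$name];
                    }
                }
                unset($exercise); // drop the lingering reference after each loop
            }
            unset($section);
        }
        unset($day);
    }
    unset($cycle);
}

$json = '{"cycles":[{"days":[{"sections":[{"exercises":'
      . '[{"selectedFolder":{"folderName":"Legs"}}]}]}]}]}';
$program = json_decode($json, true);

remapFolders($program, ['Legs' => 7]); // $program itself is modified
echo json_encode($program), "\n";
```

Iterating with `&$cycle`, `&$day` and so on also shortens the long index chains, since each level is addressed through its own reference. The `unset()` calls after each loop matter: without them the loop variable stays bound to the last element and a later assignment could overwrite it.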