我有两个大数组:
- 数组A有4900个项目(每个项目是一个小数组)
- 数组B有700个项目(每个项目也是一个小数组)
基本上这些就是我的数组:
A = array (
[0] => array(
"name" => "KE-KE IMPEX ",
"email" => "someemai@gmail.com",
"kezbCompany" => "Fragrance",
"startDate" => "2013-03-25 00:00:00",
"endDate" => "2014-03-25 00:00:00",
"companyBase" => "06 20 232 2534"
)
...
[4900] => array(
"name" => "Jane Doe",
"email" => "zzer@sad.com",
"kezbCompany" => "sadsad",
"startDate" => "2013-03-25 00:00:00",
"endDate" => "2014-03-25 00:00:00",
"companyBase" => "06 20 232 2534"
)
)
B = array (
[0] => array(
"name" => "KE-KE IMPEX 46554 sda",
"email" => "xxx@gmail.com",
"kezbCompany" => "546wer",
"startDate" => "2013-03-25 00:00:00",
"endDate" => "2014-03-25 00:00:00",
"companyBase" => "06 20 232 2534"
)
...
[700] => array(
"name" => "45 Jane Doe",
"email" => "kekeimpex@gmail.com",
"kezbCompany" => "asd",
"startDate" => "2013-03-25 00:00:00"
)
)
小物件看起来像这样(A和B摊位):
array(
'name' => 'John Doe',
'email' => 'john@doe.com'
)
那么我需要做的是:检查哪个小数组有相同的名字
但是请记住,大多数情况下,这两个小数组的结构不会相同。
例如,他们的电子邮件是不同的。现在,如果我先循环A,然后再循环B,这会花费很多时间
这是我当前的代码:
$szData = file_get_contents('szData.txt');
$kData = file_get_contents('kData.txt');
$A = json_decode($szData);
$B = json_decode($kData);
$foundNr = 0;
foreach ($A as $key => $sz)
{
$cName = $sz->companyName;
foreach ($B as $index => $k)
{
$pattern = '/^(.*)+('.$cName.')/i';
echo "SzSor: " . $key . " --- Ksor: " . $index . "</br>";
if (preg_match($pattern, $k->companyName))
{
$founData[] = $k->companyName;
++$foundNr;
}
}
}
任何想法?
一种解决方案是首先将子数组中的所有数据合并为内爆字符串(John Doe-john@doe.com
),然后使用快速函数比较匹配的字符串:
$szData = file_get_contents('szData.txt');
$kData = file_get_contents('kData.txt');
$A = json_decode($szData);
$B = json_decode($kData);
$foundNr = 0;
$B_sample = array();
foreach ($B as $index => $data){
$B_sample[$index] = implode('-',$data);
//or use some other custom procedure to create unique string from the data.
}
$B_sample = array();
foreach ($A as $index => $data){
$A_sample[$index] = implode('-',$data);
}
$A_B_intersect = array_intersect ($A_sample , $B_sample ); // the KEYS in the result array are from $A
$foundNr = count($A_B_intersect);
foreach($A_B_intersect as $key->$data){
$founData[] = $A[$key]->companyName;
}
这有额外的好处,如果你可以创建$A_sample
,同时构建$A
,重用相同的循环。
您最好尽量避免使用自己的循环之类的东西,而尽量使用内置的PHP函数,如:
in_array:
摘自docs的代码:
<?php
$os = array("Mac", "NT", "Irix", "Linux");
if (in_array("Irix", $os)) {
echo "Got Irix";
}
if (in_array("mac", $os)) {
echo "Got mac";
}
?>
The second condition fails because in_array() is case-sensitive, so the program above will display:
Got Irix
Example #2 in_array() with strict example
<?php
$a = array('1.10', 12.4, 1.13);
if (in_array('12.4', $a, true)) {
echo "'12.4' found with strict check'n";
}
if (in_array(1.13, $a, true)) {
echo "1.13 found with strict check'n";
}
?>
The above example will output:
1.13 found with strict check
因此,虽然您可能仍然需要循环遍历一个数组来比较所有可能的值,但最有效的代码通常来自使用内置函数。
说了这么多,写几个简单的循环并测试一下,看看哪个性能最好。如果您知道数据的结构,您可能能够完全跳过许多元素—例如,如果您知道数据将以字母顺序出现,并且您正在寻找以"S"开头的内容,则执行一次跳过100条记录的循环,直到获得"S"值—然后返回到它之前的最后一个条目。诸如此类。这是一种使代码快速高效的思维方式——如果您可以从100个元素中跳过99个元素,在5000个元素中的前3500个元素中不执行逐个元素的搜索。