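I have two sets of csv files: one contains contract data and the other contains awarded contracts. I need to combine the two csv files on the common field (contractName) and calculate the total amount of the awarded contracts that are closed. (Links to the csv files were attached.)
So far I have managed to concatenate the two csv files and write the result to final.csv, but I cannot merge them on the common field (contractName). Here is the code: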
<?php
$nn = 0;
foreach (glob("*.csv") as $filename) {
    if (($handle = fopen($filename, "r")) !== FALSE) {
        while (($data = fgetcsv($handle, 0, ",")) !== FALSE) {
            $c = count($data);
            for ($x = 0; $x < $c; $x++) {
                $csvarray[$nn][] = $data[$x];
            }
            $nn++;
        }
        fclose($handle);
    }
}
$fp = fopen('../final.csv', 'w'); // output file set here
foreach ($csvarray as $fields) {
    fputcsv($fp, $fields);
}
fclose($fp);
?>
This is my final output:
contractName,contractDate,completionDate,awardee,awardeeLocation,Amount
Contract-2070-3,5/9/14,8/25/14,"SK Builders",Banke,200000
Contract-2070-5,3/18/14,4/8/14,"S engineering industries",Makwanpur,300000
Contract-2070-9,3/6/14,4/6/14,"Gourishankar nirman sewa",Lalitpur,400000
Contract-2070-10,2/6/14,6/16/14,"SK Builders",Banke,500000
contractname,status,bidPurchaseDeadline,bidSubmissionDeadline,bidOpeningDate,tenderid,publicationDate,publishedIn
Contract-2070-1,Closed,6/12/14,6/13/14,6/13/14,2070/071/2,5/14/14,"Nagarik Daily"
Contract-2070-2,Closed,6/10/14,6/11/14,6/11/14,16/070/71,5/12/14,"The Himalayan Times"
Contract-2070-3,Current,3/8/14,3/9/14,3/9/14,DDC/Bag/Bridge/03-070/71,3/10/14,"Nagarik Daily"
Contract-2070-4,Current,4/23/14,4/25/14,4/25/14,04(2070/071),4/9/14,"Hetauda sandesh"
Contract-2070-5,Closed,4/23/14,4/25/14,4/26/14,04(2070/071),4/10/14,"Hetauda sandesh"
Contract-2070-6,Current,4/23/14,4/25/14,4/27/14,04(2070/071),4/11/14,"Hetauda sandesh"
Contract-2070-7,Current,4/23/14,4/25/14,4/28/14,04(2070/071),4/12/14,"Hetauda sandesh"
Contract-2070-8,Current,4/23/14,4/25/14,4/29/14,04(2070/071),4/13/14,"Hetauda sandesh"
Contract-2070-9,Closed,2/6/14,2/8/14,2/8/14,15/070/71,1/9/14,"The Himalayan Times"
Contract-2070-10,Current,1/14/14,1/15/14,1/16/14,"13,2070/2071",1/6/14,"The Himalayan Times"
But the final output should look like this:
contractname,status,bidPurchaseDeadline,bidSubmissionDeadline,bidOpeningDate,tenderid,publicationDate,publishedIn,contractDate,completionDate,awardee,awardeeLocation,Amount
Contract-2070-1,Closed,6/12/14,6/13/14,6/13/14,2070/071/2,5/14/14,Nagarik Daily,,,,,
Contract-2070-2,Closed,6/10/14,6/11/14,6/11/14,16/070/71,5/12/14,The Himalayan Times,,,,,
Contract-2070-3,Current,3/8/14,3/9/14,3/9/14,DDC/Bag/Bridge/03-070/71,3/10/14,Nagarik Daily,5/9/14,8/25/14,SK Builders,Banke,200000
Contract-2070-4,Current,4/23/14,4/25/14,4/25/14,04(2070/071),4/9/14,Hetauda sandesh,,,,,
Contract-2070-5,Closed,4/23/14,4/25/14,4/26/14,04(2070/071),4/10/14,Hetauda sandesh,3/18/14,4/8/14,S engineering industries,Makwanpur,300000
Contract-2070-6,Current,4/23/14,4/25/14,4/27/14,04(2070/071),4/11/14,Hetauda sandesh,,,,,
Contract-2070-7,Current,4/23/14,4/25/14,4/28/14,04(2070/071),4/12/14,Hetauda sandesh,,,,,
Contract-2070-8,Current,4/23/14,4/25/14,4/29/14,04(2070/071),4/13/14,Hetauda sandesh,,,,,
Contract-2070-9,Closed,2/6/14,2/8/14,2/8/14,15/070/71,1/9/14,The Himalayan Times,3/6/14,4/6/14,Gourishankar nirman sewa,Lalitpur,400000
Contract-2070-10,Current,1/14/14,1/15/14,1/16/14,"13, 2070/2071",1/6/14,The Himalayan Times,2/6/14,6/16/14,SK Builders,Banke,500000
This problem is not that hard. You can always load the csv data into arrays and work with those, as in this solution:
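// 1st section: read both files into arrays of rows
$awards = array();
$contracts = array();
$fh = fopen('awards.csv', 'r');
$fhg = fopen('contracts.csv', 'r');
while (($data = fgetcsv($fh, 0, ",")) !== FALSE) {
    $awards[] = $data;
}
while (($data = fgetcsv($fhg, 0, ",")) !== FALSE) {
    $contracts[] = $data;
}
fclose($fh);
fclose($fhg);
// 2nd section: merge the rows whose first field (contractName) matches
$line = array();
for ($x = 0; $x < count($contracts); $x++) {
    if ($x == 0) {
        $header = $awards[0];
        unset($header[0]); // drop the duplicate contractName column
        $line[$x] = array_merge($contracts[0], $header); // header
    } else {
        $deadlook = 0;
        // start at 1 to skip the awards header; "<" keeps $y inside the array
        for ($y = 1; $y < count($awards); $y++) {
            if ($awards[$y][0] == $contracts[$x][0]) {
                $award = $awards[$y]; // work on a copy so $awards stays intact
                unset($award[0]);
                $line[$x] = array_merge($contracts[$x], $award);
                $deadlook = 1;
                break;
            }
        }
        if ($deadlook == 0) {
            $line[$x] = $contracts[$x];
        }
    }
}
// 3rd section: write the merged rows
$fp = fopen('final.csv', 'w'); // output file set here
foreach ($line as $fields) {
    fputcsv($fp, $fields);
}
fclose($fp);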
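It is hard for me to explain the code because I am from Spain and my English is not very good, but I can try.
Basically the code has three sections.
In section 1, the two files are opened and their contents are put into the arrays $awards[] and $contracts[], so $awards[0] is the first line of awards.csv, $awards[1] is the second line of awards.csv, and so on. The same goes for $contracts.
In section 2, I compare the first word of the rows in each array, $awards[$y][0] and $contracts[$x][0].
The first if, if($x==0), builds the header. First I use the unset function to remove the first word, contractName, and then I use the array_merge function to join $contracts[0] and $awards[0].
Then, with the two for loops, I take the first word of each row of the $contracts array and compare it with the first word of each row of the $awards array. So if($awards[$y][0] == $contracts[$x][0]) checks whether the first words (e.g. Contract-2070-3) are the same; if they are the same string, I remove it from the award row and merge the two rows.
If no award row has the same word, the $contracts[$x] row is saved in the $line array unchanged and the loop continues.
In section 3, the contents of the $line array are saved to the file.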
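The code above leaves contracts without an award at their original length, while the expected output in the question pads them with empty fields. Here is a rough sketch of how section 2 could do that. The function name joinOnFirstColumn is made up for illustration, and it assumes both arrays start with a header row and that contractName is the first column in each file.

```php
<?php
/**
 * Left-join contract rows with award rows on their first column.
 * Both inputs are lists of rows as returned by fgetcsv(), header row first.
 * (Hypothetical helper, not part of the original answer.)
 */
function joinOnFirstColumn(array $contracts, array $awards): array
{
    // Award header without its key column (contractName).
    $awardHeader = array_slice(array_shift($awards), 1);
    // Empty fields used for contracts that have no award.
    $padding = array_fill(0, count($awardHeader), '');

    // Index the remaining award rows by contract name, dropping the key column.
    $awardsByName = array();
    foreach ($awards as $row) {
        $awardsByName[$row[0]] = array_slice($row, 1);
    }

    // First output row is the combined header.
    $joined = array(array_merge(array_shift($contracts), $awardHeader));
    foreach ($contracts as $row) {
        $extra = isset($awardsByName[$row[0]]) ? $awardsByName[$row[0]] : $padding;
        $joined[] = array_merge($row, $extra);
    }
    return $joined;
}
```

With $contracts and $awards filled as in section 1, the rows returned by joinOnFirstColumn($contracts, $awards) can be written with the same fputcsv loop as in section 3. Indexing the awards once also avoids the nested loop.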
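Is this some kind of interview question? Do you need to show that you can write an algorithm, or that you can solve the problem in a realistic way?
For a large data set I would probably just dump the csv files into an SQLite database, one table per csv, and then join them with a query.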
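A rough sketch of that SQLite idea, assuming the pdo_sqlite extension is available. The helper names loadCsvRows and closedAwardTotalSql and the table names contracts and awards are my own choices for illustration; the column names come from the csv headers in the question, and every row is assumed to have as many fields as its header.

```php
<?php
// Hypothetical helper: create table $table from csv rows (header row first)
// and insert the remaining rows. Column names are taken from the header.
function loadCsvRows(PDO $db, string $table, array $rows): void
{
    $header = array_shift($rows);
    $columns = implode(', ', array_map(function ($name) {
        return '"' . str_replace('"', '""', $name) . '"';
    }, $header));
    $db->exec(sprintf('CREATE TABLE "%s" (%s)', $table, $columns));

    $marks = implode(', ', array_fill(0, count($header), '?'));
    $insert = $db->prepare(sprintf('INSERT INTO "%s" VALUES (%s)', $table, $marks));
    foreach ($rows as $row) {
        $insert->execute($row);
    }
}

// Hypothetical helper: total Amount of awards whose contract is Closed.
function closedAwardTotalSql(PDO $db): int
{
    $sql = "SELECT COALESCE(SUM(CAST(a.Amount AS INTEGER)), 0)
            FROM contracts c
            JOIN awards a ON a.contractName = c.contractname
            WHERE c.status = 'Closed'";
    return (int) $db->query($sql)->fetchColumn();
}
```

The rows can come from fgetcsv as in the other answer, and new PDO('sqlite::memory:') is enough for a throwaway database. A LEFT JOIN over the same two tables would give the merged listing, with NULLs for contracts that have no award.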
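Alternatively, you can fill two associative arrays, $contracts and $awards, one per csv, using the join field (contractName) as the key of both arrays.
Then loop over the keys and fill a $final array with the contents of each array for the given key: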
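$final = array();
$keys = array_keys($contracts);
foreach ($keys as $key) {
    // not every contract has an award, so fall back to an empty array
    $award = isset($awards[$key]) ? $awards[$key] : array();
    $final[] = array_merge($contracts[$key], $award);
}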
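For completeness, here is one way the two keyed arrays could be built, plus the total the question asks for. Treat it as a sketch: readKeyedCsv and sumClosedAwards are hypothetical names, the first column of each file is assumed to be the contract name, and every data row is assumed to have as many fields as the header (array_combine fails otherwise).

```php
<?php
/**
 * Read a csv file into an array keyed by its first column.
 * Each value maps header names to cell values; the header row is consumed.
 * (Hypothetical helper for illustration.)
 */
function readKeyedCsv(string $path): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $rows = array();
    $header = fgetcsv($handle, 0, ',', '"', '\\');
    while (($data = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($data === array(null)) {
            continue; // fgetcsv returns array(null) for a blank line
        }
        $rows[$data[0]] = array_combine($header, $data);
    }
    fclose($handle);
    return $rows;
}

/**
 * Sum the Amount of every award whose contract has status "Closed".
 */
function sumClosedAwards(array $contracts, array $awards): int
{
    $total = 0;
    foreach ($contracts as $name => $contract) {
        if ($contract['status'] === 'Closed' && isset($awards[$name])) {
            $total += (int) $awards[$name]['Amount'];
        }
    }
    return $total;
}
```

With the sample data in the question, only Contract-2070-5 and Contract-2070-9 are both Closed and awarded, so sumClosedAwards(readKeyedCsv('contracts.csv'), readKeyedCsv('awards.csv')) should give 300000 + 400000 = 700000. Note that rows built this way use header names as keys, so array_merge on them overwrites same-named keys instead of appending.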