Sample records from the collection
(Document 1)
[{
"_id": ObjectId("567941aaf0058ed6755ab3dc"),
"hash_count": NumberInt(7),
"time": [
NumberInt(1450787170),
NumberInt(1450787292),
NumberInt(1450787307),
NumberInt(1450787333),
NumberInt(1450787615)
],
"word": "batman"
},
(Document 2)
{
"_id": ObjectId("567941aaf0058ed6755ab3dc"),
"hash_count": NumberInt(7),
"time": [
NumberInt(1450787170),
NumberInt(1450787292),
NumberInt(1450787307),
NumberInt(1450787333),
NumberInt(1450787354),
NumberInt(1450787526),
NumberInt(1450787615)
],
"word": "apple"
}]
The data has been stored using CCD_ 2. I want to find the number of records with a time between 1450787307 and 1450787615.
Expected answer:
apple=5
batman=3
What query should I use?
I ran this command:
{
aggregate : "hashtags",
pipeline:
[
{$match:{"time":{$gte:NumberInt(1450787307), $lte:NumberInt(1450787615)}}},
{$unwind:"$time"},
{$match:{"time":{$gte:NumberInt(1450787307), $lte:NumberInt(1450787615)}}},
{$group:{"_id":"$word","count":{$sum:1}}}
]
}
It gave this result:
Response from server:
{
"result": [
],
"ok": 1
}
Since you are using an older version of MongoDB, you cannot take advantage of the array aggregation operators introduced in 3.2.
You have to aggregate as follows:
db.collection.aggregate([
{$match:{"time":{$gte:NumberInt(1450787307), $lte:NumberInt(1450787615)}}},
{$unwind:"$time"},
{$match:{"time":{$gte:NumberInt(1450787307), $lte:NumberInt(1450787615)}}},
{$group:{"_id":"$word","count":{$sum:1}}}
])
Translated to PHP:
// NumberInt() is a mongo shell helper, not a PHP function,
// so plain PHP integer literals are used for the bounds.
$result = $c->aggregate([
    [ '$match' => [ 'time' => [ '$gte' => 1450787307,
                                '$lte' => 1450787615 ] ] ],
    [ '$unwind' => '$time' ],
    [ '$match' => [ 'time' => [ '$gte' => 1450787307,
                                '$lte' => 1450787615 ] ] ],
    [ '$group' => [ '_id' => '$word', 'count' => [ '$sum' => 1 ] ] ]
]);
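To see what each of the four stages contributes, the pipeline can be traced in plain JavaScript without a server. This is only a sketch: `countByWordUnwind` is a made-up helper name, the data is copied from the question, and the first filter mimics how a range condition on an array field (without `$elemMatch`) can be satisfied by two different elements.

```javascript
// Plain-JavaScript trace of $match -> $unwind -> $match -> $group.
// No MongoDB connection is used; the data is the sample from the question.
const sampleDocs = [
  { word: "batman", time: [1450787170, 1450787292, 1450787307, 1450787333, 1450787615] },
  { word: "apple", time: [1450787170, 1450787292, 1450787307, 1450787333,
                          1450787354, 1450787526, 1450787615] },
];

// countByWordUnwind is a hypothetical helper name, not a MongoDB API.
function countByWordUnwind(docs, lo, hi) {
  // First $match: coarse document-level prefilter. Without $elemMatch,
  // one element may satisfy $gte and a different one may satisfy $lte.
  const matched = docs.filter(
    (d) => d.time.some((t) => t >= lo) && d.time.some((t) => t <= hi)
  );
  // $unwind: one row per (word, timestamp) pair.
  const rows = matched.flatMap((d) => d.time.map((t) => ({ word: d.word, time: t })));
  // Second $match: keep only rows whose single timestamp is in range.
  const kept = rows.filter((r) => r.time >= lo && r.time <= hi);
  // $group with $sum: 1: count the remaining rows per word.
  const counts = {};
  for (const r of kept) {
    counts[r.word] = (counts[r.word] || 0) + 1;
  }
  return counts;
}

const unwindCounts = countByWordUnwind(sampleDocs, 1450787307, 1450787615);
console.log(unwindCounts); // { batman: 3, apple: 5 }
```

The second filter is the one that produces the right counts; the first only reduces how many documents get unwound.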
In version 3.2, you can use a combination of $filter and $size to get the same result at a lower cost.
db.collection.aggregate([
{$match:{"time":{$gte:NumberInt(1450787307),
$lte:NumberInt(1450787615)}}},
{$project:{"_id":0,"word":1,
"count":{$size:{$filter:
{"input":"$time",
"as":"t",
"cond":{$and:[
{$gte:["$$t",NumberInt(1450787307)]},
{$lte:["$$t",NumberInt(1450787615)]}]}
}
}
}
}}
])
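The `$filter` plus `$size` projection can also be sketched in plain JavaScript, which shows how it differs from the `$unwind` approach. `countPerDocument` is a made-up helper name, and the "edge" document is hypothetical data added only to show a corner case.

```javascript
// Plain-JavaScript sketch of $match followed by $project with $size/$filter.
// No MongoDB connection is used; the first two documents are from the question.
const questionDocs = [
  { word: "batman", time: [1450787170, 1450787292, 1450787307, 1450787333, 1450787615] },
  { word: "apple", time: [1450787170, 1450787292, 1450787307, 1450787333,
                          1450787354, 1450787526, 1450787615] },
];

// countPerDocument is a hypothetical helper name, not a MongoDB API.
function countPerDocument(docs, lo, hi) {
  return docs
    // $match: without $elemMatch, different array elements may satisfy
    // the $gte and $lte conditions separately.
    .filter((d) => d.time.some((t) => t >= lo) && d.time.some((t) => t <= hi))
    // $project: $filter keeps the in-range timestamps, $size counts them.
    .map((d) => ({
      word: d.word,
      count: d.time.filter((t) => t >= lo && t <= hi).length,
    }));
}

const perDocument = countPerDocument(questionDocs, 1450787307, 1450787615);
console.log(perDocument);
// [ { word: 'batman', count: 3 }, { word: 'apple', count: 5 } ]

// Hypothetical corner case: the document passes the $match stage
// (9 >= 3 and 1 <= 5) but has no single timestamp inside [3, 5].
const edgeCase = countPerDocument([{ word: "edge", time: [1, 9] }], 3, 5);
console.log(edgeCase); // [ { word: 'edge', count: 0 } ]
```

Two differences from the `$unwind` version are worth keeping in mind. The projection returns one result per document, not per word, so a word spread over several documents would still need a `$group` stage. A document that passes the `$match` with no in-range element also shows up with a count of 0 instead of being dropped.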
Well, after many attempts I arrived at this answer, and it is correct with 1450787615 as the lower bound and 1450855155 as the upper bound:
db.hashtags.aggregate([
{
"$match": {
"time": {
"$gte": 1450787615, "$lte": 1450855155
}
}
},
{ "$unwind": "$time" },
{
"$match": {
"time": {
"$gte": 1450787615, "$lte": 1450855155
}
}
},
{
"$group": {
"_id": "$word",
"count": {
"$sum": 1
}
}
}
])
The result looks like this:
{
"result" : [
{
"_id" : "batman",
"count" : 3
},
{
"_id" : "dear",
"count" : 1
},
{
"_id" : "ghost",
"count" : 1
}
],
"ok" : 1
}
db.collection.find({time:{$gt: 1450787307, $lt: 1450787615}});
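This will first give you a cursor over all the documents that fall within your given time range. Once you have it, you can iterate over the cursor, print each word, and use some loop logic to count the occurrences for each one. I have only used MongoDB a little, so there may be a more efficient way to do this.
Reference: https://docs.mongodb.org/v3.0/reference/method/db.collection.find/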
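The loop logic described above might look roughly like the following. This is a sketch under assumptions: `tallyInRange` is a made-up helper name, and `cursor` is a plain array standing in for the documents that `find()` returns. The comparisons here are inclusive, because the expected counts in the question include both endpoints, whereas `$gt`/`$lt` exclude them.

```javascript
// Client-side counting over the documents returned by a find() query.
// tallyInRange is a hypothetical helper name; `cursor` is any iterable of
// documents (a plain array here, standing in for a real cursor).
function tallyInRange(cursor, lo, hi) {
  const counts = new Map();
  for (const doc of cursor) {
    // Count the timestamps of this document that fall inside [lo, hi].
    let inRange = 0;
    for (const t of doc.time) {
      if (t >= lo && t <= hi) inRange += 1;
    }
    // Only record words that have at least one timestamp in range.
    if (inRange > 0) {
      counts.set(doc.word, (counts.get(doc.word) || 0) + inRange);
    }
  }
  return counts;
}

// Sample documents copied from the question.
const fetchedDocs = [
  { word: "batman", time: [1450787170, 1450787292, 1450787307, 1450787333, 1450787615] },
  { word: "apple", time: [1450787170, 1450787292, 1450787307, 1450787333,
                          1450787354, 1450787526, 1450787615] },
];

const tally = tallyInRange(fetchedDocs, 1450787307, 1450787615);
for (const [word, count] of tally) {
  console.log(`${word}=${count}`); // prints batman=3, then apple=5
}
```

This moves every matching document to the client, so the aggregation answers will usually transfer less data.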