Skip to main content
 首页 » 编程设计

mysql之MongoDB 聚合查询与 MySQL SELECT field1 FROM 表

2024年02月13日18jillzhang

我对 MongoDB 完全陌生,想要比较 NoSQL 数据模型相对于关系数据库对应部分的查询性能。我将其写入 MongoDB shell

// Make 10 businesses 
// Each business has 10 locations 
// Each location has 10 departments 
// Each department has 10 teams 
// Each team has 100 employees 
(new Array(10)).fill(0).forEach(_=> 
    db.businesses.insert({ 
        "name":"Business Name", 
        "locations":(new Array(10)).fill(0).map(_=>({ 
            "name":"Office Location", 
            "departments":(new Array(10)).fill(0).map(_=>({ 
                "name":"Department", 
                "teams":(new Array(10)).fill(0).map(_=>({ 
                    "name":"Team Name", 
                    "employees":(new Array(100)).fill(0).map(_=>({ 
                        "age":Math.floor(Math.random()*100) 
                    })) 
                })) 
            })) 
        })) 
    }) 
); 

然后我尝试了相当于 MySQL 的 EXPLAIN SELECT age,name,(and a few other fields) FROM employees WHERE age >= 50 ORDER BY age DESC通过写这个声明:

db.businesses.aggregate([ 
    { $unwind: "$locations" }, 
    { $unwind: "$locations.departments" }, 
    { $unwind: "$locations.departments.teams" }, 
    { $unwind: "$locations.departments.teams.employees" }, 
    { $project: { _id: 0, age: "$locations.departments.teams.employees.age" } }, 
    { $match: { "age": { $gte: 50 }} }, 
    { $sort: {"age" : -1}} 
]).explain("executionStats") 

结果是:

"errmsg" : "Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting. Aborting operation. Pass allowDiskUse:true to opt in.",

所以我删除了排序子句并尝试获得 explain 。但结果是:

TypeError: db.businesses.aggregate(...).explain is not a function

所以我的问题是:

  1. 主要是想知道 SELECT age FROM employees WHERE age >= 50 ORDER BY age DESC 的性能差异与 MongoDB 的聚合查询对应部分相比。或多或少是一样的吗?其中一个会比另一个更快或更高效吗?

  2. 或者,如何修复 MongoDB 查询,以便获得性能详细信息以与 MySQL 查询对应部分进行比较?

请您参考如下方法:

员工是单一实体;因此,您可能不想在部门、地点和团队的丰富结构中如此深入地对团队成员的年龄进行建模。拥有一个单独的 employees 集合是完全可以的,只需执行以下操作即可:

db.businesses.aggregate([ 
{$match: {"age": {$gt: 50} }} 
,{$sort: {"age": -1} } 
]); 

在您的企业收藏深处,您可以拥有:

{ teams: [ {name: "T1", employees: [ "E1", "E34" ]} ] } 

或者,试试这个:

db.businesses.aggregate([ your pipeline] ,{allowDiskUse:true}); 

OP 的设置为 10 个业务 -> 10 个地点 -> 10 个部门 -> 10 个团队 -> 100 名雇员。前 3 次展开会产生 10000 倍的数据爆炸,但最后一次是 100 倍以上。我们可以使用 $filter 来缩小命中:

db.businesses.aggregate([ 
{ $unwind: "$locations" }, 
{ $unwind: "$locations.departments" }, 
{ $unwind: "$locations.departments.teams" }, 
 
{$project: { 
        XX: {$filter: { 
                    input: "$locations.departments.teams.employees", 
                    as: "z", 
                    cond: {$gte: [ "$$z.age", 50] } 
            }} 
    }} 
,{$unwind: "$XX"} 
,{$sort: {"XX.age":-1}}])