Q: How efficient is it to fetch a subset of a document in MongoDB?

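Stack Overflow user

Asked 2012-12-20 17:31:54

2 answers · 1K views · 0 followers · 3 votes

Suppose we have a collection photos, where each entry is a large document holding all the information about a photo, including views, details, and detailed upvotes/downvotes.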

{
_id:ObjectId('...'),
title:'...',
location:'...',
views:[
    {...},
    {...},
    ...
    ],
upvotes:[
    {...},
    {...},
    ...
    ],
downvotes:[
    {...},
    {...},
    ...
    ],
}

Which query is faster and more efficient (memory, CPU usage):

db.photos.find().limit(100)

or

db.photos.find({}, {views:0,upvotes:0,downvotes:0}).limit(100)

performance

mongodb

2 Answers

Stack Overflow user

Accepted answer

Posted 2012-12-20 17:38:56

You can check this yourself. Just append explain() to the end of the query.

For example:

db.photos.find().limit(100).explain()


{
  "cursor" : "<Cursor Type and Index>",
  "isMultiKey" : <boolean>,
  "n" : <num>,
  "nscannedObjects" : <num>,
  "nscanned" : <num>,
  "nscannedObjectsAllPlans" : <num>,
  "nscannedAllPlans" : <num>,
  "scanAndOrder" : <boolean>,
  "indexOnly" : <boolean>,
  "nYields" : <num>,
  "nChunkSkips" : <num>,
  "millis" : <num>,
  "indexBounds" : { <index bounds> },
  "allPlans" : [
                 { "cursor" : "<Cursor Type and Index>",
                   "n" : <num>,
                   "nscannedObjects" : <num>,
                   "nscanned" : <num>,
                   "indexBounds" : { <index bounds> }
                 },
                  ...
               ],
  "oldPlan" : {
                "cursor" : "<Cursor Type and Index>",
                "indexBounds" : { <index bounds> }
              },
  "server" : "<host:port>"
}

The millis field is the one you want.

If you want to see CPU usage, just add the --cpu option to your mongod startup script.

--cpu
Forces mongod to report the percentage of CPU time in write lock. mongod generates output every four seconds. MongoDB writes this data to standard output or the logfile if using the logpath option.

http://docs.mongodb.org/manual/reference/explain/

As for combining hint() with a projection, you can give mongo something like this:

We have a simple collection:

> db.performance.findOne()
{
        "_id" : ObjectId("50d2e4c08861fdb7e1c601ea"),
        "a" : 1,
        "b" : 1,
        "c" : 1,
        "d" : 1
}

It contains 23 documents:

> db.performance.count()
23

Now we can create a compound index:

> db.performance.ensureIndex({'c':1, 'd':1})

and give mongo a hint to use that index for the projection:

> db.performance.find({'a':1}, {'c':1, 'd':1}).hint({'c':1, 'd':1}).explain()
{
        "cursor" : "BtreeCursor c_1_d_1",
        "isMultiKey" : false,
        "n" : 1,
        "nscannedObjects" : 23,
        "nscanned" : 23,
        "nscannedObjectsAllPlans" : 23,
        "nscannedAllPlans" : 23,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 0,
        "indexBounds" : {
                "c" : [
                        [
                                {
                                        "$minElement" : 1
                                },
                                {
                                        "$maxElement" : 1
                                }
                        ]
                ],
                "d" : [
                        [
                                {
                                        "$minElement" : 1
                                },
                                {
                                        "$maxElement" : 1
                                }
                        ]
                ]
        },
        "server" : ""
}
>
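Note that the explain output above still reports indexOnly: false, even though the hint forces the c_1_d_1 index. A likely reason is that the filter is on a, which is not in the index, and _id is returned by default. Here is a minimal sketch of that rule in plain JavaScript. isCoverable is a hypothetical helper written for illustration, not part of the MongoDB API, and it ignores other conditions such as multikey indexes:

```javascript
// Simplified check: could this query be answered from the index alone?
// Hypothetical helper for illustration; not a MongoDB function.
function isCoverable(indexKeys, filter, projection) {
  const indexed = new Set(Object.keys(indexKeys));

  // Every field the filter touches must be part of the index.
  const filterOk = Object.keys(filter).every((f) => indexed.has(f));

  // Split the projection into included (1) and excluded (0) fields.
  const included = Object.keys(projection).filter((f) => projection[f]);
  const excluded = Object.keys(projection).filter((f) => !projection[f]);

  // Only inclusion projections can be covered; the one exclusion allowed is _id.
  const onlyIdExcluded = excluded.every((f) => f === "_id");
  const idExcluded = "_id" in projection && !projection._id;

  // _id comes back by default unless it is explicitly excluded.
  const returned = included.filter((f) => f !== "_id");
  if (!idExcluded) returned.push("_id");

  const projectionOk =
    included.length > 0 &&
    onlyIdExcluded &&
    returned.every((f) => indexed.has(f));

  return filterOk && projectionOk;
}

// The query from the answer: filter on a, _id not excluded.
console.log(isCoverable({ c: 1, d: 1 }, { a: 1 }, { c: 1, d: 1 })); // false

// Filter on an indexed field and drop _id.
console.log(isCoverable({ c: 1, d: 1 }, { c: 1 }, { c: 1, d: 1, _id: 0 })); // true
```

Under these assumptions, a query such as db.performance.find({'c':1}, {'c':1, 'd':1, '_id':0}) is the shape that can come back with indexOnly: true. With an empty filter the server will normally not pick an index at all unless hinted, so the check is necessary but not sufficient.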

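3 votes

Stack Overflow user

Posted 2012-12-20 17:51:15

There are actually two sides to this story: the application and the server.

In the application, the second query will be faster. The application won't have to deserialize the full BSON document (CPU intensive) and then store a hash of data it doesn't need (memory intensive).

On the server side, MongoDB can send more documents before a getMore operation is needed, which allows more iterations per cursor batch and so improves performance. On top of that, you are of course sending less data over the wire. getMore operations themselves are resource intensive, for both memory and CPU, so this is a way to save resources.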
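To get a feel for how much smaller the projected payload can be, here is a rough sketch in plain JavaScript. The document contents, the makePhoto and exclude helpers, and the array sizes are all made up for illustration, and JSON length is only a stand-in for BSON size:

```javascript
// Build a photo document shaped like the one in the question.
// All field contents here are invented for illustration.
function makePhoto(nViews, nVotes) {
  const entry = (i) => ({ user: "user" + i, at: 1356024714000 + i });
  return {
    _id: "photo-1",
    title: "Sunset",
    location: "Lisbon",
    views: Array.from({ length: nViews }, (_, i) => entry(i)),
    upvotes: Array.from({ length: nVotes }, (_, i) => entry(i)),
    downvotes: Array.from({ length: nVotes }, (_, i) => entry(i)),
  };
}

// Mimic an exclusion projection such as {views:0, upvotes:0, downvotes:0}.
function exclude(doc, fields) {
  const out = {};
  for (const key of Object.keys(doc)) {
    if (!fields.includes(key)) out[key] = doc[key];
  }
  return out;
}

// JSON length is only a rough proxy for the BSON size sent over the wire.
const size = (doc) => JSON.stringify(doc).length;

const full = makePhoto(1000, 200);
const slim = exclude(full, ["views", "upvotes", "downvotes"]);

console.log("full:", size(full), "projected:", size(slim));
```

The smaller each document is, the more of them fit into one batch, which is why fewer getMore round trips are needed for the same 100 results.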

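As for inside the server itself, projection has a small cost, but that cost will be smaller than the cost of sending the whole document.

Edit

As others have said, MongoDB actually applies the projection to the result set, so you will have the same working set for both queries.

Edit

Here are the results of using an index with a projection: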

> db.g.insert({a:1,b:1,c:1,d:1})
> db.g.ensureIndex({ a:1,b:1,c:1 })
> db.g.find({}, {a:0,b:0,c:0}).explain()
{
        "cursor" : "BasicCursor",
        "nscanned" : 3,
        "nscannedObjects" : 3,
        "n" : 3,
        "millis" : 0,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {

        }
}
> db.g.find({}, {a:1,b:1,c:1}).explain()
{
        "cursor" : "BasicCursor",
        "nscanned" : 3,
        "nscannedObjects" : 3,
        "n" : 3,
        "millis" : 0,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {

        }
}

And here is the result without a projection:

> db.g.find({}).explain()
{
        "cursor" : "BasicCursor",
        "nscanned" : 3,
        "nscannedObjects" : 3,
        "n" : 3,
        "millis" : 0,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {

        }
}

As you can see, millis, which represents the time spent on the query, is actually the same in every case: 0. So explain is not a good way to measure this.
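One alternative, which is my own assumption and not something the runs above used, is to time the whole call on the client over several runs and take the median. A minimal sketch in plain JavaScript follows. median, medianMillis, and fakeQuery are hypothetical names, and fakeQuery is just a stand-in workload; in the mongo shell you would pass something like a function that runs the find and calls toArray() on it:

```javascript
// Median of a list of numbers (does not modify the input).
function median(values) {
  const sorted = [...values].sort((x, y) => x - y);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Use a high-resolution clock when available, otherwise fall back to Date.now().
const now =
  typeof performance !== "undefined" && typeof performance.now === "function"
    ? () => performance.now()
    : () => Date.now();

// Run fn synchronously `runs` times and return the median duration in ms.
function medianMillis(fn, runs) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = now();
    fn();
    samples.push(now() - start);
  }
  return median(samples);
}

// Stand-in workload; replace with the real query call.
const fakeQuery = () =>
  JSON.stringify(Array.from({ length: 1000 }, (_, i) => ({ a: i })));

console.log("median ms:", medianMillis(fakeQuery, 21));
```

This measures the full round trip, including transfer and deserialization, which is the part the projection is supposed to help with. The sketch is synchronous; with an asynchronous driver the call would need to be awaited inside the loop.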

Another edit

Excluding _id does not make a covered index apply:

> db.g.find({}, {a:1,b:1,c:1,_id:0}).explain()
{
        "cursor" : "BasicCursor",
        "nscanned" : 3,
        "nscannedObjects" : 3,
        "n" : 3,
        "millis" : 0,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {

        }
}

Yet another edit

And with 300K rows:

> db.g.find({}, {a:1,b:1,c:1}).explain()
{
        "cursor" : "BasicCursor",
        "nscanned" : 300003,
        "nscannedObjects" : 300003,
        "n" : 300003,
        "millis" : 95,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {

        }
}

> db.g.find({}).explain()
{
        "cursor" : "BasicCursor",
        "nscanned" : 300003,
        "nscannedObjects" : 300003,
        "n" : 300003,
        "millis" : 85,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {

        }
}

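So projection really does cost more on a huge result set, but bear in mind that this is a projection over 300K rows... I mean, WTF? Who in their right mind would do that? So that part of the argument doesn't really hold. Either way, on my hardware the difference is about 10 ms, roughly 1/10 of the query time, so projection is not your problem here.

I should also note that the --cpu flag won't give you what you want: for starters, it actually focuses on the write lock, and your query is a read.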
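For reference, the arithmetic behind that "about 1/10" figure, using the two millis values from the 300K-row runs above (95 with the projection, 85 without). relativeOverhead is a hypothetical helper name, and each value comes from a single run, so treat the result as a rough indication only:

```javascript
// Extra cost of the projected query, as a fraction of the unprojected one.
function relativeOverhead(withProjection, withoutProjection) {
  return (withProjection - withoutProjection) / withoutProjection;
}

const diff = 95 - 85;                     // 10 ms
const ratio = relativeOverhead(95, 85);   // 10 / 85

console.log(diff + " ms", (ratio * 100).toFixed(1) + "%"); // 10 ms 11.8%
```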

5 votes

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/13969001
