What? MongoDB 8.0 Speeds Up Aggregations Without Indexes? That's Crazy!

AustinDatabases
Published 2026-06-15 10:14:05

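A new project of mine may use MongoDB heavily, and I am not yet familiar with the latest 8.0 release. I am especially interested in its columnar processing, so this is knowledge we need to keep up with. Otherwise, when developers and architects ask you questions and all you have is old MongoDB knowledge, you are in trouble. With AI everywhere, architects will ask AI too, so the competition is brutal.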

With operating system versions moving up as well, I also want to see how well Linux 9 and MongoDB 8 fit together, so this time I am installing a fresh MongoDB 8.3 on Linux 9 to check.

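Before getting to the installation, let's settle one core point: SBE, the slot-based execution engine. SBE was already present in 7.0, but 8.0 is the release where it takes off, with a large share of queries going to the SBE query engine by default. This changes how MongoDB runs compared with before: performance used to be decided by I/O, whereas now it is decided by CPU and cache hits.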

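Also note that MongoDB is no longer the MongoDB it used to be. It is now a database product that combines column store, vector search, time series, aggregation, and change streams.

So 8.0 really is a MongoDB release that DBAs should take the time to learn.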

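As before, on Linux we disable transparent huge pages (THP), set vm.max_map_count, lower swappiness, and raise the file handle limits. One caveat: the MongoDB production notes for 8.0 and later recommend keeping THP enabled because of the upgraded TCMalloc, so verify this step against the notes for your version before applying it.

Here are the relevant commands again.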

[root@rocky9 bin]# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
[root@rocky9 bin]# 
[root@rocky9 bin]# sudo grubby --update-kernel=ALL --args="transparent_hugepage=never"
[root@rocky9 bin]# 

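Then write the following settings to the file /etc/sysctl.d/mongodb.conf

# Raise the maximum number of memory map areas; this setup uses 262144 as the minimum for MongoDB
vm.max_map_count = 262144

# Set swappiness to 1 so physical memory is preferred and swap is rarely triggered, without disabling it entirely as a guard against OOM
vm.swappiness = 1

# System-wide file handle limit (optional; the systemd limit usually matters more)
fs.file-max = 2097152

sysctl -p /etc/sysctl.d/mongodb.conf

ulimit -n 64000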

Single-node parameters for MongoDB 8.0

storage:
  dbPath: /data/mongodb/data

  wiredTiger:
    engineConfig:
      cacheSizeGB: 16

    collectionConfig:
      blockCompressor: zstd

systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/log/mongod.log

net:
  bindIp: 0.0.0.0
  port: 27017
  maxIncomingConnections: 65536

processManagement:
  fork: true
  pidFilePath: /data/mongodb/run/mongod.pid

replication:
  replSetName: rs0
  oplogSizeMB: 20480

security:
  authorization: enabled
  keyFile: /data/mongodb/key/mongo.key
setParameter:
  flowControlTargetLagSeconds: 10
  ttlMonitorSleepSecs: 60
  cursorTimeoutMillis: 600000
  internalQueryFacetBufferSizeBytes: 268435456
  internalDocumentSourceGroupMaxMemoryBytes: 268435456

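The above is the single-node configuration file.

Only a single node is installed here. Today's goal is to test whether the core time series capability of MongoDB 8.0 is as strong as the documentation claims.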

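| MongoDB 7.0 | MongoDB 8.0 |
| --- | --- |
| Writes store uncompressed data first and compress it in the background, causing high I/O and cache pressure. | Writes are stored directly in a compressed columnar format, which reduces cache usage and improves write efficiency. |
| Queries process documents one at a time, so long-running aggregations perform poorly. | Block processing handles data in batches of blocks; operations such as $sort and $group can speed up by more than 200%. |
| Performance tends to fluctuate under high concurrency. | Throughput is more stable; cache usage drops by 10-20x and write performance improves by 2-3x. |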

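According to the documentation, time series data in MongoDB 8.0 is no longer stored as one BSON document after another. Instead, values of the same field are grouped together, time-contiguous data is packed, the data is bucketed automatically, and the result is compressed.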

For example, a series of temperature readings used to be stored like this:

{temperature:25.1,humidity:40}
{temperature:25.2,humidity:41}
{temperature:25.3,humidity:42}

Now it is stored like this:

{
  "temperature": [25.1,25.2,25.3],
  "humidity": [40,41,42]
}

The benefits: less disk space, fewer I/O operations, a higher cache hit rate, faster aggregation and analysis, and less data to scan.
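To make the row-to-column idea concrete, here is a minimal sketch in plain JavaScript. It is only an illustration of the layout change described above, not MongoDB's internal code; the helper names toColumns and averageOf are made up for this example, and it assumes every row carries the same fields.

```javascript
// Illustration only: hypothetical helpers, not MongoDB internals.

// Pivot row-style measurements into one array (column) per field.
// Assumes every row has the same set of fields.
function toColumns(rows) {
  const columns = {};
  for (const row of rows) {
    for (const [field, value] of Object.entries(row)) {
      if (!columns[field]) columns[field] = [];
      columns[field].push(value);
    }
  }
  return columns;
}

// Aggregate a single column without touching the other fields.
function averageOf(column) {
  return column.reduce((sum, v) => sum + v, 0) / column.length;
}

const rows = [
  { temperature: 25.1, humidity: 40 },
  { temperature: 25.2, humidity: 41 },
  { temperature: 25.3, humidity: 42 },
];

const columns = toColumns(rows);
console.log(columns.temperature);         // [ 25.1, 25.2, 25.3 ]
console.log(averageOf(columns.humidity)); // 41
```

An aggregation over humidity only has to walk one compact array, which is where the reduced scan size comes from.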

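To use a time series collection, you must define a timeField, and its values must be of the BSON date type, which in practice means ISODate. Every document needs a valid timestamp; otherwise it cannot be written to the time series collection.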
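A client-side pre-check along these lines can catch bad timestamps before they reach the server. This is a sketch only: hasValidTimeField is a hypothetical helper name, and the server performs its own validation regardless.

```javascript
// Hypothetical helper: true only when doc[timeField] is a real, valid Date.
function hasValidTimeField(doc, timeField) {
  const value = doc[timeField];
  return value instanceof Date && !Number.isNaN(value.getTime());
}

console.log(hasValidTimeField({ timestamp: new Date() }, "timestamp"));             // true
console.log(hasValidTimeField({ timestamp: "2026-05-08T10:00:00Z" }, "timestamp")); // false (a string, not a Date)
console.log(hasValidTimeField({ timestamp: new Date("oops") }, "timestamp"));       // false (Invalid Date)
```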

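We should also use a metaField to distinguish the data sources in the time series data (it is optional, but bucketing depends on it). The metaField setting cannot be added or changed after the collection has been created, so whoever defines it needs to know the business very well.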

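The reason is plain: columnar storage depends on classification, and every inserted document has to be placed into a columnar bucket.

Input order matters as well. Data should arrive in time order as far as possible rather than out of order. MongoDB can handle out-of-order writes, but they add overhead when placing documents into buckets.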

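One last point: as long as the requirements above are met, a MongoDB time series collection needs no extra configuration and uses columnar storage directly.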
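The interplay of metaField and time order can be sketched as follows. This is a deliberately simplified model, not the server's bucketing algorithm: it assigns each measurement to a bucket keyed by meta.sensorId plus a fixed one-hour window (mirroring the bucketMaxSpanSeconds: 3600 value that appears in the explain output later), and ignores real-world limits such as bucket size. The names bucketKey and buildBuckets are hypothetical.

```javascript
// Simplified model of bucketing; not the server's actual algorithm.
const BUCKET_SPAN_MS = 3600 * 1000; // assumption: fixed one-hour windows

// One bucket per (sensor, time window) pair.
function bucketKey(doc) {
  const windowStart =
    Math.floor(doc.timestamp.getTime() / BUCKET_SPAN_MS) * BUCKET_SPAN_MS;
  return `${doc.meta.sensorId}|${windowStart}`;
}

function buildBuckets(docs) {
  const buckets = new Map();
  for (const doc of docs) {
    const key = bucketKey(doc);
    let bucket = buckets.get(key);
    if (!bucket) {
      bucket = {
        meta: doc.meta,
        min: doc.timestamp,
        max: doc.timestamp,
        timestamp: [],
        temperature: [],
      };
      buckets.set(key, bucket);
    }
    // Track the min/max "label" of the bucket as measurements arrive.
    if (doc.timestamp < bucket.min) bucket.min = doc.timestamp;
    if (doc.timestamp > bucket.max) bucket.max = doc.timestamp;
    // Values are appended column by column.
    bucket.timestamp.push(doc.timestamp);
    bucket.temperature.push(doc.temperature);
  }
  return [...buckets.values()];
}

const base = new Date("2026-05-08T10:00:00Z").getTime();
const docs = [
  { timestamp: new Date(base), meta: { sensorId: "A1" }, temperature: 25.1 },
  { timestamp: new Date(base + 60 * 1000), meta: { sensorId: "A1" }, temperature: 25.2 },
  { timestamp: new Date(base + 120 * 1000), meta: { sensorId: "A2" }, temperature: 30.0 },
  { timestamp: new Date(base + BUCKET_SPAN_MS), meta: { sensorId: "A1" }, temperature: 25.3 },
];
console.log(buildBuckets(docs).length); // 3
```

With ordered input, each sensor keeps appending to its current bucket. With badly shuffled input, consecutive documents keep landing in different buckets, which is the extra bucketing overhead mentioned above.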

Example: create the database and insert data

admin> use iot_platform
switched to db iot_platform
iot_platform> 

iot_platform> for(let i=0;i<100000;i++){
| 
|  db.sensor_data.insertOne({
| 
|    timestamp:new Date(
|       Date.now()-(i*1000)
|    ),
| 
|    meta:{
|       sensorId:"A"+(i%100),
|       location:"factory_"+(i%5),
|       deviceType:"iot"
|    },
| 
|    temperature:20+Math.random()*15,
|    humidity:30+Math.random()*40,
|    pressure:1000+Math.random()*50,
|    ai_score:Math.random()
|  })
| }
{
  acknowledged: true,
  insertedId: ObjectId('69fdaca0d52229d0139f7f42')
}
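The loop above issues 100000 separate insertOne calls and walks backwards in time. As a hedged alternative, the sketch below generates the same kind of documents oldest-first and hands them over in batches. The helpers makeReading and insertInBatches are hypothetical names for this example; in mongosh the callback could wrap the existing insertMany collection method.

```javascript
// Hypothetical helpers; field names copied from the loop above.
function makeReading(i, now) {
  return {
    timestamp: new Date(now - i * 1000),
    meta: {
      sensorId: "A" + (i % 100),
      location: "factory_" + (i % 5),
      deviceType: "iot",
    },
    temperature: 20 + Math.random() * 15,
    humidity: 30 + Math.random() * 40,
    pressure: 1000 + Math.random() * 50,
    ai_score: Math.random(),
  };
}

// Calls insertBatch(docs) once per batch and returns the number of batches.
function insertInBatches(total, batchSize, insertBatch) {
  const now = Date.now();
  let batch = [];
  let batches = 0;
  // Walk from the oldest reading to the newest so timestamps ascend.
  for (let i = total - 1; i >= 0; i--) {
    batch.push(makeReading(i, now));
    if (batch.length === batchSize || i === 0) {
      insertBatch(batch);
      batches++;
      batch = [];
    }
  }
  return batches;
}

// Dry run without a database: just record the batch sizes.
const sizes = [];
insertInBatches(2500, 1000, (docs) => sizes.push(docs.length));
console.log(sizes); // [ 1000, 1000, 500 ]

// In mongosh this could be called as:
//   insertInBatches(100000, 1000, (docs) => db.sensor_data.insertMany(docs));
```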

iot_platform> db.sensor_data.countDocuments({
| 
|  timestamp:{
|    $gte:new Date(
|       Date.now()-7200*1000
|    )
|  }
| 
| })
1905
iot> db.sensor_data.explain("executionStats").aggregate([
|   {
|     $match: {
|       timestamp: { $gte: new Date(Date.now() - 7200 * 1000) }
|     }
|   },
|   { $count: "total_count" }
| ])
{
  explainVersion: '2',
  stages: [
    {
      '$cursor': {
        queryPlanner: {
          namespace: 'iot.system.buckets.sensor_data',
          parsedQuery: {
            '$and': [
              { _id: { '$gte': ObjectId('69fdad870000000000000000') } },
              {
                'control.max.timestamp': {
                  '$_internalExprGte': ISODate('2026-05-08T10:31:51.124Z')
                }
              },
              {
                'control.min.timestamp': {
                  '$_internalExprGte': ISODate('2026-05-08T09:31:51.124Z')
                }
              }
            ]
          },
          indexFilterSet: false,
          queryHash: '177847C6',
          planCacheShapeHash: '177847C6',
          planCacheKey: '15A033E5',
          optimizationTimeMillis: 0,
          cursorType: 'regular',
          maxIndexedOrSolutionsReached: false,
          maxIndexedAndSolutionsReached: false,
          maxScansToExplodeReached: false,
          prunedSimilarIndexes: false,
          winningPlan: {
            isCached: false,
            queryPlan: {
              stage: 'GROUP',
              planNodeId: 3,
              inputStage: {
                stage: 'UNPACK_TS_BUCKET',
                planNodeId: 2,
                include: [ 'timestamp' ],
                computedMetaProjFields: [],
                includeMeta: false,
                eventFilter: {
                  timestamp: { '$gte': ISODate('2026-05-08T10:31:51.124Z') }
                },
                wholeBucketFilter: {
                  'control.min.timestamp': { '$gte': ISODate('2026-05-08T10:31:51.124Z') }
                },
                inputStage: {
                  stage: 'CLUSTERED_IXSCAN',
                  planNodeId: 1,
                  filter: {
                    '$and': [
                      {
                        'control.max.timestamp': {
                          '$_internalExprGte': ISODate('2026-05-08T10:31:51.124Z')
                        }
                      },
                      {
                        'control.min.timestamp': {
                          '$_internalExprGte': ISODate('2026-05-08T09:31:51.124Z')
                        }
                      }
                    ]
                  },
                  nss: 'iot.system.buckets.sensor_data',
                  direction: 'forward',
                  minRecord: ObjectId('69fdad870000000000000000'),
                  maxRecord: ObjectId('ffffffffffffffffffffffff')
                }
              }
            },
            slotBasedPlan: {
              slots: '$$RESULT=s17 env: { s1 = RecordId(6469fdad870000000000000000), s2 = RecordId(64ffffffffffffffffffffffff), s9 = Nothing (nothing) }',
              stages: '[3] project [s17 = newBsonObj("_id", s14, "total_count", s16)] \n' +
                '[3] project [s16 = (convert ( s15, int32) ?: s15)] \n' +
                '[3] block_to_row blocks[s12, s13] row[s14, s15] s8 \n' +
                '[3] block_group bitset = s8 [s12] [s13 = valueBlockAggCount(s10)] [s13 = count()] [] [] spillSlots[s11] mergingExprs[sum(s11)] \n' +
                '[3] project [s12 = null] \n' +
                '[2] filter {!(valueBlockNone(s8, true))} \n' +
                '[2] project [s8 = valueBlockLogicalAnd(s5, cellFoldValues_F(valueBlockFillEmpty(valueBlockGteScalar(cellBlockGetFlatValuesBlock(s7), Date(1778236311124)), false), s7))] \n' +
                '[2] ts_bucket_to_cellblock s3 pathReqs[s6 = ProjectPath(Get(timestamp)/Id), s7 = FilterPath(Get(timestamp)/Traverse/Id)] bitmap = s5 \n' +
                '[1] filter {(\n' +
                '    let [\n' +
                '        l9.0 = traverseP(getField(s3, "control"), lambda(l10.0) { traverseP(getField(move(l10.0), "max"), lambda(l11.0) { getField(move(l11.0), "timestamp") }, 1) }, 1) \n' +
                '    ] \n' +
                '    in ((isArray(l9.0) ?: false) || (((l9.0 <=> Date(1778236311124)) >= 0) ?: ((exists(l9.0) && typeMatch(l9.0, -65)) >= true))) \n' +
                '&& \n' +
                '    let [\n' +
                '        l12.0 = traverseP(getField(s3, "control"), lambda(l13.0) { traverseP(getField(move(l13.0), "min"), lambda(l14.0) { getField(move(l14.0), "timestamp") }, 1) }, 1) \n' +
                '    ] \n' +
                '    in ((isArray(l12.0) ?: false) || (((l12.0 <=> Date(1778232711124)) >= 0) ?: ((exists(l12.0) && typeMatch(l12.0, -65)) >= true))) \n' +
                ')} \n' +
                '[1] scan s1 = minRecordId, s2 = maxRecordId [s3 = record, s4 = recordId] @"af54ab6f-96c5-49ba-9cdf-c1d05d46f2bb" forward '
            }
          },
          rejectedPlans: []
        },
        executionStats: {
          executionSuccess: true,
          nReturned: 1,
          executionTimeMillis: 9,
          totalKeysExamined: 0,
          totalDocsExamined: 5378,
          executionStages: {
            stage: 'project',
            planNodeId: 3,
            nReturned: 1,
            executionTimeMillisEstimate: 0,
            opens: 1,
            closes: 1,
            saveState: 2,
            restoreState: 1,
            isEOF: 1,
            projections: { '17': 'newBsonObj("_id", s14, "total_count", s16) ' },
            inputStage: {
              stage: 'project',
              planNodeId: 3,
              nReturned: 1,
              executionTimeMillisEstimate: 0,
              opens: 1,
              closes: 1,
              saveState: 2,
              restoreState: 1,
              isEOF: 1,
              projections: { '16': '(convert ( s15, int32) ?: s15) ' },
              inputStage: {
                stage: 'block_to_row',
                planNodeId: 3,
                nReturned: 1,
                executionTimeMillisEstimate: 0,
                opens: 1,
                closes: 1,
                saveState: 2,
                restoreState: 1,
                isEOF: 1,
                inputStage: {
                  stage: 'block_group',
                  planNodeId: 3,
                  nReturned: 1,
                  executionTimeMillisEstimate: 0,
                  opens: 1,
                  closes: 1,
                  saveState: 2,
                  restoreState: 1,
                  isEOF: 1,
                  groupBySlots: [ Long('12') ],
                  blockExpressions: { '13': 'valueBlockAggCount(s10) ' },
                  rowExpressions: { '13': 'count() ' },
                  initExprs: { '13': null },
                  blockDataInSlots: [],
                  accumulatorDataSlots: [],
                  mergingExprs: { '11': 'sum(s11) ' },
                  usedDisk: false,
                  spills: 0,
                  spilledBytes: 0,
                  spilledRecords: 0,
                  spilledDataStorageSize: 0,
                  blockAccumulations: 1762,
                  blockAccumulatorTotalCalls: 1762,
                  elementWiseAccumulations: 0,
                  peakTrackedMemBytes: 50,
                  inputStage: {
                    stage: 'project',
                    planNodeId: 3,
                    nReturned: 1762,
                    executionTimeMillisEstimate: 0,
                    opens: 1,
                    closes: 2,
                    saveState: 2,
                    restoreState: 1,
                    isEOF: 1,
                    projections: { '12': 'null ' },
                    inputStage: {
                      stage: 'filter',
                      planNodeId: 2,
                      nReturned: 1762,
                      executionTimeMillisEstimate: 0,
                      opens: 1,
                      closes: 2,
                      saveState: 2,
                      restoreState: 1,
                      isEOF: 1,
                      numTested: 1762,
                      filter: '!(valueBlockNone(s8, true)) ',
                      inputStage: {
                        stage: 'project',
                        planNodeId: 2,
                        nReturned: 1762,
                        executionTimeMillisEstimate: 0,
                        opens: 1,
                        closes: 1,
                        saveState: 2,
                        restoreState: 1,
                        isEOF: 1,
                        projections: {
                          '8': 'valueBlockLogicalAnd(s5, cellFoldValues_F(valueBlockFillEmpty(valueBlockGteScalar(cellBlockGetFlatValuesBlock(s7), Date(1778236311124)), false), s7)) '
                        },
                        inputStage: {
                          stage: 'ts_bucket_to_cellblock',
                          planNodeId: 2,
                          nReturned: 1762,
                          executionTimeMillisEstimate: 0,
                          opens: 1,
                          closes: 1,
                          saveState: 2,
                          restoreState: 1,
                          isEOF: 1,
                          numCellBlocksProduced: 3524,
                          numStorageBlocks: 1762,
                          numStorageBlocksDecompressed: 9,
                          inputStage: {
                            stage: 'filter',
                            planNodeId: 1,
                            nReturned: 1762,
                            executionTimeMillisEstimate: 0,
                            opens: 1,
                            closes: 1,
                            saveState: 2,
                            restoreState: 1,
                            isEOF: 1,
                            numTested: 5378,
                            filter: '(\n' +
                              '    let [\n' +
                              '        l9.0 = traverseP(getField(s3, "control"), lambda(l10.0) { traverseP(getField(move(l10.0), "max"), lambda(l11.0) { getField(move(l11.0), "timestamp") }, 1) }, 1) \n' +
                              '    ] \n' +
                              '    in ((isArray(l9.0) ?: false) || (((l9.0 <=> Date(1778236311124)) >= 0) ?: ((exists(l9.0) && typeMatch(l9.0, -65)) >= true))) \n' +
                              '&& \n' +
                              '    let [\n' +
                              '        l12.0 = traverseP(getField(s3, "control"), lambda(l13.0) { traverseP(getField(move(l13.0), "min"), lambda(l14.0) { getField(move(l14.0), "timestamp") }, 1) }, 1) \n' +
                              '    ] \n' +
                              '    in ((isArray(l12.0) ?: false) || (((l12.0 <=> Date(1778232711124)) >= 0) ?: ((exists(l12.0) && typeMatch(l12.0, -65)) >= true))) \n' +
                              ') ',
                            inputStage: {
                              stage: 'scan',
                              planNodeId: 1,
                              nReturned: 5378,
                              executionTimeMillisEstimate: 0,
                              opens: 1,
                              closes: 1,
                              saveState: 2,
                              restoreState: 1,
                              isEOF: 1,
                              numReads: 5378,
                              recordSlot: 3,
                              recordIdSlot: 4,
                              scanFieldNames: [],
                              scanFieldSlots: [],
                              minRecordIdSlot: 1,
                              maxRecordIdSlot: 2
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      },
      nReturned: Long('1'),
      executionTimeMillisEstimate: Long('1')
    },
    {
      '$project': { total_count: true, _id: false },
      nReturned: Long('1'),
      executionTimeMillisEstimate: Long('1')
    }
  ],
  queryShapeHash: 'B170065DFBDD48DD141D52B5A33EE73471B06CC9FC6FE4F85B8FC0EA150664BB',
  peakTrackedMemBytes: Long('50'),
  serverInfo: {
    host: 'rocky9',
    port: 27017,
    version: '8.3.1',
    gitVersion: '094e631246d5b5bee5bd5f20b12f882bc8e286ea'
  },
  serverParameters: {
    internalQueryFacetBufferSizeBytes: 268435456,
    internalDocumentSourceGroupMaxMemoryBytes: 268435456,
    internalQueryMaxBlockingSortMemoryUsageBytes: 104857600,
    internalDocumentSourceSetWindowFieldsMaxMemoryBytes: 104857600,
    internalQueryFacetMaxOutputDocSizeBytes: 104857600,
    internalLookupStageIntermediateDocumentMaxSizeBytes: 104857600,
    internalQueryProhibitBlockingMergeOnMongoS: 0,
    internalQueryMaxAddToSetBytes: 104857600,
    internalQueryFrameworkControl: 'trySbeRestricted',
    internalQueryPlannerIgnoreIndexWithCollationForRegex: 1
  },
command: {
    aggregate: 'system.buckets.sensor_data',
    pipeline: [
      {
        '$_internalUnpackBucket': {
          timeField: 'timestamp',
          metaField: 'meta',
          bucketMaxSpanSeconds: 3600,
          assumeNoMixedSchemaData: true,
          usesExtendedRange: false,
          fixedBuckets: false
        }
      },
      {
        '$match': { timestamp: { '$gte': ISODate('2026-05-08T10:31:51.124Z') } }
      },
      { '$count': 'total_count' }
    ],
    cursor: {},
    collation: { locale: 'simple' }
  },
  ok: 1
}
iot> 


Next, let's look at the execution plan for the following statement.

iot_platform> db.sensor_data.find({
| 
|  "meta.sensorId":"A1",
| 
|  timestamp:{
|    $gte:new Date(
|       Date.now()-3600*1000
|    )
|  }
| 
| }).explain("executionStats")
{
  explainVersion: '1',
  queryPlanner: {
    namespace: 'iot_platform.sensor_data',
    parsedQuery: {
      '$and': [
        { 'meta.sensorId': { '$eq': 'A1' } },
        { timestamp: { '$gte': ISODate('2026-05-08T09:57:05.197Z') } }
      ]
    },
    indexFilterSet: false,
    queryHash: 'EEF1B671',
    planCacheShapeHash: 'EEF1B671',
    planCacheKey: '889AD66C',
    optimizationTimeMillis: 0,
    maxIndexedOrSolutionsReached: false,
    maxIndexedAndSolutionsReached: false,
    maxScansToExplodeReached: false,
    prunedSimilarIndexes: false,
    winningPlan: {
      isCached: false,
      stage: 'COLLSCAN',
      filter: {
        '$and': [
          { 'meta.sensorId': { '$eq': 'A1' } },
          {
            timestamp: { '$gte': ISODate('2026-05-08T09:57:05.197Z') }
          }
        ]
      },
      nss: 'iot_platform.sensor_data',
      direction: 'forward'
    },
    rejectedPlans: []
  },
  executionStats: {
    executionSuccess: true,
    nReturned: 0,
    executionTimeMillis: 45,
    totalKeysExamined: 0,
    totalDocsExamined: 100000,
    executionStages: {
      isCached: false,
      stage: 'COLLSCAN',
      filter: {
        '$and': [
          { 'meta.sensorId': { '$eq': 'A1' } },
          {
            timestamp: { '$gte': ISODate('2026-05-08T09:57:05.197Z') }
          }
        ]
      },
      nReturned: 0,
      executionTimeMillisEstimate: 34,
      works: 100001,
      advanced: 0,
      needTime: 100000,
      needYield: 0,
      saveState: 2,
      restoreState: 2,
      isEOF: 1,
      nss: 'iot_platform.sensor_data',
      direction: 'forward',
      docsExamined: 100000
    }
  },
  queryShapeHash: '2EC1135451188ADCBB5690DB309EFAFABE58E336BC6C5CAEF03C0E7EC65BE9BB',
command: {
    find: 'sensor_data',
    filter: {
      'meta.sensorId': 'A1',
      timestamp: { '$gte': ISODate('2026-05-08T09:57:05.197Z') }
    },
    '$db': 'iot_platform'
  },
  serverInfo: {
    host: 'rocky9',
    port: 27017,
    version: '8.3.1',
    gitVersion: '094e631246d5b5bee5bd5f20b12f882bc8e286ea'
  },
  serverParameters: {
    internalQueryFacetBufferSizeBytes: 268435456,
    internalDocumentSourceGroupMaxMemoryBytes: 268435456,
    internalQueryMaxBlockingSortMemoryUsageBytes: 104857600,
    internalDocumentSourceSetWindowFieldsMaxMemoryBytes: 104857600,
    internalQueryFacetMaxOutputDocSizeBytes: 104857600,
    internalLookupStageIntermediateDocumentMaxSizeBytes: 104857600,
    internalQueryProhibitBlockingMergeOnMongoS: 0,
    internalQueryMaxAddToSetBytes: 104857600,
    internalQueryFrameworkControl: 'trySbeRestricted',
    internalQueryPlannerIgnoreIndexWithCollationForRegex: 1
  },
  ok: 1
}

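First, apart from _id, which is the primary key and has an index, we have not created any other index on this collection. A MongoDB time series collection stores data "packed" by time range. When querying, it first checks each pack's "label" (the maximum and minimum time, control.max.timestamp and control.min.timestamp). If the label does not satisfy the predicate, the whole pack is skipped, which greatly reduces the amount of data that has to be read. In the aggregation output above (run in the iot database against iot.system.buckets.sensor_data), the scan read 5378 buckets, the control filter kept 1762 of them, and only 9 storage blocks were decompressed (numStorageBlocksDecompressed: 9). The find() plan, by contrast, shows a COLLSCAN over all 100000 documents of iot_platform.sensor_data with explainVersion '1', which suggests that the copy in iot_platform was created as a regular collection rather than a time series collection.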
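The skip-by-label logic can be sketched in plain JavaScript. This is a simplified model of what the eventFilter and wholeBucketFilter entries in the plan express, not the server's implementation: classifyBucket, countSince, and the bucket layout (uncompressed arrays under data) are assumptions made for the example.

```javascript
// Decide what to do with one bucket for the predicate "timestamp >= since".
function classifyBucket(bucket, since) {
  if (bucket.control.max.timestamp < since) return "skip";    // nothing can match
  if (bucket.control.min.timestamp >= since) return "whole";  // everything matches
  return "unpack";                                            // must check each value
}

// Count matching measurements, unpacking only the buckets that need it.
function countSince(buckets, since) {
  let count = 0;
  let unpacked = 0;
  for (const bucket of buckets) {
    const verdict = classifyBucket(bucket, since);
    if (verdict === "skip") continue;
    if (verdict === "whole") {
      count += bucket.data.timestamp.length;
      continue;
    }
    unpacked += 1;
    count += bucket.data.timestamp.filter((t) => t >= since).length;
  }
  return { count, unpacked };
}

// Sample data: three buckets with sorted timestamps.
const at = (hhmm) => new Date(`2026-05-08T${hhmm}:00Z`);
const makeBucket = (times) => ({
  control: {
    min: { timestamp: times[0] },
    max: { timestamp: times[times.length - 1] },
  },
  data: { timestamp: times },
});

const sampleBuckets = [
  makeBucket([at("08:00"), at("08:59")]),
  makeBucket([at("10:00"), at("10:20"), at("10:40"), at("10:59")]),
  makeBucket([at("11:00"), at("11:30")]),
];
console.log(countSince(sampleBuckets, at("10:30"))); // { count: 4, unpacked: 1 }
```

Only the bucket that straddles the cutoff has to be opened, which matches the pattern in the output above where 1762 buckets pass the filter but only 9 storage blocks are decompressed.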

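The aggregation plan contains UNPACK_TS_BUCKET, which shows that the data is stored in compressed columnar form and has to be decompressed on read and restored to individual records so that they can be counted exactly. Finally, MongoDB relies on the built-in ordering of the time series collection (the CLUSTERED_IXSCAN stage over the bucket _id, bounded by minRecord and maxRecord): the physical data in columnar storage is ordered, so reads are very efficient.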
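When an explain document is as long as the one above, a small helper makes the stage chain easier to see. The sketch below walks the queryPlan tree by following inputStage; listStages is a hypothetical helper, the sample object is trimmed from the output above, and plans that branch through an inputStages array are not handled.

```javascript
// Hypothetical helper: collect stage names from the top of the plan down.
// Assumes a linear chain linked by `inputStage`.
function listStages(plan) {
  const stages = [];
  for (let node = plan; node; node = node.inputStage) {
    stages.push(node.stage);
  }
  return stages;
}

// Trimmed copy of winningPlan.queryPlan from the explain output above.
const queryPlan = {
  stage: "GROUP",
  inputStage: {
    stage: "UNPACK_TS_BUCKET",
    inputStage: { stage: "CLUSTERED_IXSCAN" },
  },
};

const stages = listStages(queryPlan);
console.log(stages.join(" -> ")); // GROUP -> UNPACK_TS_BUCKET -> CLUSTERED_IXSCAN
console.log(stages.includes("UNPACK_TS_BUCKET")); // true
```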

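Optimization follows a standard pattern too. In practice, sensor readings are looked up by each sensor's own ID, so a secondary index can be created for that. Note that the indexed field should be the one in the metadata that holds each sensor's name or identifier, which is meta.sensorId in the data inserted above.

db.sensor_data.createIndex({ "meta.sensorId": 1, "timestamp": 1 })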


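So MongoDB 8 really is well suited to processing large volumes of time series data. Projects with this kind of need can use MongoDB 8.0 for time series storage and aggregation.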

Originally published on 2026-06-11 on the AustinDatabases WeChat official account.
