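A new project may make heavy use of MongoDB, and I am not yet familiar with the latest 8.0 line; I am especially interested in its columnar processing. We need to keep up with this, because if developers or architects ask and all you can offer is old MongoDB knowledge, that is a problem. With AI everywhere, architects will ask the AI too, so the competition is brutal.
With operating systems moving forward as well, I also want to see how well Linux 9 and MongoDB 8 fit together, so this time I am installing a fresh MongoDB 8.3 on Linux 9.
Before getting to the installation, one core point needs to be settled: SBE (slot-based execution). SBE already existed in 7.0, but 8.0 is the release where it really takes off, and many queries now go to the SBE query engine by default. This changes how MongoDB behaves at run time: performance used to be decided mainly by I/O, whereas now CPU and cache efficiency decide it.
Also note that MongoDB is no longer the MongoDB it used to be. It is now a database product that combines column store, vector search, time series, aggregation, and change streams.
So 8.0 really is a MongoDB version that is worth a DBA's time to learn.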
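As before, on Linux we disable transparent huge pages (THP), set vm.max_map_count, lower swappiness, and raise the file handle limit. Be aware that the MongoDB production notes changed the THP advice for 8.0: with the upgraded TCMalloc they recommend leaving THP enabled, so check the documentation for your exact version before applying that step.
Let's go over the relevant commands again.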
[root@rocky9 bin]# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
[root@rocky9 bin]#
[root@rocky9 bin]# sudo grubby --update-kernel=ALL --args="transparent_hugepage=never"
[root@rocky9 bin]#
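Then write the following settings to /etc/sysctl.d/mongodb.conf (files in this directory need the .conf suffix to be loaded automatically at boot):
# Raise the maximum number of memory map areas; MongoDB recommends at least 262144
vm.max_map_count = 262144
# Set swappiness to 1: prefer physical memory and avoid swapping, without disabling swap entirely, to guard against OOM
vm.swappiness = 1
# Raise the system-wide file handle limit (optional; the systemd limit usually matters more)
fs.file-max = 2097152
Apply the settings and raise the open-file limit for the current shell:
sysctl -p /etc/sysctl.d/mongodb.conf
ulimit -n 64000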
MongoDB 8.0 single-node parameters
storage:
  dbPath: /data/mongodb/data
  wiredTiger:
    engineConfig:
      cacheSizeGB: 16
    collectionConfig:
      blockCompressor: zstd
systemLog:
  destination: file
  logAppend: true
  path: /data/mongodb/log/mongod.log
net:
  bindIp: 0.0.0.0
  port: 27017
  maxIncomingConnections: 65536
processManagement:
  fork: true
  pidFilePath: /data/mongodb/run/mongod.pid
replication:
  replSetName: rs0
  oplogSizeMB: 20480
security:
  authorization: enabled
  keyFile: /data/mongodb/key/mongo.key
setParameter:
  flowControlTargetLagSeconds: 10
  ttlMonitorSleepSecs: 60
  cursorTimeoutMillis: 600000
  internalQueryFacetBufferSizeBytes: 268435456
  internalDocumentSourceGroupMaxMemoryBytes: 268435456
The above is the single-node configuration file.
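Here I install only a single-node instance. Today's goal is to test whether the core time-series capability of MongoDB 8.0 is as strong as the documentation claims.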
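| MongoDB 7.0 | MongoDB 8.0 |
|---|---|
| Writes store uncompressed data first and compress it later in the background, causing high I/O and cache pressure. | Writes store data directly in a compressed columnar format, reducing cache usage and improving write efficiency. |
| Queries process documents one at a time, so long-running aggregations perform poorly. | Block processing handles data in batches, one block at a time; operations such as $sort and $group can be **200%+** faster. |
| Performance tends to fluctuate under high concurrency. | More stable throughput, cache usage reduced by 10-20x, and write performance improved by 2-3x. |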
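According to the documentation, MongoDB 8.0 no longer stores time-series data as sequential, row-by-row BSON documents. Instead, it groups values of the same field together, packs time-contiguous data, buckets it automatically, and compresses the historical data.
For example, a set of temperature readings used to be stored like this: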
{temperature:25.1,humidity:40}
{temperature:25.2,humidity:41}
{temperature:25.3,humidity:42}
whereas it is now stored like this:
{
"temperature": [25.1,25.2,25.3],
"humidity": [40,41,42]
}
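The row-to-column idea above can be sketched in a few lines of JavaScript. This is only an illustration of the concept, not MongoDB's actual bucket format; the function name toColumns is made up for this example.

```javascript
// Illustrative only: mimics the row-to-column idea, not MongoDB's real on-disk bucket layout.
// toColumns is a hypothetical helper name.
function toColumns(rows) {
  const columns = {};
  for (const row of rows) {
    for (const [field, value] of Object.entries(row)) {
      // Create the column the first time a field is seen, then append in arrival order.
      if (!(field in columns)) {
        columns[field] = [];
      }
      columns[field].push(value);
    }
  }
  return columns;
}

const rows = [
  { temperature: 25.1, humidity: 40 },
  { temperature: 25.2, humidity: 41 },
  { temperature: 25.3, humidity: 42 },
];

// Prints one array per field: temperature -> [25.1, 25.2, 25.3], humidity -> [40, 41, 42]
console.log(toColumns(rows));
```

With all values of one field sitting next to each other, similar numbers compress well and an aggregation over a single field only has to touch that one array.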
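The benefits are less disk space, fewer I/O operations, a higher cache hit rate, faster aggregation, and smaller data scans.
To use a time-series collection you must define a timeField, and its values must be of the BSON date type, which in practice means ISODate. Every document needs a valid timestamp; otherwise it cannot be written to the time-series collection.
We should also use a metaField to distinguish the data sources. Once the collection has been created, the metaField definition cannot be added to or changed, so whoever defines it needs to know the business very well.
The purpose is plain: columnar storage depends on classification, and every inserted document has to land in the right columnar bucket.
Input order matters too: insert data in time order as far as possible rather than out of order. MongoDB can handle out-of-order writes, but they increase the cost of placing documents into buckets.
Finally, as long as a MongoDB time-series collection meets the requirements above, no extra configuration is needed; the data is stored in columnar form directly.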
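The collection definition itself is not shown in this article. The explain output further down reports timeField: 'timestamp', metaField: 'meta' and bucketMaxSpanSeconds: 3600, which is consistent with a definition like the sketch below. The granularity value "seconds" and the helper name timeSeriesOptions are assumptions made for illustration. Keep in mind that a collection created implicitly by the first insertOne is a regular collection, so the time-series collection has to be created explicitly before any data is written.

```javascript
// Hypothetical helper: builds the options object for a time-series collection.
function timeSeriesOptions(timeField, metaField, granularity) {
  return { timeseries: { timeField, metaField, granularity } };
}

// Assumed values: field names taken from the explain output in this article,
// granularity "seconds" is a guess that matches bucketMaxSpanSeconds: 3600.
const options = timeSeriesOptions("timestamp", "meta", "seconds");
console.log(JSON.stringify(options));

// `db` only exists inside mongosh; the guard lets this file also run under plain Node.js.
if (typeof db !== "undefined") {
  db.createCollection("sensor_data", options);
}
```

Run inside mongosh after `use iot_platform` (or whichever database is intended), this creates sensor_data as a time-series collection; subsequent inserts then go into buckets rather than into ordinary documents.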
Example: create the database and insert data
admin> use iot_platform
switched to db iot_platform
iot_platform>
iot_platform> for(let i=0;i<100000;i++){
|
| db.sensor_data.insertOne({
|
| timestamp:new Date(
| Date.now()-(i*1000)
| ),
|
| meta:{
| sensorId:"A"+(i%100),
| location:"factory_"+(i%5),
| deviceType:"iot"
| },
|
| temperature:20+Math.random()*15,
| humidity:30+Math.random()*40,
| pressure:1000+Math.random()*50,
| ai_score:Math.random()
| })
| }
{
acknowledged: true,
insertedId: ObjectId('69fdaca0d52229d0139f7f42')
}
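The loop above issues 100000 separate insertOne calls, and because i counts up while the timestamp moves backwards, it writes the newest reading first. A variant that generates the same kind of documents oldest-first and sends them in batches with insertMany might look like the following sketch. The helper name buildReadings and the batch size of 1000 are assumptions, not values from the original test.

```javascript
// Hypothetical helper: builds `count` readings, one per second, ending at nowMs.
// Documents are returned oldest-first so they arrive in time order.
function buildReadings(count, nowMs) {
  const docs = [];
  for (let i = count - 1; i >= 0; i--) {
    docs.push({
      timestamp: new Date(nowMs - i * 1000),
      meta: {
        sensorId: "A" + (i % 100),
        location: "factory_" + (i % 5),
        deviceType: "iot",
      },
      temperature: 20 + Math.random() * 15,
      humidity: 30 + Math.random() * 40,
      pressure: 1000 + Math.random() * 50,
      ai_score: Math.random(),
    });
  }
  return docs;
}

const BATCH_SIZE = 1000; // assumed batch size, tune for your environment

// `db` only exists inside mongosh; under plain Node.js this part is skipped.
if (typeof db !== "undefined") {
  const docs = buildReadings(100000, Date.now());
  for (let start = 0; start < docs.length; start += BATCH_SIZE) {
    db.sensor_data.insertMany(docs.slice(start, start + BATCH_SIZE));
  }
}
```

Batching cuts the number of round trips from 100000 to 100, and time-ordered input follows the ordering advice given earlier.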
iot_platform> db.sensor_data.countDocuments({
|
| timestamp:{
| $gte:new Date(
| Date.now()-7200*1000
| )
| }
|
| })
1905
iot> db.sensor_data.explain("executionStats").aggregate([
| {
| $match: {
| timestamp: { $gte: new Date(Date.now() - 7200 * 1000) }
| }
| },
| { $count: "total_count" }
| ])
{
explainVersion: '2',
stages: [
{
'$cursor': {
queryPlanner: {
namespace: 'iot.system.buckets.sensor_data',
parsedQuery: {
'$and': [
{ _id: { '$gte': ObjectId('69fdad870000000000000000') } },
{
'control.max.timestamp': {
'$_internalExprGte': ISODate('2026-05-08T10:31:51.124Z')
}
},
{
'control.min.timestamp': {
'$_internalExprGte': ISODate('2026-05-08T09:31:51.124Z')
}
}
]
},
indexFilterSet: false,
queryHash: '177847C6',
planCacheShapeHash: '177847C6',
planCacheKey: '15A033E5',
optimizationTimeMillis: 0,
cursorType: 'regular',
maxIndexedOrSolutionsReached: false,
maxIndexedAndSolutionsReached: false,
maxScansToExplodeReached: false,
prunedSimilarIndexes: false,
winningPlan: {
isCached: false,
queryPlan: {
stage: 'GROUP',
planNodeId: 3,
inputStage: {
stage: 'UNPACK_TS_BUCKET',
planNodeId: 2,
include: [ 'timestamp' ],
computedMetaProjFields: [],
includeMeta: false,
eventFilter: {
timestamp: { '$gte': ISODate('2026-05-08T10:31:51.124Z') }
},
wholeBucketFilter: {
'control.min.timestamp': { '$gte': ISODate('2026-05-08T10:31:51.124Z') }
},
inputStage: {
stage: 'CLUSTERED_IXSCAN',
planNodeId: 1,
filter: {
'$and': [
{
'control.max.timestamp': {
'$_internalExprGte': ISODate('2026-05-08T10:31:51.124Z')
}
},
{
'control.min.timestamp': {
'$_internalExprGte': ISODate('2026-05-08T09:31:51.124Z')
}
}
]
},
nss: 'iot.system.buckets.sensor_data',
direction: 'forward',
minRecord: ObjectId('69fdad870000000000000000'),
maxRecord: ObjectId('ffffffffffffffffffffffff')
}
}
},
slotBasedPlan: {
slots: '$$RESULT=s17 env: { s1 = RecordId(6469fdad870000000000000000), s2 = RecordId(64ffffffffffffffffffffffff), s9 = Nothing (nothing) }',
stages: '[3] project [s17 = newBsonObj("_id", s14, "total_count", s16)] \n' +
'[3] project [s16 = (convert ( s15, int32) ?: s15)] \n' +
'[3] block_to_row blocks[s12, s13] row[s14, s15] s8 \n' +
'[3] block_group bitset = s8 [s12] [s13 = valueBlockAggCount(s10)] [s13 = count()] [] [] spillSlots[s11] mergingExprs[sum(s11)] \n' +
'[3] project [s12 = null] \n' +
'[2] filter {!(valueBlockNone(s8, true))} \n' +
'[2] project [s8 = valueBlockLogicalAnd(s5, cellFoldValues_F(valueBlockFillEmpty(valueBlockGteScalar(cellBlockGetFlatValuesBlock(s7), Date(1778236311124)), false), s7))] \n' +
'[2] ts_bucket_to_cellblock s3 pathReqs[s6 = ProjectPath(Get(timestamp)/Id), s7 = FilterPath(Get(timestamp)/Traverse/Id)] bitmap = s5 \n' +
'[1] filter {(\n' +
' let [\n' +
' l9.0 = traverseP(getField(s3, "control"), lambda(l10.0) { traverseP(getField(move(l10.0), "max"), lambda(l11.0) { getField(move(l11.0), "timestamp") }, 1) }, 1) \n' +
' ] \n' +
' in ((isArray(l9.0) ?: false) || (((l9.0 <=> Date(1778236311124)) >= 0) ?: ((exists(l9.0) && typeMatch(l9.0, -65)) >= true))) \n' +
'&& \n' +
' let [\n' +
' l12.0 = traverseP(getField(s3, "control"), lambda(l13.0) { traverseP(getField(move(l13.0), "min"), lambda(l14.0) { getField(move(l14.0), "timestamp") }, 1) }, 1) \n' +
' ] \n' +
' in ((isArray(l12.0) ?: false) || (((l12.0 <=> Date(1778232711124)) >= 0) ?: ((exists(l12.0) && typeMatch(l12.0, -65)) >= true))) \n' +
')} \n' +
'[1] scan s1 = minRecordId, s2 = maxRecordId [s3 = record, s4 = recordId] @"af54ab6f-96c5-49ba-9cdf-c1d05d46f2bb" forward '
}
},
rejectedPlans: []
},
executionStats: {
executionSuccess: true,
nReturned: 1,
executionTimeMillis: 9,
totalKeysExamined: 0,
totalDocsExamined: 5378,
executionStages: {
stage: 'project',
planNodeId: 3,
nReturned: 1,
executionTimeMillisEstimate: 0,
opens: 1,
closes: 1,
saveState: 2,
restoreState: 1,
isEOF: 1,
projections: { '17': 'newBsonObj("_id", s14, "total_count", s16) ' },
inputStage: {
stage: 'project',
planNodeId: 3,
nReturned: 1,
executionTimeMillisEstimate: 0,
opens: 1,
closes: 1,
saveState: 2,
restoreState: 1,
isEOF: 1,
projections: { '16': '(convert ( s15, int32) ?: s15) ' },
inputStage: {
stage: 'block_to_row',
planNodeId: 3,
nReturned: 1,
executionTimeMillisEstimate: 0,
opens: 1,
closes: 1,
saveState: 2,
restoreState: 1,
isEOF: 1,
inputStage: {
stage: 'block_group',
planNodeId: 3,
nReturned: 1,
executionTimeMillisEstimate: 0,
opens: 1,
closes: 1,
saveState: 2,
restoreState: 1,
isEOF: 1,
groupBySlots: [ Long('12') ],
blockExpressions: { '13': 'valueBlockAggCount(s10) ' },
rowExpressions: { '13': 'count() ' },
initExprs: { '13': null },
blockDataInSlots: [],
accumulatorDataSlots: [],
mergingExprs: { '11': 'sum(s11) ' },
usedDisk: false,
spills: 0,
spilledBytes: 0,
spilledRecords: 0,
spilledDataStorageSize: 0,
blockAccumulations: 1762,
blockAccumulatorTotalCalls: 1762,
elementWiseAccumulations: 0,
peakTrackedMemBytes: 50,
inputStage: {
stage: 'project',
planNodeId: 3,
nReturned: 1762,
executionTimeMillisEstimate: 0,
opens: 1,
closes: 2,
saveState: 2,
restoreState: 1,
isEOF: 1,
projections: { '12': 'null ' },
inputStage: {
stage: 'filter',
planNodeId: 2,
nReturned: 1762,
executionTimeMillisEstimate: 0,
opens: 1,
closes: 2,
saveState: 2,
restoreState: 1,
isEOF: 1,
numTested: 1762,
filter: '!(valueBlockNone(s8, true)) ',
inputStage: {
stage: 'project',
planNodeId: 2,
nReturned: 1762,
executionTimeMillisEstimate: 0,
opens: 1,
closes: 1,
saveState: 2,
restoreState: 1,
isEOF: 1,
projections: {
'8': 'valueBlockLogicalAnd(s5, cellFoldValues_F(valueBlockFillEmpty(valueBlockGteScalar(cellBlockGetFlatValuesBlock(s7), Date(1778236311124)), false), s7)) '
},
inputStage: {
stage: 'ts_bucket_to_cellblock',
planNodeId: 2,
nReturned: 1762,
executionTimeMillisEstimate: 0,
opens: 1,
closes: 1,
saveState: 2,
restoreState: 1,
isEOF: 1,
numCellBlocksProduced: 3524,
numStorageBlocks: 1762,
numStorageBlocksDecompressed: 9,
inputStage: {
stage: 'filter',
planNodeId: 1,
nReturned: 1762,
executionTimeMillisEstimate: 0,
opens: 1,
closes: 1,
saveState: 2,
restoreState: 1,
isEOF: 1,
numTested: 5378,
filter: '(\n' +
' let [\n' +
' l9.0 = traverseP(getField(s3, "control"), lambda(l10.0) { traverseP(getField(move(l10.0), "max"), lambda(l11.0) { getField(move(l11.0), "timestamp") }, 1) }, 1) \n' +
' ] \n' +
' in ((isArray(l9.0) ?: false) || (((l9.0 <=> Date(1778236311124)) >= 0) ?: ((exists(l9.0) && typeMatch(l9.0, -65)) >= true))) \n' +
'&& \n' +
' let [\n' +
' l12.0 = traverseP(getField(s3, "control"), lambda(l13.0) { traverseP(getField(move(l13.0), "min"), lambda(l14.0) { getField(move(l14.0), "timestamp") }, 1) }, 1) \n' +
' ] \n' +
' in ((isArray(l12.0) ?: false) || (((l12.0 <=> Date(1778232711124)) >= 0) ?: ((exists(l12.0) && typeMatch(l12.0, -65)) >= true))) \n' +
') ',
inputStage: {
stage: 'scan',
planNodeId: 1,
nReturned: 5378,
executionTimeMillisEstimate: 0,
opens: 1,
closes: 1,
saveState: 2,
restoreState: 1,
isEOF: 1,
numReads: 5378,
recordSlot: 3,
recordIdSlot: 4,
scanFieldNames: [],
scanFieldSlots: [],
minRecordIdSlot: 1,
maxRecordIdSlot: 2
}
}
}
}
}
}
}
}
}
}
}
},
nReturned: Long('1'),
executionTimeMillisEstimate: Long('1')
},
{
'$project': { total_count: true, _id: false },
nReturned: Long('1'),
executionTimeMillisEstimate: Long('1')
}
],
queryShapeHash: 'B170065DFBDD48DD141D52B5A33EE73471B06CC9FC6FE4F85B8FC0EA150664BB',
peakTrackedMemBytes: Long('50'),
serverInfo: {
host: 'rocky9',
port: 27017,
version: '8.3.1',
gitVersion: '094e631246d5b5bee5bd5f20b12f882bc8e286ea'
},
serverParameters: {
internalQueryFacetBufferSizeBytes: 268435456,
internalDocumentSourceGroupMaxMemoryBytes: 268435456,
internalQueryMaxBlockingSortMemoryUsageBytes: 104857600,
internalDocumentSourceSetWindowFieldsMaxMemoryBytes: 104857600,
internalQueryFacetMaxOutputDocSizeBytes: 104857600,
internalLookupStageIntermediateDocumentMaxSizeBytes: 104857600,
internalQueryProhibitBlockingMergeOnMongoS: 0,
internalQueryMaxAddToSetBytes: 104857600,
internalQueryFrameworkControl: 'trySbeRestricted',
internalQueryPlannerIgnoreIndexWithCollationForRegex: 1
},
command: {
aggregate: 'system.buckets.sensor_data',
pipeline: [
{
'$_internalUnpackBucket': {
timeField: 'timestamp',
metaField: 'meta',
bucketMaxSpanSeconds: 3600,
assumeNoMixedSchemaData: true,
usesExtendedRange: false,
fixedBuckets: false
}
},
{
'$match': { timestamp: { '$gte': ISODate('2026-05-08T10:31:51.124Z') } }
},
{ '$count': 'total_count' }
],
cursor: {},
collation: { locale: 'simple' }
},
ok: 1
}
iot>
iot_platform>
Now let's look at the execution plan of the following statement.
iot_platform> db.sensor_data.find({
|
| "meta.sensorId":"A1",
|
| timestamp:{
| $gte:new Date(
| Date.now()-3600*1000
| )
| }
|
| }).explain("executionStats")
{
explainVersion: '1',
queryPlanner: {
namespace: 'iot_platform.sensor_data',
parsedQuery: {
'$and': [
{ 'meta.sensorId': { '$eq': 'A1' } },
{ timestamp: { '$gte': ISODate('2026-05-08T09:57:05.197Z') } }
]
},
indexFilterSet: false,
queryHash: 'EEF1B671',
planCacheShapeHash: 'EEF1B671',
planCacheKey: '889AD66C',
optimizationTimeMillis: 0,
maxIndexedOrSolutionsReached: false,
maxIndexedAndSolutionsReached: false,
maxScansToExplodeReached: false,
prunedSimilarIndexes: false,
winningPlan: {
isCached: false,
stage: 'COLLSCAN',
filter: {
'$and': [
{ 'meta.sensorId': { '$eq': 'A1' } },
{
timestamp: { '$gte': ISODate('2026-05-08T09:57:05.197Z') }
}
]
},
nss: 'iot_platform.sensor_data',
direction: 'forward'
},
rejectedPlans: []
},
executionStats: {
executionSuccess: true,
nReturned: 0,
executionTimeMillis: 45,
totalKeysExamined: 0,
totalDocsExamined: 100000,
executionStages: {
isCached: false,
stage: 'COLLSCAN',
filter: {
'$and': [
{ 'meta.sensorId': { '$eq': 'A1' } },
{
timestamp: { '$gte': ISODate('2026-05-08T09:57:05.197Z') }
}
]
},
nReturned: 0,
executionTimeMillisEstimate: 34,
works: 100001,
advanced: 0,
needTime: 100000,
needYield: 0,
saveState: 2,
restoreState: 2,
isEOF: 1,
nss: 'iot_platform.sensor_data',
direction: 'forward',
docsExamined: 100000
}
},
queryShapeHash: '2EC1135451188ADCBB5690DB309EFAFABE58E336BC6C5CAEF03C0E7EC65BE9BB',
command: {
find: 'sensor_data',
filter: {
'meta.sensorId': 'A1',
timestamp: { '$gte': ISODate('2026-05-08T09:57:05.197Z') }
},
'$db': 'iot_platform'
},
serverInfo: {
host: 'rocky9',
port: 27017,
version: '8.3.1',
gitVersion: '094e631246d5b5bee5bd5f20b12f882bc8e286ea'
},
serverParameters: {
internalQueryFacetBufferSizeBytes: 268435456,
internalDocumentSourceGroupMaxMemoryBytes: 268435456,
internalQueryMaxBlockingSortMemoryUsageBytes: 104857600,
internalDocumentSourceSetWindowFieldsMaxMemoryBytes: 104857600,
internalQueryFacetMaxOutputDocSizeBytes: 104857600,
internalLookupStageIntermediateDocumentMaxSizeBytes: 104857600,
internalQueryProhibitBlockingMergeOnMongoS: 0,
internalQueryMaxAddToSetBytes: 104857600,
internalQueryFrameworkControl: 'trySbeRestricted',
internalQueryPlannerIgnoreIndexWithCollationForRegex: 1
},
ok: 1
}
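First, apart from the index on the primary key _id, we have not created any other index on this collection. A MongoDB time-series collection stores data "packed" by time range. At query time it first checks each pack's "label" (its maximum and minimum time); if the label does not satisfy the condition, the whole pack is skipped, which greatly reduces the amount of data that has to be read.
The aggregation plan above contains UNPACK_TS_BUCKET: the data is stored in columnar form and compressed, so at read time it has to be decompressed and restored into individual records in order to count precisely. In that plan the scan read 5378 buckets, the control.min/control.max filter kept 1762 of them, and only 9 storage blocks were decompressed. Finally, MongoDB uses the default hidden clustered index of the time-series collection (CLUSTERED_IXSCAN in the plan); the physical data in the columnar store is ordered, so reads are very efficient.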
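The skip logic can be imitated with a small simulation. This is a simplified model of the wholeBucketFilter and eventFilter idea that appears in the plan, not the server's implementation: the bucket shape (a control block plus a plain timestamps array of millisecond values) and the function name countSince are made up for this sketch.

```javascript
// Simplified model: each bucket carries control.min/max timestamps (ms) and its event timestamps.
// countSince is a hypothetical helper that counts events with timestamp >= sinceMs.
function countSince(buckets, sinceMs) {
  let total = 0;
  let skipped = 0;
  for (const bucket of buckets) {
    const { min, max } = bucket.control;
    if (max.timestamp < sinceMs) {
      // Even the newest event is too old: skip the whole bucket without opening it.
      skipped++;
    } else if (min.timestamp >= sinceMs) {
      // Even the oldest event qualifies: the whole bucket matches.
      total += bucket.timestamps.length;
    } else {
      // The bucket straddles the boundary: individual events must be checked.
      total += bucket.timestamps.filter((t) => t >= sinceMs).length;
    }
  }
  return { total, skipped };
}

const buckets = [
  { control: { min: { timestamp: 100 }, max: { timestamp: 300 } }, timestamps: [100, 200, 300] },
  { control: { min: { timestamp: 400 }, max: { timestamp: 600 } }, timestamps: [400, 500, 600] },
  { control: { min: { timestamp: 700 }, max: { timestamp: 900 } }, timestamps: [700, 800, 900] },
];

// First bucket skipped, second checked event by event (2 hits), third taken whole (3 hits).
console.log(countSince(buckets, 500)); // { total: 5, skipped: 1 }
```

Only the bucket that straddles the time boundary needs per-event work, which is the same reason the real plan decompresses so few storage blocks.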
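Optimization follows a standard pattern as well. In practice, sensor readings are looked up by each sensor's own ID, so a secondary index can be created on it to solve the problem. Note that the indexed field is the sensor name or identifier stored in the metadata.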
db.sensor_data.createIndex({ "meta.sensorId": 1, "timestamp": 1 })
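To check which access path a plan uses before and after adding an index, the stage names can be pulled out of the winningPlan tree. The sketch below walks a plan object of the shape seen in the explain outputs in this article; the helper name collectStages is made up, and the two sample plans are trimmed copies of the plans shown earlier. The index key uses meta.sensorId, the field name written by the insert loop.

```javascript
// Hypothetical helper: collects stage names from a winningPlan-style tree, top to bottom.
function collectStages(plan, found = []) {
  if (plan === null || typeof plan !== "object") {
    return found;
  }
  if (typeof plan.stage === "string") {
    found.push(plan.stage);
  }
  if (plan.queryPlan) {
    // SBE plans (explainVersion '2') nest the stage tree under queryPlan.
    collectStages(plan.queryPlan, found);
  }
  if (plan.inputStage) {
    collectStages(plan.inputStage, found);
  }
  if (Array.isArray(plan.inputStages)) {
    for (const child of plan.inputStages) {
      collectStages(child, found);
    }
  }
  return found;
}

// Trimmed copies of the two winning plans shown in this article.
const tsPlan = {
  isCached: false,
  queryPlan: {
    stage: "GROUP",
    inputStage: { stage: "UNPACK_TS_BUCKET", inputStage: { stage: "CLUSTERED_IXSCAN" } },
  },
};
const findPlan = { isCached: false, stage: "COLLSCAN" };

console.log(collectStages(tsPlan));   // [ 'GROUP', 'UNPACK_TS_BUCKET', 'CLUSTERED_IXSCAN' ]
console.log(collectStages(findPlan)); // [ 'COLLSCAN' ]

// `db` only exists inside mongosh; under plain Node.js this part is skipped.
if (typeof db !== "undefined") {
  db.sensor_data.createIndex({ "meta.sensorId": 1, timestamp: 1 });
  console.log(db.sensor_data.getIndexes());
}
```

A COLLSCAN in the list means every document was examined; after the index exists, rerunning explain and checking the stage list is a quick way to see whether the planner picked it up.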

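So MongoDB 8 is indeed well suited to processing large volumes of time-series data. Projects with this kind of requirement can use MongoDB 8.0 to handle time-series storage and aggregation.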
This article is shared from the AustinDatabases WeChat official account.