Namespace
Namespace = QCE/TSTREAM
Monitoring metrics
Metric (API name) | Metric display name | Description | Unit | Dimensions | Statistical rules [period, statType] |
Binlogpos | Binlog position | Log position of the data source | None | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
Currentemiteventtimelag | Source message processing time | Difference between the time the CDC source emits a message downstream and the message's own timestamp | ms | tjob_id | [ 60s, last ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Currentfetcheventtimelag | Data inflow latency | Delay of the source in fetching messages (EmitTime-messageTimestamp) | ms | tjob_id | [ 60s, last ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Dbflushdelay | Sink flush delay | Sink flush delay | ms | tjob_id | [ 60s, last ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
JobBytesInPerSecond | Job input data volume per second | Total volume of data input per second by all sources of the job (valid only for Kafka sources) | Bytes/s | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobBytesOutPerSecond | Job output data volume per second | Total volume of data output per second by all sinks of the job (valid only for Kafka sinks) | Bytes/s | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobCpuLoad | TaskManager CPU utilization | Average CPU utilization of all TaskManagers in the job | % | tjob_id | [ 60s, avg ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobLastcheckpointduration | Duration of the last checkpoint | Duration of the current job's most recent checkpoint | ms | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobLastcheckpointsize | Size of the last checkpoint | Size of the current job's most recent checkpoint | Bytes | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobLatency | Total operator computation time | Sum of the time data takes to flow through each operator. Sampling errors are possible, so the value is for reference only | ms | tjob_id | [ 60s, sum ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobmanagerCpuLoad | JobManager CPU utilization | CPU utilization of the current job's JobManager | % | tjob_id | [ 60s, avg ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobmanagerCpuTime | JobManager CPU time | CPU time used by the current job's JobManager (milliseconds) | ms | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobmanagerDowntime | Job downtime | For a job in a non-running state such as failed or recovering, the duration of the current interruption. For a running job, the value is 0 | ms | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobmanagerJobNumrestarts | Restart count of the current running instance | Number of task crash restarts recorded by the JobManager of the current instance (excluding cases where the job is relaunched after the JobManager exits) | Count | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobmanagerJvmOldGcCount | JobManager old-generation GC count | Old-generation GC count of the current job's JobManager | Count | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobmanagerJvmOldGcTime | JobManager old-generation GC time | Old-generation GC time of the current job's JobManager | ms | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobmanagerJvmYoungGcCount | JobManager young-generation GC count | Young-generation GC count of the current job's JobManager | Count | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobmanagerJvmYoungGcTime | JobManager young-generation GC time | Young-generation GC time of the current job's JobManager | ms | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobmanagerLastcheckpointrestoretimestamp | Timestamp of the job's last restore from a snapshot | Unix timestamp of the job's most recent restore from a snapshot (in milliseconds; -1 if the job has never been restored) | ms | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobmanagerMemoryDirectCount | Number of buffers in JobManager direct memory | Number of buffers in the JobManager's off-heap direct memory (Direct Buffer Pool) | Count | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobmanagerMemoryDirectMemoryused | JobManager off-heap direct memory usage | Usage of the JobManager's off-heap direct memory (Direct Buffer Pool) | Bytes | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobmanagerMemoryDirectTotalcapacity | JobManager direct memory total capacity | Maximum usage of the JobManager's off-heap direct memory (Direct Buffer Pool) | Bytes | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobmanagerMemoryHeapCommitted | JobManager committed heap memory | Committed heap memory capacity of the current job's JobManager | Bytes | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobmanagerMemoryHeapMax | JobManager maximum heap memory | Maximum heap memory capacity of the current job's JobManager | Bytes | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobmanagerMemoryMappedCount | Number of buffers in JobManager mapped memory | Total number of buffers in the JobManager's off-heap mapped memory (Mapped Buffer Pool) | Count | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobmanagerMemoryMappedMemoryused | JobManager mapped memory usage | Usage of the JobManager's off-heap mapped memory (Mapped Buffer Pool) | Bytes | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobmanagerMemoryMappedTotalcapacity | JobManager mapped memory total capacity | Maximum usage of the JobManager's off-heap mapped memory (Mapped Buffer Pool) | Bytes | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobmanagerMemoryNonheapCommitted | JobManager committed non-heap memory | Committed non-heap memory (JVM metaspace, code cache, etc.) capacity of the current job's JobManager | Bytes | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobmanagerNumberofinprogresscheckpoints | Number of in-progress checkpoints | Number of in-progress (not yet completed) checkpoints of the current job | Count | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobmanagerNumregisteredtaskmanagers | Number of TaskManagers in the job | Number of TaskManagers registered for the current job, usually equal to the maximum parallelism among all operators. A drop in this number means a TaskManager has lost contact, and the job may crash and attempt to recover. | Count | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobmanagerNumrunningjobs | Number of running jobs | Number of running jobs. The value is 1 if the job is running normally and 0 if the job has crashed | Count | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobmanagerStatusJvmMemoryHeapUsed | JobManager heap memory usage | Heap memory usage of the current job's JobManager | Bytes | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobmanagerStatusJvmMemoryHeapUsedPercentage | JobManager heap memory utilization | Heap memory utilization of the current job's JobManager | % | tjob_id | [ 60s, expr ] [ 300s, expr ] |
JobmanagerStatusJvmMemoryNonheapMax | JobManager maximum non-heap memory | Maximum capacity of the non-heap memory (JVM metaspace, code cache, etc.) of the current job's JobManager | Bytes | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobmanagerStatusJvmMemoryNonheapUsed | JobManager non-heap memory usage | Non-heap memory (JVM metaspace, code cache, etc.) usage of the current job's JobManager | Bytes | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobmanagerStatusJvmMemoryProcessMemoryused | Physical memory usage of the JobManager's JVM | Physical memory usage of the JVM that hosts the JobManager | Bytes | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobmanagerTaskslotsavailable | Available JobManager task slots | When the job is running normally, the number of available task slots is 0. A non-zero value suggests the job may have briefly been in a non-running state | Count | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobmanagerTaskslotstotal | Total JobManager task slots | In Oceanus each TaskManager has exactly one task slot, so the total number of task slots equals the number of registered TaskManagers | Count | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobmanagerThreadsCount | JobManager active thread count | Number of active threads in the current job's JobManager, including daemon and non-daemon threads | Count | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobMemoryDirectUsed | TaskManager off-heap direct memory usage | Sum of off-heap direct memory (Direct Buffer Pool) usage across all TaskManagers in the job | Bytes | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
JobMemoryHeapMax | TaskManager maximum heap memory | Sum of the maximum (max) heap memory capacity of all TaskManagers in the job | Bytes | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
JobMemoryHeapUsed | TaskManager heap memory usage | Sum of the current heap memory usage of all TaskManagers in the job | Bytes | tjob_id | [ 60s, sum ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobMemoryMappedUsed | TaskManager off-heap mapped memory usage | Sum of off-heap mapped memory (Mapped Buffer Pool) usage across all TaskManagers in the job | Bytes | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
JobMemoryNonheapUsed | TaskManager non-heap memory usage | Sum of non-heap memory (JVM metaspace, code cache, etc.) usage across all TaskManagers in the job | Bytes | tjob_id | [ 60s, sum ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobNumberofcompletedcheckpoints | Completed checkpoints | Number of checkpoints the current job has completed successfully | Count | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobNumberoffailedcheckpoints | Failed checkpoints | Number of checkpoints of the current job that failed (for example, due to a timeout or an exception) | Count | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobNumrecordsinbutfailed | Records with severe exceptions | Number of records for which a severe exception occurred in an operator (for example, any thrown Exception). A value greater than 1 affects Exactly-Once semantics (experimental metric, for reference only) | Count | tjob_id | [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
JobRecordsInPerSecond | Job input records per second | Total number of records input per second by all sources of the job | Count/s | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
JobRecordsOutPerSecond | Job output records per second | Total number of records output per second by all sinks of the job | Count/s | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
JobRestartingtime | Job restart duration | Duration of the job's most recent restart | ms | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobServiceDelay | Sink watermark delay | Difference between the current timestamp and the InputWatermark of the sink (the maximum is taken when there are multiple sinks) | ms | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
JobTotalnumberofcheckpoints | Total checkpoints | Total number of checkpoints (in progress, completed, and failed combined) | Count | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
JobUptime | Uninterrupted job run time | For a running job, how long the job has been continuously running in the current run | ms | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
RecordsLagMax | Maximum Kafka lag reported by TaskManagers | Maximum Kafka lag metric reported by TaskManagers | None | tjob_id | [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
RecordsLagMaxAvg | Average of the maximum Kafka lag reported by TaskManagers | Average of the maximum Kafka lag metric reported by TaskManagers | None | tjob_id | [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
RecordsLagMaxMin | Minimum of the maximum Kafka lag reported by TaskManagers | Minimum of the maximum Kafka lag metric reported by TaskManagers | None | tjob_id | [ 60s, min ] [ 300s, min ] [ 3600s, min ] [ 86400s, min ] |
RecordsLagMaxSum | Sum of the maximum Kafka lag reported by TaskManagers | Sum of the maximum Kafka lag metric reported by TaskManagers | None | tjob_id | [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
Sourceidletime | Batch interval | Idle time of source processing | ms | tjob_id | [ 60s, last ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Syndelay | Data source synchronization percentage | Data source synchronization percentage | % | tjob_id | [ 60s, last ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
TaskmanagerCpuTime | TaskManager CPU time | Sum of the CPU time used by all TaskManagers in the job (milliseconds) | ms | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
TaskmanagerJobTaskBackpressuredtimemspersecond | Backpressure | Maximum backpressured time per second across all tasks of the current job | % | tjob_id | [ 60s, max ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
TaskmanagerJobTaskBuffersInpoolusage | Task input buffer utilization | Task input buffer utilization | % | tjob_id task_id subtask_index | [ 60s, max ] [ 300s, max ] |
TaskmanagerJobTaskBuffersInputqueuelength | Number of task input buffers | Number of task input buffers | None | tjob_id task_id subtask_index | [ 60s, max ] [ 300s, max ] |
TaskmanagerJobTaskBuffersOutpoolusage | Task output buffer utilization | Task output buffer utilization | % | tjob_id task_id subtask_index | [ 60s, max ] [ 300s, max ] |
TaskmanagerJobTaskBuffersOutputqueuelength | Number of task output buffers | Number of task output buffers | None | tjob_id task_id subtask_index | [ 60s, max ] [ 300s, max ] |
TaskmanagerJobTaskCurrentlowwatermark | Watermark currently received by the task | Watermark currently received by the task | None | tjob_id task_id subtask_index | [ 60s, max ] [ 300s, max ] |
TaskmanagerJobTaskDataskewcoefficient | Data skew | Coefficient of variation (= standard deviation / mean) of the input data volume across the subtasks of all tasks of the current job, in the range [0,1] | None | tjob_id | [ 60s, max ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
TaskmanagerJobTaskOperatorKafkaSwitch | Kafka switch metric | Whether the job includes a Kafka connector | None | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
TaskmanagerJobTaskOperatorSchemachange | CDC schema change count | Number of CDC schema changes | Count | tjob_id | [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
TaskmanagerJvmOldGcCount | TaskManager old-generation GC count | Sum of old-generation GC counts across all TaskManagers in the job | Count | tjob_id | [ 60s, sum ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
TaskmanagerJvmOldGcTime | TaskManager old-generation GC time | Sum of old-generation GC time across all TaskManagers in the job | ms | tjob_id | [ 60s, sum ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
TaskmanagerJvmThreadsCount | TaskManager active thread count | Sum of active threads across all TaskManagers in the job, including daemon and non-daemon threads | Count | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
TaskmanagerJvmYoungGcCount | TaskManager young-generation GC count | Sum of young-generation GC counts across all TaskManagers in the job | Count | tjob_id | [ 60s, sum ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
TaskmanagerJvmYoungGcTime | TaskManager young-generation GC time | Sum of young-generation GC time across all TaskManagers in the job | ms | tjob_id | [ 60s, sum ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
TaskmanagerMemoryDirectCount | Number of buffers in TaskManager direct memory | Sum of the number of buffers in off-heap direct memory (Direct Buffer Pool) across all TaskManagers in the job | Count | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
TaskmanagerMemoryDirectTotalcapacity | TaskManager direct memory total capacity | Sum of the maximum capacity of off-heap direct memory (Direct Buffer Pool) across all TaskManagers in the job | Bytes | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
TaskmanagerMemoryHeapCommitted | TaskManager committed heap memory | Sum of committed heap memory capacity across all TaskManagers in the job | Bytes | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
TaskmanagerMemoryMappedCount | Number of buffers in TaskManager off-heap mapped memory | Sum of the number of buffers in off-heap mapped memory (Mapped Buffer Pool) across all TaskManagers in the job | Count | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
TaskmanagerMemoryMappedTotalcapacity | TaskManager off-heap mapped memory total capacity | Sum of the maximum capacity of off-heap mapped memory (Mapped Buffer Pool) across all TaskManagers in the job | Bytes | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
TaskmanagerMemoryNonheapCommitted | TaskManager committed non-heap memory | Sum of committed non-heap memory (JVM metaspace, code cache, etc.) across all TaskManagers in the job | Bytes | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
TaskmanagerNetworkAvailablememorysegments | Currently available MemorySegments | Sum of available MemorySegments across all TaskManagers in the job | Bytes | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
TaskmanagerNetworkTotalmemorysegments | Total allocated MemorySegments | Sum of allocated MemorySegments across all TaskManagers in the job | Bytes | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
TaskmanagerStatusJvmCpuLoad | JVM CPU load of a single TaskManager | JVM CPU load of a single TaskManager | % | tjob_id tm_id | [ 60s, avg ] [ 300s, avg ] |
TaskmanagerStatusJvmGarbagecollectorOldGenerationCount | Old-generation GC count of a single TaskManager | Old-generation GC count of a single TaskManager | None | tm_id tjob_id | [ 60s, max ] [ 300s, max ] |
TaskmanagerStatusJvmGarbagecollectorOldGenerationTime | Old-generation GC time of a single TaskManager | Old-generation GC time of a single TaskManager | ms | tm_id tjob_id | [ 60s, max ] [ 300s, max ] |
TaskmanagerStatusJvmMemoryHeapUsed | Heap memory usage of a single TaskManager | Heap memory usage of a single TaskManager | Bytes | tjob_id tm_id | [ 60s, max ] [ 300s, max ] |
TaskmanagerStatusJvmMemoryHeapUsedPercentage | TaskManager heap memory utilization | Average heap memory utilization of all TaskManagers in the job | % | tjob_id | [ 60s, expr ] [ 300s, expr ] |
TaskmanagerStatusJvmMemoryNonheapMax | TaskManager maximum non-heap memory | Sum of the maximum capacity of non-heap memory (JVM metaspace, code cache, etc.) across all TaskManagers in the job | Bytes | tjob_id | [ 60s, sum ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
TaskmanagerStatusJvmMemoryProcessMemoryused | Maximum physical memory usage of the TaskManager JVMs | Maximum physical memory usage among the JVMs of all TaskManagers | Bytes | tjob_id | [ 60s, sum ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
ClusterTotalSnapshotCount | Total snapshots | Total number of cluster snapshots | Count | cluster_id | [ 60s, last ] [ 300s, last ] |
ClusterTotalCommitCount | Total cluster commits | Total number of cluster commits | Count | cluster_id | [ 60s, last ] [ 300s, last ] |
ClusterTotalCommitSuccessCount | Total successful cluster commits | Total number of successful cluster commits | Count | cluster_id | [ 60s, last ] [ 300s, last ] |
ClusterTotalCommitFailedCount | Total failed cluster commits | Total number of failed cluster commits | Count | cluster_id | [ 60s, last ] [ 300s, last ] |
ClusterTotalRestartCount | Total abnormal cluster restarts | Total number of cluster Worker restarts | Count | cluster_id | [ 60s, last ] [ 300s, last ] |
ClusterCurActivePodNum | Current healthy pod count | Current number of active pods in the cluster | None | cluster_id | [ 60s, last ] [ 300s, last ] |
ClusterStorageTotalUsed | Total cluster storage currently used | Total amount of storage the cluster currently uses | Bytes | cluster_id | [ 60s, last ] [ 300s, last ] |
ClusterStorageTotalUsedPercent | Cluster storage utilization | Percentage of storage the cluster currently uses | % | cluster_id | [ 60s, last ] [ 300s, last ] |
TableCommitSuccessCount | Successful table commits | Number of successful commits for the table | Count | cluster_id database_name table_name | [ 60s, last ] [ 300s, last ] |
TableCommitFailedCount | Failed table commits | Number of failed commits for the table | Count | cluster_id database_name table_name | [ 60s, last ] [ 300s, last ] |
TableSizeBytes | Table size | Storage occupied by the table | Bytes | cluster_id database_name table_name | [ 60s, last ] [ 300s, last ] |
TableMetaSizeBytes | Table metadata size | Storage occupied by the table's metadata | Bytes | cluster_id database_name table_name | [ 60s, last ] [ 300s, last ] |
TableTotalOpenedBucketCount | Opened buckets of the table | Number of buckets the table currently has open | None | cluster_id database_name table_name | [ 60s, last ] [ 300s, last ] |
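Each metric lists its own set of [period, statType] pairs, and some metrics (for example the per-task buffer metrics) list only the 60s and 300s periods. The following is a minimal sketch of looking up the statType for a metric and period before issuing a query. The `STAT_RULES` dictionary and the `stat_type` helper are hypothetical names introduced for illustration; the dictionary holds only a four-row excerpt copied from the table above, not the full list.

```python
# Excerpt of the statistical rules from the table above:
# metric name -> {period in seconds: statType}. Hypothetical helper data.
STAT_RULES = {
    "JobCpuLoad": {60: "avg", 300: "max", 3600: "max", 86400: "max"},
    "JobRecordsInPerSecond": {60: "sum", 300: "avg", 3600: "avg", 86400: "avg"},
    "TaskmanagerJobTaskBuffersInpoolusage": {60: "max", 300: "max"},
    "ClusterTotalCommitCount": {60: "last", 300: "last"},
}


def stat_type(metric: str, period: int) -> str:
    """Return the statType the table lists for this metric at this period."""
    rules = STAT_RULES[metric]
    if period not in rules:
        raise ValueError(
            f"{metric} has no rule for a {period}s period; "
            f"listed periods: {sorted(rules)}"
        )
    return rules[period]


if __name__ == "__main__":
    print(stat_type("JobCpuLoad", 300))
```

Checking the period up front makes it obvious when a metric is being requested at a granularity the table does not list for it.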
Overview of parameters for each dimension
Parameter name | Dimension name | Dimension description | Format |
Instances.N.Dimensions.0.Name | tjob_id | Dimension name for the job ID | Enter the String-type dimension name: tjob_id |
Instances.N.Dimensions.0.Value | tjob_id | Specific job ID | |
Instances.N.Dimensions.0.Name | task_id | Dimension name for the operator ID of the Flink task | Enter the String-type dimension name: task_id |
Instances.N.Dimensions.0.Value | task_id | Specific operator ID | |
Instances.N.Dimensions.0.Name | subtask_index | Dimension name for the index that identifies each subtask of a parallel task in Flink | Enter the String-type dimension name: subtask_index |
Instances.N.Dimensions.0.Value | subtask_index | Specific subtask index | |
Instances.N.Dimensions.0.Name | tm_id | Dimension name for the unique name of the Flink TaskManager pod | Enter the String-type dimension name: tm_id |
Instances.N.Dimensions.0.Value | tm_id | Specific pod name | Enter the specific pod name. It can be obtained from the DescribeJobRuntimeInfo API: take the Value of the JobRuntimeInfo.[N] entry whose Key is TaskManagers and Base64-decode it |
Instances.N.Dimensions.0.Name | cluster_id | Dimension name for the cluster ID | Enter the String-type dimension name: cluster_id |
Instances.N.Dimensions.0.Value | cluster_id | Specific cluster ID | |
Instances.N.Dimensions.0.Name | database_name | Dimension name for database_name | Enter the String-type dimension name: database_name |
Instances.N.Dimensions.0.Value | database_name | Specific database name | Setats database name, for example: default |
Instances.N.Dimensions.0.Name | table_name | Dimension name for table_name | Enter the String-type dimension name: table_name |
Instances.N.Dimensions.0.Value | table_name | Specific table name | Setats table name, for example: test_table |
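The tm_id row above says the pod name comes from a Base64-encoded Value returned by DescribeJobRuntimeInfo. A minimal sketch of that decoding step follows. The function name `decode_taskmanagers_value` and the sample pod name are made up for illustration, and the layout of the decoded text is not specified in this document, so the sketch only returns it as a string.

```python
import base64


def decode_taskmanagers_value(encoded: str) -> str:
    """Base64-decode the Value of the JobRuntimeInfo entry whose Key is TaskManagers."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    # Hypothetical sample: the Base64 encoding of a made-up pod name,
    # standing in for the Value a real API response would carry.
    sample = base64.b64encode(b"example-taskmanager-1-1").decode("ascii")
    print(decode_taskmanagers_value(sample))
```

The decoded string is what would then be passed as the tm_id dimension value.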
Input parameter description
To query Oceanus monitoring data, use the following input parameter values:
&Namespace=QCE/TSTREAM
&Instances.N.Dimensions.0.Name=tjob_id
&Instances.N.Dimensions.0.Value=Specific job ID
&Instances.N.Dimensions.0.Name=task_id
&Instances.N.Dimensions.0.Value=Specific operator ID
&Instances.N.Dimensions.0.Name=subtask_index
&Instances.N.Dimensions.0.Value=Specific subtask index
&Instances.N.Dimensions.0.Name=tm_id
&Instances.N.Dimensions.0.Value=Specific pod name
&Instances.N.Dimensions.0.Name=cluster_id
&Instances.N.Dimensions.0.Value=Specific ClusterId field
&Instances.N.Dimensions.0.Name=database_name
&Instances.N.Dimensions.0.Value=Setats database name
&Instances.N.Dimensions.0.Name=table_name
&Instances.N.Dimensions.0.Value=Setats table name
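As a rough illustration of how these parameters combine for one metric, the sketch below assembles a query string with Python's standard `urllib.parse.urlencode`. Several things here are assumptions that do not appear in this document: the `MetricName` and `Period` parameter names, the 0-based instance index, and the idea that the dimension index increments (0, 1, 2) when a metric takes more than one dimension. The helper name `build_query` and the ID values are placeholders. Signing and the other common request parameters are left out.

```python
from urllib.parse import urlencode


def build_query(metric, dimensions, period=60, instance=0):
    """Build the metric-specific part of a monitoring query string.

    dimensions is a list of (name, value) pairs, e.g. [("tjob_id", "...")].
    Assumption: the dimension index counts up from 0 within one instance.
    """
    params = {
        "Namespace": "QCE/TSTREAM",
        "MetricName": metric,  # assumed parameter name
        "Period": period,      # assumed parameter name, in seconds
    }
    for i, (name, value) in enumerate(dimensions):
        params[f"Instances.{instance}.Dimensions.{i}.Name"] = name
        params[f"Instances.{instance}.Dimensions.{i}.Value"] = value
    # urlencode percent-encodes reserved characters such as "/".
    return urlencode(params)


if __name__ == "__main__":
    # Placeholder IDs; a per-TaskManager metric takes tjob_id and tm_id.
    print(build_query(
        "TaskmanagerStatusJvmCpuLoad",
        [("tjob_id", "placeholder-job-id"), ("tm_id", "placeholder-pod-name")],
    ))
```

Passing the dimensions as an ordered list keeps the pairing of each Name with its Value explicit, which is easy to get wrong when the parameters are written out by hand.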