Namespace
Namespace = QCE/TI_MODEL
Monitoring Metrics
Metric Name | Display Name | Description | Unit | Dimensions | Statistical Rules [period, statType] |
Apicallerrortotal | Failed API calls | Number of failed API calls | Count | Source SubUin ServiceGroupId | [ 10s, sum ] [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
Apicalllimittotal | Total throttled requests | Total number of throttled API calls | Count | Source SubUin ServiceGroupId | [ 10s, sum ] [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
Apicallsuccesstotal | Total successful calls | Total number of successful API calls | Count | SubUin ServiceGroupId Source | [ 10s, sum ] [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
Apicalltotal | Total API calls | Total number of API calls | Count | Source SubUin ServiceGroupId | [ 10s, sum ] [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
Apiresponsetime | Average response time | Average response time | ms | ServiceGroupId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
CfsClientDataReadBandwidth | turocfs single-node server-side read bandwidth | turocfs single-node server-side read bandwidth | KBytes/s | InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
CfsClientDataWriteBandwidth | turocfs single-node server-side write bandwidth | turocfs single-node server-side write bandwidth | KBytes/s | Source SubUin InstanceId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
CfsDataReadIoBytes | CFS server-side read bandwidth | CFS server-side read bandwidth | KBytes/s | InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
CfsDataReadIoLatency | CFS read latency | CFS read latency | ms | Source SubUin InstanceId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
CfsDataWriteIoBytes | CFS server-side write bandwidth | CFS server-side write bandwidth | KBytes/s | InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
CfsDataWriteIoLatency | CFS write latency | CFS write latency | ms | InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
CfsStrageUsageGb | CFS stored data volume | CFS stored data volume | GBytes | InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Cpuutil | CPU utilization | CPU utilization | % | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
DiskIoUtil | Disk ioutil | Disk ioutil | % | Source SubUin InstanceId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
DiskIoWait | Disk iowait | Disk iowait | % | InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
DiskReadByte | Disk read bandwidth | Disk read bandwidth | MBytes/s | InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
DiskReadIops | Disk read IOPS | Disk read IOPS | Count | SubUin InstanceId Source | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
DiskUsageRadio | System disk partition utilization | System disk partition utilization | % | InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
DiskWriteByte | Disk write bandwidth | Disk write bandwidth | MBytes/s | InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
DiskWriteIops | Disk write IOPS | Disk write IOPS | Count | Source SubUin InstanceId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Gpumemutil | GPU memory utilization | GPU memory utilization | % | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Gpuutil | GPU utilization | GPU utilization | % | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Instancecpuutil | CPU utilization | CPU utilization | % | InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Instancegpumemutil | GPU memory utilization | GPU memory utilization | % | SubUin InstanceId Source | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Instancegpuutil | GPU utilization | GPU utilization | % | InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Instancehttpqps | HTTP call QPS | HTTP requests per second for the instance | Count/s | InstanceId Source SubUin | [ 10s, max ] [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
Instancehttpqpslimit | Throttled HTTP call QPS | Throttled HTTP requests per second for the instance | Count/s | Source SubUin InstanceId | [ 10s, max ] [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
Instancememutil | Memory utilization | Memory utilization | % | InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Instancememvalue | Memory usage | Memory usage | MBytes | InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Instancenetworkibytes | Inbound network traffic | Inbound network traffic | MBytes | InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Instanceready | Number of running instances | Number of running instances | Count | AppId Source SubUin TaskId | [ 10s, last ] [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
InstanceTiemsCurrentRequests | Concurrent requests | Concurrent requests | Count | InstanceId Source SubUin | [ 10s, max ] [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
Instancetotal | Number of instances | Number of instances | Count | AppId Source SubUin TaskId | [ 10s, last ] [ 60s, last ] [ 300s, last ] [ 3600s, last ] [ 86400s, last ] |
Memutil | Memory utilization | Memory utilization | % | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Memvalue | Memory usage | Memory usage | MBytes | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Networkreceivebytes | Inbound network traffic | Inbound network traffic | MBytes | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceCfsClientDataReadBandwidth | turocfs single-node server-side read bandwidth | turocfs single-node server-side read bandwidth | KBytes/s | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceCfsClientDataWriteBandwidth | turocfs single-node server-side write bandwidth | turocfs single-node server-side write bandwidth | KBytes/s | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceCfsDataReadIoBytes | CFS server-side read bandwidth | CFS server-side read bandwidth | KBytes/s | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceCfsDataReadIoLatency | CFS read latency | CFS read latency | ms | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceCfsDataWriteIoBytes | CFS server-side write bandwidth | CFS server-side write bandwidth | KBytes/s | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceCfsDataWriteIoLatency | CFS write latency | CFS write latency | ms | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceCfsStrageUsageGb | CFS stored data volume | CFS stored data volume | GBytes | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceDiskIoUtil | Disk ioutil | Disk ioutil | % | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceDiskIoWait | Disk iowait | Disk iowait | % | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceDiskReadByte | Disk read bandwidth | Disk read bandwidth | MBytes/s | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceDiskReadIops | Disk read IOPS | Disk read IOPS | Count | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceDiskUsageRadio | System disk partition utilization | System disk partition utilization | % | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceDiskWriteByte | Disk write bandwidth | Disk write bandwidth | MBytes/s | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceDiskWriteIops | Disk write IOPS | Disk write IOPS | Count | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Servicehttpqps | HTTP call QPS | HTTP requests per second for the service | Count/s | AppId Source SubUin TaskId | [ 10s, max ] [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
Servicehttpqpslimit | Throttled HTTP call QPS | Throttled HTTP requests per second for the service | Count/s | AppId Source SubUin TaskId | [ 10s, max ] [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
ServiceTiemsCurrentRequests | Concurrent requests | Concurrent requests | Count | AppId Source SubUin TaskId | [ 10s, max ] [ 60s, max ] [ 300s, max ] [ 3600s, max ] [ 86400s, max ] |
ServiceGpuMemValue | GPU memory usage | GPU memory usage | MBytes | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceEmsTokenThroughput | Tokens processed per minute | Tokens processed per minute | Count | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceEmsTokenThroughputInput | Tokens processed per minute (input only) | Input tokens processed per minute | Count | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceEmsTokenThroughputOutput | Tokens processed per minute (output only) | Generated tokens processed per minute | Count | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceEmsFirstTokenLatency | First-token latency | First-token latency | s | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceEmsNonFirstTokenLatency | Non-first-token latency | Non-first-token latency | s | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceEmsProcessingRequestCount | Requests in progress | Requests in progress | Count | AppId Source SubUin TaskId | [ 10s, sum ] [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
ServiceEmsQueuingRequestCount | Queued requests | Queued requests | Count | AppId Source SubUin TaskId | [ 10s, sum ] [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
ServiceEmsTotalProcessedTokens | Total tokens processed | Total tokens processed | Count | AppId Source SubUin TaskId | [ 10s, sum ] [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
ServiceEmsTotalProcessedTokensInput | Total tokens processed (input only) | Total input tokens | Count | AppId Source SubUin TaskId | [ 10s, sum ] [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
ServiceEmsTotalProcessedTokensOutput | Total tokens processed (output only) | Total generated tokens | Count | AppId Source SubUin TaskId | [ 10s, sum ] [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
ServiceEmsAverageLengthInput | Average input length (tokens) | Average input length (tokens) | Count | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
ServiceEmsAverageLengthOutput | Average output length (tokens) | Average output length (tokens) | Count | AppId Source SubUin TaskId | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
GpuMemValue | GPU memory usage | GPU memory usage | MBytes | AppId InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
EmsTokenThroughput | Tokens processed per minute | Tokens processed per minute | Count | AppId InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
EmsTokenThroughputInput | Tokens processed per minute (input only) | Tokens processed per minute (input only) | Count | AppId InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
EmsTokenThroughputOutput | Tokens processed per minute (output only) | Tokens processed per minute (output only) | Count | AppId InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
EmsFirstTokenLatency | First-token latency | First-token latency | s | AppId InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
EmsNonFirstTokenLatency | Non-first-token latency | Non-first-token latency | s | AppId InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
EmsProcessingRequestCount | Requests in progress | Requests in progress | Count | AppId InstanceId Source SubUin | [ 10s, sum ] [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
EmsQueuingRequestCount | Queued requests | Queued requests | Count | AppId InstanceId Source SubUin | [ 10s, sum ] [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
EmsTotalProcessedTokens | Total tokens processed | Total tokens processed | Count | AppId InstanceId Source SubUin | [ 10s, sum ] [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
EmsTotalProcessedTokensInput | Total tokens processed (input only) | Total tokens processed (input only) | Count | AppId InstanceId Source SubUin | [ 10s, sum ] [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
EmsTotalProcessedTokensOutput | Total tokens processed (output only) | Total tokens processed (output only) | Count | AppId InstanceId Source SubUin | [ 10s, sum ] [ 60s, sum ] [ 300s, sum ] [ 3600s, sum ] [ 86400s, sum ] |
EmsAverageLengthInput | Average input length (tokens) | Average input length (tokens) | Count | AppId InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
EmsAverageLengthOutput | Average output length (tokens) | Average output length (tokens) | Count | AppId InstanceId Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Fp16EngineActivity | FP16 active time ratio | FP16 active time ratio | % | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Fp32EngineActivity | FP32 active time ratio | FP32 active time ratio | % | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
Fp64EngineActivity | FP64 active time ratio | FP64 active time ratio | % | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
NvlinkBandwidth | NVLink transfer rate | NVLink transfer rate | Bytes/s | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
PcieBandwidth | PCIe bus transfer rate | PCIe bus transfer rate | Bytes/s | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
SmActivity | SM active time ratio | SM active time ratio | % | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
TensorActivity | Tensor active time ratio | Tensor active time ratio | % | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
DcgmFiDevFbUsed | GPU memory usage | GPU memory usage | MBytes | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
DcgmFiDevGpuUtil | GPU utilization | GPU utilization | % | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
DcgmFiDevMemCopyUtil | GPU memory utilization | GPU memory utilization | % | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
GpuDecUtil | GPU decoder utilization | GPU decoder utilization | % | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
GpuEncUtil | GPU encoder utilization | GPU encoder utilization | % | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
GpuMemoryClock | GPU memory clock frequency | GPU memory clock frequency | S | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
GpuNvlinkRxMb | NVLink data received | NVLink data received | Mbps | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
GpuNvlinkTxMb | NVLink data sent | NVLink data sent | Mbps | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
GpuPcieRxMb | PCIe data received | PCIe data received | Mbps | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
GpuPcieTxMb | PCIe data sent | PCIe data sent | Mbps | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
GpuSmClock | SM clock frequency | SM clock frequency | S | AppId InstanceGpuNum Source SubUin | [ 10s, avg ] [ 60s, avg ] [ 300s, avg ] [ 3600s, avg ] [ 86400s, avg ] |
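Before sending a query, the period and dimension columns of the metrics table can be used as a quick sanity check. The following is a minimal sketch, not part of any SDK: `METRICS`, `AUTO_FILLED`, and `check_query` are hypothetical names, only four rows of the table are copied in for illustration, and it assumes that every dimension listed for a metric must be supplied except AppId, which the dimension overview says is obtained automatically in SDK calls.

```python
# Periods (in seconds) that appear in the "Statistical Rules" column.
SUPPORTED_PERIODS = (10, 60, 300, 3600, 86400)

# Dimensions assumed to be filled in automatically by the SDK.
AUTO_FILLED = {"AppId"}

# A few rows copied from the table: metric -> (listed dimensions, statType).
METRICS = {
    "Apicalltotal": ({"Source", "SubUin", "ServiceGroupId"}, "sum"),
    "Instancecpuutil": ({"InstanceId", "Source", "SubUin"}, "avg"),
    "Servicehttpqps": ({"AppId", "Source", "SubUin", "TaskId"}, "max"),
    "DcgmFiDevGpuUtil": ({"AppId", "InstanceGpuNum", "Source", "SubUin"}, "avg"),
}


def check_query(metric, period, dimensions):
    """Return a list of problems; an empty list means the query matches the table."""
    if metric not in METRICS:
        return [f"unknown metric: {metric}"]
    problems = []
    listed, _stat_type = METRICS[metric]
    if period not in SUPPORTED_PERIODS:
        problems.append(f"unsupported period: {period}")
    missing = listed - AUTO_FILLED - set(dimensions)
    if missing:
        problems.append("missing dimensions: " + ", ".join(sorted(missing)))
    return problems


ok = check_query(
    "Servicehttpqps", 60,
    {"SubUin": "100001231231", "Source": "normal", "TaskId": "ms-2tgmq6ms-1"},
)
bad = check_query("Instancecpuutil", 30, {"Source": "normal"})
```

Here `ok` is empty because the three non-automatic dimensions are present, while `bad` reports both the unsupported 30-second period and the missing InstanceId and SubUin dimensions.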
Overview of Dimension Parameters
Parameter Name | Dimension Name | Dimension Description | Format |
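Instances.N.Dimensions.0.Name | AppId | Dimension name for the account APPID | Enter the String-type dimension name: AppId (obtained automatically in SDK calls; no need to pass it) |
Instances.N.Dimensions.0.Value | AppId | Account APPID | Enter the ID, for example: 1231231231 (obtained automatically in SDK calls; no need to pass it) |
Instances.N.Dimensions.1.Name | SubUin | Dimension name for the sub-account ID | Enter the String-type dimension name: SubUin |
Instances.N.Dimensions.1.Value | SubUin | Sub-account ID | Enter the ID, for example: 100001231231 |
Instances.N.Dimensions.2.Name | Source | Dimension name for the creation source | Enter the String-type dimension name: Source |
Instances.N.Dimensions.2.Value | Source | Creation source | Enter the source, for example: normal (use this value by default) |
Instances.N.Dimensions.3.Name | InstanceId | Dimension name for the online service instance ID | Enter the String-type dimension name: InstanceId |
Instances.N.Dimensions.3.Value | InstanceId | Online service instance ID | Enter the specific instance ID, for example: ms-2tgmq6ms-1-5f96656956-272wq |
Instances.N.Dimensions.4.Name | TaskId | Dimension name for the online service ID | Enter the String-type dimension name: TaskId |
Instances.N.Dimensions.4.Value | TaskId | Online service ID | Enter the ID, for example: ms-2tgmq6ms-1 |
Instances.N.Dimensions.5.Name | ServiceGroupId | Dimension name for the online service group ID | Enter the String-type dimension name: ServiceGroupId |
Instances.N.Dimensions.5.Value | ServiceGroupId | Online service group ID | Enter the ID, for example: ms-2tgmq6ms |
Instances.N.Dimensions.6.Name | InstanceGpuNum | Dimension name for the GPU card number used by the online service instance (whole-GPU-card tasks only) | Enter the String-type dimension name: InstanceGpuNum |
Instances.N.Dimensions.6.Value | InstanceGpuNum | GPU card number used by the online service instance (whole-GPU-card tasks only) | The instance ID concatenated with the GPU card number (or avg); enter the specific value, for example: ms-2tgmq6ms-1-5f96656956-272wq-0 |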
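The InstanceGpuNum value is the only dimension that is composed rather than copied. A minimal sketch of that concatenation, assuming the separator is a hyphen (as in the example ms-2tgmq6ms-1-5f96656956-272wq-0) and that the suffix is either a non-negative card index or the literal avg; `instance_gpu_num` is a hypothetical helper name:

```python
from typing import Union


def instance_gpu_num(instance_id: str, gpu: Union[int, str]) -> str:
    """Join an instance ID and a GPU card index (or "avg") with a hyphen."""
    if isinstance(gpu, int):
        if gpu < 0:
            raise ValueError("GPU card index must be non-negative")
    elif gpu != "avg":
        raise ValueError('gpu must be a card index or the string "avg"')
    return f"{instance_id}-{gpu}"


card0 = instance_gpu_num("ms-2tgmq6ms-1-5f96656956-272wq", 0)
```

With the instance ID from the table, `card0` matches the example value given for Instances.N.Dimensions.6.Value.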
Input Parameters
To query monitoring data for online service metrics, use the following values:
&Namespace=QCE/TI_MODEL
&Instances.N.Dimensions.0.Name=AppId
&Instances.N.Dimensions.0.Value=specific account ID
&Instances.N.Dimensions.1.Name=SubUin
&Instances.N.Dimensions.1.Value=specific sub-account ID
&Instances.N.Dimensions.2.Name=Source
&Instances.N.Dimensions.2.Value=specific creation source
&Instances.N.Dimensions.3.Name=InstanceId
&Instances.N.Dimensions.3.Value=online service instance ID
&Instances.N.Dimensions.4.Name=TaskId
&Instances.N.Dimensions.4.Value=specific online service ID
&Instances.N.Dimensions.5.Name=ServiceGroupId
&Instances.N.Dimensions.5.Value=specific online service group ID
&Instances.N.Dimensions.6.Name=InstanceGpuNum
&Instances.N.Dimensions.6.Value=GPU card number used by the online service instance
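The flattened parameter names above can be generated from a list of dimension pairs. This is a minimal sketch using only the Python standard library; `build_dimension_params` is a hypothetical helper, and it assumes the numeric index in `Dimensions.<i>` is simply the position of the pair in the list sent for instance N. It covers only the Namespace and Instances parameters listed here, not the rest of a complete request.

```python
from urllib.parse import urlencode


def build_dimension_params(namespace, instances):
    """Flatten per-instance (name, value) pairs into Instances.N.Dimensions.i.* keys."""
    params = {"Namespace": namespace}
    for n, dims in enumerate(instances):
        for i, (name, value) in enumerate(dims):
            prefix = f"Instances.{n}.Dimensions.{i}"
            params[f"{prefix}.Name"] = name
            params[f"{prefix}.Value"] = value
    return params


# Example values are taken from the dimension overview table.
params = build_dimension_params(
    "QCE/TI_MODEL",
    [[
        ("SubUin", "100001231231"),
        ("Source", "normal"),
        ("InstanceId", "ms-2tgmq6ms-1-5f96656956-272wq"),
    ]],
)
query = urlencode(params)  # percent-encodes "/" in the namespace
```

Passing several inner lists produces `Instances.0`, `Instances.1`, and so on, which is one way to request the same metric for multiple instances in a single call.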