Overview
Tencent Cloud TKE collects and displays monitoring data at five levels: cluster, node, workload, Pod, and container. A sound monitoring setup is important for keeping workloads on Tencent Cloud TKE reliable, available, and performant. You can collect monitoring data for different resources along multiple dimensions and configure alarms on it, so that you can track resource usage and pinpoint faults quickly.
Collecting monitoring data helps you establish a baseline for the normal performance of your container clusters. By measuring cluster performance at different times and under different load conditions, and by retaining historical monitoring data, you learn what normal behavior looks like for your clusters and services. You can then compare current monitoring data against that baseline to tell quickly whether a service is behaving abnormally and to narrow down a fix. For example, you can monitor CPU utilization, memory usage, and disk I/O for your services.
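The baseline comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not how TKE evaluates metrics: the function names, the sample values, and the rule of flagging anything more than three standard deviations from the historical mean are all assumptions made for the example.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize historical samples as (mean, standard deviation)."""
    if len(samples) < 2:
        raise ValueError("need at least two historical samples")
    return mean(samples), stdev(samples)


def is_abnormal(current, baseline, tolerance=3.0):
    """Return True when `current` is more than `tolerance` standard
    deviations away from the baseline mean (an assumed rule)."""
    avg, spread = baseline
    if spread == 0:
        # A perfectly flat history: any change counts as a deviation.
        return current != avg
    return abs(current - avg) > tolerance * spread


# Hypothetical CPU utilization samples (percent) gathered under normal load.
history = [41.0, 39.5, 42.3, 40.1, 38.7, 41.8, 40.6]
baseline = build_baseline(history)

print(is_abnormal(43.0, baseline))  # within the normal range
print(is_abnormal(85.0, baseline))  # far above the baseline
```

In practice the history would come from the monitoring data you have retained, and you would likely keep separate baselines for different times of day or load conditions.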
Monitoring
For guidance on using the monitoring features of TKE, please refer to View Monitoring Data.
For the currently covered monitoring metrics, please refer to Monitoring and Alarm Metrics List.
Alarm
To detect anomalies in TKE promptly and keep your business stable and reliable, we recommend that you configure the necessary alarms for all production clusters. For guidance on configuring alarms, please refer to Setting Alarms.
For the currently covered alarm metrics, please refer to Monitoring and Alarm Metrics List.
Note
The monitoring and alarm features provided by TKE primarily cover the core metrics and events of Kubernetes objects. For finer-grained metric coverage, use them together with the basic resource monitoring that Tencent Cloud Observability Platform provides for resources such as cloud servers, block storage, and load balancing.
If the basic monitoring capabilities provided by TKE do not meet your requirements, you can use the Prometheus Monitoring service offered by Tencent Cloud. Prometheus Monitoring aims to be lightweight, stable, and highly available. It retains the native features of Prometheus and supports custom metric collection, multi-cluster monitoring, and reporting of millions of metrics. It also provides Grafana-based visualization with default panels, stable multi-channel alarming, and a non-intrusive architecture that has minimal impact on your cluster resources. Its highly customizable configuration options help you build a monitoring platform suited to your cloud-native scenarios. For specific operations, please refer to Tencent Cloud Prometheus One-Click Association with Container Service Monitoring.
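Because Prometheus Monitoring retains native Prometheus features, custom metrics can in principle be read with PromQL through the standard Prometheus HTTP API. The sketch below assumes that this API is reachable for your instance. The base URL is a placeholder, the helper names are hypothetical, and the sample response is hand-written in the standard instant-query format rather than taken from a real cluster.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Build an instant-query URL for the standard Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(body):
    """Map each Pod label to its sample value from an instant-query response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    return {
        series["metric"].get("pod", ""): float(series["value"][1])
        for series in payload["data"]["result"]
    }


# Placeholder endpoint; replace it with the address of your own instance.
BASE_URL = "http://prometheus.example.internal:9090"

# Per-Pod CPU usage in cores over the last 5 minutes (cAdvisor metric).
PROMQL = "sum by (pod) (rate(container_cpu_usage_seconds_total[5m]))"

url = build_query_url(BASE_URL, PROMQL)

# Hand-written sample response in the Prometheus instant-vector format.
sample_body = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"pod": "web-0"}, "value": [1700000000, "0.25"]},
            {"metric": {"pod": "web-1"}, "value": [1700000000, "0.75"]},
        ],
    },
})

usage = parse_instant_vector(sample_body)
print(url)
print(usage)
```

To run this against a live instance, fetch `url` with an HTTP client and pass the response body to `parse_instant_vector`. Authentication and network access depend on how your instance is set up and are not covered here.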