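Prometheus is an open-source monitoring system originally developed at SoundCloud. It was the second project to graduate from the CNCF, after Kubernetes, and is widely used in container and microservice environments. Its main features are as follows:
The Prometheus ecosystem consists of multiple components that extend its functionality.
The main functions of Prometheus Server, the core component, include:
Prometheus scrapes metrics from instrumented jobs, either directly or through an intermediate Pushgateway (intended for short-lived jobs). It stores all scraped samples locally and runs rules over this data to produce aggregated series or alerts. Grafana or other tools can then be used to visualize the data.
The workflow is roughly as follows:
As a monitoring system, Prometheus covers the following layers:
Prometheus fundamentally stores all data as time series: streams of timestamped values that belong to the same metric and the same set of labeled dimensions. Besides stored time series, Prometheus may generate temporary derived time series as the result of queries.
Every time series is uniquely identified by its metric name and a set of key-value pairs (also called labels). The metric name specifies the general feature of the system being measured (for example, http_requests_total is the total number of HTTP requests received). It may contain ASCII letters and digits, as well as underscores and colons, and it must match the regular expression [a-zA-Z_:][a-zA-Z0-9_:]*.
Labels enable Prometheus's dimensional data model: for a given metric name, any combination of labels identifies one particular dimensional instance of that metric. The query language can filter and aggregate along these dimensions. Changing any label value, including adding or removing a label, creates a new time series. Label names may contain ASCII letters, digits, and underscores, and they must match the regular expression [a-zA-Z_][a-zA-Z0-9_]*. Label names beginning with __ are reserved for internal use.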
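The two naming rules above can be checked mechanically. The following is a minimal Python sketch that applies the regular expressions quoted in the text; the helper names (is_valid_metric_name, is_valid_label_name, is_reserved_label_name) are made up for illustration and are not part of any Prometheus library.

```python
import re

# Patterns quoted in the text above.
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")
LABEL_NAME_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def is_valid_metric_name(name: str) -> bool:
    """Return True if the whole string matches the metric name pattern."""
    return METRIC_NAME_RE.fullmatch(name) is not None


def is_valid_label_name(name: str) -> bool:
    """Return True if the whole string matches the label name pattern."""
    return LABEL_NAME_RE.fullmatch(name) is not None


def is_reserved_label_name(name: str) -> bool:
    """Label names starting with a double underscore are reserved for internal use."""
    return name.startswith("__")


if __name__ == "__main__":
    for candidate in ("http_requests_total", "job:request_rate:sum",
                      "2xx_responses", "http-requests"):
        print(candidate, is_valid_metric_name(candidate))
```

Note that colons are accepted in metric names but not in label names, which is the only difference between the two patterns.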
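Each sample in a time series consists of a float64 value and a millisecond-precision timestamp.
Given a metric name and a set of labels, a time series is usually identified with the following notation: <metric name>{<label name>=<label value>, ...}
For example, a time series with the metric name api_http_requests_total and the labels method="POST" and handler="/messages" is written as: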
api_http_requests_total{method="POST", handler="/messages"}
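To make the notation concrete, here is a small Python sketch that renders a metric name and a label set in that form and models a sample as a float value plus a millisecond timestamp. The names Sample and series_id are hypothetical, chosen only for this illustration.

```python
from typing import Dict, NamedTuple


class Sample(NamedTuple):
    """One data point: a float value and a timestamp in milliseconds."""
    value: float
    timestamp_ms: int


def _escape(value: str) -> str:
    """Escape backslashes, double quotes, and newlines inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def series_id(metric_name: str, labels: Dict[str, str]) -> str:
    """Render <metric name>{<label name>="<label value>", ...}."""
    if not labels:
        return metric_name
    pairs = ", ".join(f'{name}="{_escape(value)}"' for name, value in labels.items())
    return f"{metric_name}{{{pairs}}}"


if __name__ == "__main__":
    post = series_id("api_http_requests_total",
                     {"method": "POST", "handler": "/messages"})
    get = series_id("api_http_requests_total",
                    {"method": "GET", "handler": "/messages"})
    print(post)
    print(get)          # a different label value means a different series
    print(Sample(value=1027.0, timestamp_ms=1573800000000))
```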
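The Prometheus client libraries offer four core metric types: Counter, Gauge, Histogram, and Summary.
A Counter is a cumulative metric whose value can only increase or be reset to zero on restart. For example, a counter can represent the number of requests served, tasks completed, or errors. Do not use a counter for a value that can decrease. For example, do not use a Counter for the number of currently running processes; use a Gauge instead.
A Gauge represents a single numerical value that can go up and down arbitrarily. Gauges are typically used for measured values such as temperature or current memory usage, but also for "counts" that can rise and fall, such as the number of running goroutines.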
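The difference between the two types can be sketched in a few lines of Python. These are toy classes written for this explanation, not the API of any particular Prometheus client library: the counter rejects negative increments and only returns to zero through an explicit reset (standing in for a process restart), while the gauge can move in both directions.

```python
class Counter:
    """Toy cumulative metric: it only goes up, or back to zero on restart."""

    def __init__(self) -> None:
        self._value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        if amount < 0:
            raise ValueError("a counter can only increase")
        self._value += amount

    def reset(self) -> None:
        """Model a process restart, which sets the counter back to zero."""
        self._value = 0.0

    @property
    def value(self) -> float:
        return self._value


class Gauge:
    """Toy metric holding a single value that may rise and fall."""

    def __init__(self) -> None:
        self._value = 0.0

    def set(self, value: float) -> None:
        self._value = value

    def inc(self, amount: float = 1.0) -> None:
        self._value += amount

    def dec(self, amount: float = 1.0) -> None:
        self._value -= amount

    @property
    def value(self) -> float:
        return self._value


if __name__ == "__main__":
    requests_total = Counter()
    requests_total.inc()
    requests_total.inc(2)
    running_processes = Gauge()
    running_processes.set(5)
    running_processes.dec(2)
    print(requests_total.value, running_processes.value)
```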
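A Histogram samples observations (for example, request durations or response sizes) and counts them in configurable buckets. It also provides the sum of all observed values. A histogram with the base metric name <basename> exposes multiple time series during a scrape: cumulative counters for the observation buckets, exposed as <basename>_bucket{le="<upper inclusive bound>"}; the total sum of all observed values, exposed as <basename>_sum; and the count of observed events, exposed as <basename>_count.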
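A rough Python sketch of the bookkeeping behind a histogram follows. It assumes the usual Prometheus exposition naming, in which a histogram named <basename> appears as cumulative <basename>_bucket{le="..."} series plus <basename>_sum and <basename>_count; the Histogram class itself is a toy written for illustration, not a client library API.

```python
import math
from typing import Dict, Sequence


class Histogram:
    """Toy histogram with cumulative buckets, a running sum, and a count."""

    def __init__(self, upper_bounds: Sequence[float]) -> None:
        # An implicit +Inf bucket always catches every observation.
        self._bounds = sorted(upper_bounds) + [math.inf]
        self._bucket_counts = [0] * len(self._bounds)
        self._sum = 0.0
        self._count = 0

    def observe(self, value: float) -> None:
        self._sum += value
        self._count += 1
        # Buckets are cumulative: every bucket whose inclusive upper bound
        # is at least the observed value is incremented.
        for index, bound in enumerate(self._bounds):
            if value <= bound:
                self._bucket_counts[index] += 1

    def series(self, basename: str) -> Dict[str, float]:
        """Return the time series this histogram would expose."""
        exposed: Dict[str, float] = {}
        for bound, count in zip(self._bounds, self._bucket_counts):
            le = "+Inf" if math.isinf(bound) else str(bound)
            exposed[f'{basename}_bucket{{le="{le}"}}'] = count
        exposed[f"{basename}_sum"] = self._sum
        exposed[f"{basename}_count"] = self._count
        return exposed


if __name__ == "__main__":
    durations = Histogram([0.1, 0.5, 1.0])
    for seconds in (0.05, 0.3, 0.7, 2.0):
        durations.observe(seconds)
    for name, value in durations.series("http_request_duration_seconds").items():
        print(name, value)
```

Because the buckets are cumulative, the +Inf bucket always equals the total count.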
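In Prometheus, an endpoint that can be scraped is called an instance, and it usually corresponds to a single process. A collection of instances with the same purpose (for example, a process replicated for scalability or reliability) is called a job.
When Prometheus scrapes a target, it automatically attaches labels to the scraped time series that identify the scraped target, namely job and instance.
If any of these labels is already present in the scraped data, the behavior depends on the honor_labels configuration option. For each instance scrape, Prometheus also stores a sample in several automatically generated time series, among them up.
The up time series is useful for monitoring instance availability.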
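The effect of honor_labels on a label conflict can be sketched as follows. This is a simplified Python model of the documented behavior (with honor_labels off, the conflicting scraped label is renamed with an exported_ prefix; with it on, the scraped value wins); the function name attach_target_labels and the sample label values are invented for this example, and edge cases such as empty label values are ignored.

```python
from typing import Dict


def attach_target_labels(scraped: Dict[str, str],
                         target: Dict[str, str],
                         honor_labels: bool = False) -> Dict[str, str]:
    """Merge the labels Prometheus attaches (job, instance) into scraped labels."""
    result = dict(scraped)
    for name, value in target.items():
        if name in scraped:
            if honor_labels:
                # Keep the value that came with the scraped data.
                continue
            # Preserve the scraped value under an exported_ prefix.
            result["exported_" + name] = scraped[name]
        result[name] = value
    return result


if __name__ == "__main__":
    scraped = {"job": "batch", "code": "200"}
    target = {"job": "kubernetes-pods", "instance": "10.0.0.5:8080"}
    print(attach_target_labels(scraped, target, honor_labels=False))
    print(attach_target_labels(scraped, target, honor_labels=True))
```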
[root@k8smaster01 study]# vi monitor-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
[root@k8smaster01 study]# kubectl create -f monitor-namespace.yaml
[root@k8smaster01 study]# git clone https://github.com/prometheus/prometheus
[root@k8smaster01 ~]# cd prometheus/documentation/examples/
[root@k8smaster01 examples]# vi rbac-setup.yml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  - networking.k8s.io
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitoring # change the namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: monitoring # change the namespace
[root@k8smaster01 examples]# kubectl create -f rbac-setup.yml
[root@k8smaster01 examples]# cat prometheus-kubernetes.yml | grep -v ^$ | grep -v "#" >> prometheus-config.yaml
[root@k8smaster01 examples]# vi prometheus-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf
  labels:
    name: prometheus-server-conf
  namespace: monitoring # change the namespace
data:
  prometheus.yml: |-
    global:
      scrape_interval: 10s
      evaluation_interval: 10s

    scrape_configs:
    - job_name: 'kubernetes-apiservers'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https

    - job_name: 'kubernetes-nodes'
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics

    - job_name: 'kubernetes-cadvisor'
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name

    - job_name: 'kubernetes-services'
      metrics_path: /probe
      params:
        module: [http_2xx]
      kubernetes_sd_configs:
      - role: service
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
        action: keep
        regex: true
      - source_labels: [__address__]
        target_label: __param_target
      - target_label: __address__
        replacement: blackbox-exporter.example.com:9115
      - source_labels: [__param_target]
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        target_label: kubernetes_name

    - job_name: 'kubernetes-ingresses'
      kubernetes_sd_configs:
      - role: ingress
      relabel_configs:
      - source_labels: [__meta_kubernetes_ingress_annotation_prometheus_io_probe]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_ingress_scheme,__address__,__meta_kubernetes_ingress_path]
        regex: (.+);(.+);(.+)
        replacement: ${1}://${2}${3}
        target_label: __param_target
      - target_label: __address__
        replacement: blackbox-exporter.example.com:9115
      - source_labels: [__param_target]
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_ingress_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_ingress_name]
        target_label: kubernetes_name

    - job_name: 'kubernetes-pods'
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
[root@k8smaster01 examples]# kubectl create -f prometheus-config.yaml
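The relabel_configs rules in this ConfigMap are easier to follow with a small simulation. The Python sketch below models three of the actions used above (keep, replace, and labelmap) under simplifying assumptions: source label values are joined with ";", the regex must match the whole joined string, and $1 or ${1} in the replacement refers to a capture group. It uses Python's re module rather than the RE2 engine Prometheus uses, the function names are invented for this illustration, and the target labels in the usage example are made-up values.

```python
import re
from typing import Dict, List


def _expand(match: "re.Match", template: str) -> str:
    """Expand $1 / ${1} style group references in a replacement template."""
    return re.sub(r"\$\{?(\d+)\}?",
                  lambda ref: match.group(int(ref.group(1))) or "",
                  template)


def _joined(labels: Dict[str, str], source_labels: List[str], separator: str) -> str:
    """Concatenate the source label values; missing labels count as empty."""
    return separator.join(labels.get(name, "") for name in source_labels)


def relabel_keep(labels: Dict[str, str], source_labels: List[str],
                 regex: str, separator: str = ";") -> bool:
    """action: keep. The target survives only if the regex matches."""
    return re.fullmatch(regex, _joined(labels, source_labels, separator)) is not None


def relabel_replace(labels: Dict[str, str], source_labels: List[str], regex: str,
                    target_label: str, replacement: str = "$1",
                    separator: str = ";") -> Dict[str, str]:
    """action: replace. Write the expanded replacement into target_label on a match."""
    result = dict(labels)
    match = re.fullmatch(regex, _joined(labels, source_labels, separator))
    if match is not None:
        result[target_label] = _expand(match, replacement)
    return result


def relabel_labelmap(labels: Dict[str, str], regex: str,
                     replacement: str = "$1") -> Dict[str, str]:
    """action: labelmap. Copy every label whose name matches to a new name."""
    result = dict(labels)
    for name, value in labels.items():
        match = re.fullmatch(regex, name)
        if match is not None:
            result[_expand(match, replacement)] = value
    return result


if __name__ == "__main__":
    pod = {
        "__address__": "10.244.1.7:8080",
        "__meta_kubernetes_pod_annotation_prometheus_io_scrape": "true",
        "__meta_kubernetes_pod_annotation_prometheus_io_port": "9102",
        "__meta_kubernetes_pod_label_app": "demo",
    }
    if relabel_keep(pod, ["__meta_kubernetes_pod_annotation_prometheus_io_scrape"], "true"):
        pod = relabel_replace(
            pod,
            ["__address__", "__meta_kubernetes_pod_annotation_prometheus_io_port"],
            r"([^:]+)(?::\d+)?;(\d+)", "__address__", "$1:$2")
        pod = relabel_labelmap(pod, r"__meta_kubernetes_pod_label_(.+)")
        print(pod["__address__"], pod["app"])
```

In this model a pod annotated with prometheus.io/scrape: "true" and prometheus.io/port: "9102" is kept and scraped on the annotated port, while a pod without the port annotation keeps its discovered address because the replace regex does not match.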
[root@k8smaster01 examples]# vi prometheus-deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: prometheus-deployment
  name: prometheus-server
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-server
  template:
    metadata:
      labels:
        app: prometheus-server
    spec:
      containers:
      - name: prometheus-server
        image: prom/prometheus:v2.14.0
        command:
        - "/bin/prometheus"
        args:
        - "--config.file=/etc/prometheus/prometheus.yml"
        - "--storage.tsdb.path=/prometheus/"
        - "--storage.tsdb.retention=72h"
        ports:
        - containerPort: 9090
          protocol: TCP
        volumeMounts:
        - name: prometheus-config-volume
          mountPath: /etc/prometheus/
        - name: prometheus-storage-volume
          mountPath: /prometheus/
      serviceAccountName: prometheus
      imagePullSecrets:
      - name: regsecret
      volumes:
      - name: prometheus-config-volume
        configMap:
          defaultMode: 420
          name: prometheus-server-conf
      - name: prometheus-storage-volume
        emptyDir: {}
[root@k8smaster01 examples]# kubectl create -f prometheus-deployment.yml
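The value defaultMode: 420 in the configMap volume looks odd until it is read as a decimal number: 420 in decimal is 0644 in octal, the usual read-write for the owner and read-only for everyone else. A quick Python check, with describe_mode being a helper name invented for this note:

```python
import stat


def describe_mode(mode: int) -> str:
    """Return the octal form and the ls-style permission string for a file mode."""
    return f"{oct(mode)} {stat.filemode(stat.S_IFREG | mode)}"


if __name__ == "__main__":
    print(describe_mode(420))   # 0o644 -rw-r--r--
```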
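Note: if Prometheus needs persistent storage, create the corresponding StorageClass and PVC in advance. For the StorageClass, see "044. Cluster Storage - StorageClass"; the PVC can be defined as follows: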
[root@k8smaster01 examples]# vi prometheus-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-pvc
  namespace: monitoring
  annotations:
    volume.beta.kubernetes.io/storage-class: ghstorageclass
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
[root@k8smaster01 examples]# kubectl create -f prometheus-pvc.yaml
Change the storage section of prometheus-deployment.yml to:
...
      - name: prometheus-storage-volume
        persistentVolumeClaim:
          claimName: prometheus-pvc
...
[root@k8smaster01 examples]# vi prometheus-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: prometheus-service
  name: prometheus-service
  namespace: monitoring
spec:
  type: NodePort
  selector:
    app: prometheus-server
  ports:
  - port: 9090
    targetPort: 9090
    nodePort: 30909
[root@k8smaster01 examples]# kubectl create -f prometheus-service.yaml
[root@k8smaster01 examples]# kubectl get all -n monitoring
NAME                                    READY   STATUS    RESTARTS   AGE
pod/prometheus-server-fd5479489-q584s   1/1     Running   0          92s

NAME                         TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
service/prometheus-service   NodePort   10.107.69.147   <none>        9090:30909/TCP   29s

NAME                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/prometheus-server   1/1     1            1           92s

NAME                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/prometheus-server-fd5479489   1         1         1       92s
Open the following address directly in a browser: http://172.24.8.71:30909/
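The same NodePort address also serves the Prometheus HTTP API, so the deployment can be checked from a script as well as from the browser. The Python sketch below builds an instant-query URL for the /api/v1/query endpoint and, optionally, fetches it. The function names are invented for this example, and instant_query assumes the address above is reachable from wherever the script runs, which is why the demo only prints the URL.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url: str, promql: str) -> str:
    """Build the URL of an instant query against the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def instant_query(base_url: str, promql: str, timeout: float = 5.0) -> dict:
    """Run the query and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only the URL is printed here; call instant_query() against a live server.
    print(build_query_url("http://172.24.8.71:30909/", "up"))
```

Querying up this way returns one sample per scraped instance, which gives a quick view of which targets Prometheus currently reaches.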
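Check that all endpoints in the Kubernetes cluster have been found through service discovery and are connected to Prometheus automatically.
Use the graph interface to view memory usage.
For more Prometheus configuration options, see the official documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/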
[root@uhost ~]# git clone https://github.com/liukuan73/kubernetes-addons
[root@uhost ~]# cd /root/kubernetes-addons/monitor/prometheus+grafana
[root@k8smaster01 prometheus+grafana]# vi grafana.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitoring
  labels:
    app: grafana
spec:
  type: NodePort
  ports:
  - port: 3000
    targetPort: 3000
    nodePort: 30007
  selector:
    app: grafana
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  replicas: 1
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
      - name: grafana
        image: grafana/grafana:5.0.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3000
        volumeMounts:
        - mountPath: /var
          name: grafana-storage
        env:
        - name: GF_AUTH_BASIC_ENABLED
          value: "false"
        - name: GF_AUTH_ANONYMOUS_ENABLED
          value: "true"
        - name: GF_AUTH_ANONYMOUS_ORG_ROLE
          value: Admin
        - name: GF_SERVER_ROOT_URL
          # value: /api/v1/proxy/namespaces/default/services/grafana/
          value: /
        readinessProbe:
          httpGet:
            path: /login
            port: 3000
      volumes:
      - name: grafana-storage
        emptyDir: {}
      nodeSelector:
        node-role.kubernetes.io/master: "true"
      # tolerations:
      # - key: "node-role.kubernetes.io/master"
      #   effect: "NoSchedule"
[root@k8smaster01 prometheus+grafana]# kubectl label nodes k8smaster01 node-role.kubernetes.io/master=true
[root@k8smaster01 prometheus+grafana]# kubectl label nodes k8smaster02 node-role.kubernetes.io/master=true
[root@k8smaster01 prometheus+grafana]# kubectl label nodes k8smaster03 node-role.kubernetes.io/master=true
[root@k8smaster01 prometheus+grafana]# kubectl taint nodes --all node-role.kubernetes.io/master- # allow workloads to be scheduled on the master nodes
[root@k8smaster01 prometheus+grafana]# kubectl create -f grafana.yaml
[root@k8smaster01 examples]# kubectl get all -n monitoring
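Open http://172.24.8.71:30007 in a browser and log in with the default credentials admin/admin.
Go to Configuration ----> Data Sources.
Add a new data source.
Add Prometheus as the data source. This environment is the highly available Kubernetes cluster deployed in "Appendix 012. Deploying Highly Available Kubernetes with Kubeadm", which provides the VIP 172.24.8.100; the Prometheus address tested in step 3.7 can also be used.
Save the data source and test that the connection succeeds.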
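The data source can also be registered without the UI through Grafana's HTTP API (POST /api/datasources). The Python sketch below only prepares such a request. It rests on several assumptions: the function name build_datasource_request and the data source name are invented here, the Prometheus URL is the NodePort address tested earlier, and whether the request is accepted depends on the authentication settings of the Grafana instance, so sending it is left to the reader.

```python
import json
import urllib.request


def build_datasource_request(grafana_url: str, prometheus_url: str,
                             name: str = "Prometheus") -> urllib.request.Request:
    """Prepare a POST request that registers Prometheus as a Grafana data source."""
    payload = {
        "name": name,             # display name in Grafana (assumed)
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",        # Grafana's backend talks to Prometheus
        "isDefault": True,
    }
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_datasource_request("http://172.24.8.71:30007",
                                       "http://172.24.8.71:30909")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(request, timeout=5)
```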
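Configure a dashboard. This walkthrough uses dashboard template 162, which displays monitoring information for the Kubernetes cluster.
Select the Prometheus data source added in step 4.4 as the source for the dashboard.
Regular users can be added and assigned appropriate roles.
Copy the invite link: http://172.24.8.71:30007/invite/hlhkzz5O3dJj94OlHcKiqN8bPrZt40
Open the link, set a password for the new user, and log in:
Setting the time zone is recommended. For more Grafana configuration options, see: https://grafana.com/docs/grafana/latest/installation/configuration/
Log in at http://172.24.8.71:30007/ to view the Kubernetes monitoring dashboards.
This walkthrough is based on the following references: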
https://www.kubernetes.org.cn/4184.html
https://www.kubernetes.org.cn/3418.html
https://www.jianshu.com/p/c2e549480c50