前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >Prometheus监控神器-Kubernetes篇(三)

Prometheus监控神器-Kubernetes篇(三)

作者头像
没有故事的陈师傅
发布2021-11-12 15:15:57
4820
发布2021-11-12 15:15:57
举报
文章被收录于专栏:运维开发故事运维开发故事

在Kubernetes中手动方式部署Prometheus联邦.

monitor-prom

当我们有多个Kubernetes集群的时候,这个时候就需要需要指标汇总的需求了,如上图一样,我们假定在外部部署一个Prometheus的Federate,然后去采集当前k8s中的kube-system与default俩个 namespace。

环境

我的本地环境使用的 sealos 一键部署,主要是为了便于测试。

OS

Kubernetes

HostName

IP

Service

Ubuntu 18.04

1.17.7

sealos-k8s-m1

192.168.1.151

node-exporter prometheus-federate-0

Ubuntu 18.04

1.17.7

sealos-k8s-m2

192.168.1.152

node-exporter grafana alertmanager-0

Ubuntu 18.04

1.17.7

sealos-k8s-m3

192.168.1.150

node-exporter alertmanager-1

Ubuntu 18.04

1.17.7

sealos-k8s-node1

192.168.1.153

node-exporter prometheus-0 kube-state-metrics

Ubuntu 18.04

1.17.7

sealos-k8s-node2

192.168.1.154

node-exporter prometheus-1

Ubuntu 18.04

1.17.7

sealos-k8s-node2

192.168.1.155

node-exporter prometheus-2

部署 Prometheus联邦集群

创建prometheus-federate数据目录

代码语言:javascript
复制
# 在m1上执行
mkdir /data/prometheus-federate/
chown -R 65534:65534 /data/prometheus-federate/

创建Prometheus联邦 StorageClass 配置文件

代码语言:javascript
复制
cd /data/manual-deploy/prometheus/
cat prometheus-federate-storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: prometheus-federate-lpv
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

创建Prometheus联邦pv配置文件

代码语言:javascript
复制
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-federate-lpv-0
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: prometheus-federate-lpv
  local:
    path: /data/prometheus-federate
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - sealos-k8s-m1

创建Prometheus联邦configmap配置文件

代码语言:javascript
复制
cat prometheus-federate-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-federate-config
  namespace: kube-system
data:
  alertmanager_rules.yaml: |
    groups:
    - name: example
      rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: page
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minutes."
      - alert: NodeMemoryUsage
        expr: (node_memory_MemTotal_bytes -(node_memory_MemFree_bytes+node_memory_Buffers_bytes+node_memory_Cached_bytes )) / node_memory_MemTotal_bytes * 100 > 80
        for: 1m
        labels:
          team: ops
        annotations:
          summary: "cluster:{{ $labels.cluster }} {{ $labels.instance }}: High Memory usage detected"
          description: "{{ $labels.instance }}: Memory usage is above 55% (current value is: {{ $value }}"
  prometheus.yml: |
    global:
      scrape_interval:     30s
      evaluation_interval: 30s
    alerting:
      alertmanagers:
      - static_configs:
        - targets:
            - alertmanager-0.alertmanager-operated:9093
            - alertmanager-1.alertmanager-operated:9093      
    rule_files:
      - "/etc/prometheus/alertmanager_rules.yaml"
    scrape_configs:
      - job_name: 'federate'
        scrape_interval: 30s
        honor_labels: true
        metrics_path: '/federate'
        params:
          'match[]':
            - '{job=~"kubernetes.*"}'
            - '{job="prometheus"}'
        static_configs:
          - targets:
            - 'prometheus-0.prometheus:9090'
            - 'prometheus-1.prometheus:9090'
            - 'prometheus-2.prometheus:9090'

创建Prometheus联邦的statefulse文件

代码语言:javascript
复制
cat prometheus-federate-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus-federate
  namespace: kube-system
  labels:
    k8s-app: prometheus-federate
    kubernetes.io/cluster-service: "true"
spec:
  serviceName: "prometheus-federate"
  podManagementPolicy: "Parallel"
  replicas: 1
  selector:
    matchLabels:
      k8s-app: prometheus-federate
  template:
    metadata:
      labels:
        k8s-app: prometheus-federate
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: k8s-app
                operator: In
                values:
                - prometheus-federate
            topologyKey: "kubernetes.io/hostname"
      priorityClassName: system-cluster-critical
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: prometheus-federate-configmap-reload
        image: "jimmidyson/configmap-reload:v0.4.0"
        imagePullPolicy: "IfNotPresent"
        args:
          - --volume-dir=/etc/config
          - --webhook-url=http://localhost:9091/-/reload
        volumeMounts:
          - name: config-volume
            mountPath: /etc/config
            readOnly: true
        resources:
          limits:
            cpu: 10m
            memory: 10Mi
          requests:
            cpu: 10m
            memory: 10Mi
        securityContext:
            runAsUser: 0
            privileged: true
      - image: prom/prometheus:v2.20.0
        imagePullPolicy: IfNotPresent
        name: prometheus
        command:
          - "/bin/prometheus"
        args:
          - "--web.listen-address=0.0.0.0:9091"
          - "--config.file=/etc/prometheus/prometheus.yml"
          - "--storage.tsdb.path=/prometheus"
          - "--storage.tsdb.retention=24h"
          - "--web.console.libraries=/etc/prometheus/console_libraries"
          - "--web.console.templates=/etc/prometheus/consoles"
          - "--web.enable-lifecycle"
        ports:
          - containerPort: 9091
            protocol: TCP
        volumeMounts:
          - mountPath: "/prometheus"
            name: prometheus-federate-data
          - mountPath: "/etc/prometheus"
            name: config-volume
        readinessProbe:
          httpGet:
            path: /-/ready
            port: 9091
          initialDelaySeconds: 30
          timeoutSeconds: 30
        livenessProbe:
          httpGet:
            path: /-/healthy
            port: 9091
          initialDelaySeconds: 30
          timeoutSeconds: 30
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 1000m
            memory: 2500Mi
        securityContext:
            runAsUser: 0
            privileged: true
      serviceAccountName: prometheus
      volumes:
        - name: config-volume
          configMap:
            name: prometheus-federate-config
  volumeClaimTemplates:
    - metadata:
        name: prometheus-federate-data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: "prometheus-federate-lpv"
        resources:
          requests:
            storage: 5Gi       

创建Prometheus联邦的svc文件

代码语言:javascript
复制
cat prometheus-service-statefulset.yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: kube-system
spec:
  ports:
    - name: prometheus
      port: 9090
      targetPort: 9090
  selector:
    k8s-app: prometheus
  clusterIP: None

部署

代码语言:javascript
复制
cd /data/manual-deploy/prometheus/
prometheus-federate-configmap.yaml
prometheus-federate-pv.yaml
prometheus-federate-service-statefulset.yaml
prometheus-federate-statefulset.yaml
prometheus-federate-storageclass.yaml
kubectl apply -f prometheus-federate-storageclass.yaml
kubectl apply -f prometheus-federate-pv.yaml
kubectl apply -f prometheus-federate-configmap.yaml
kubectl apply -f prometheus-federate-statefulset.yaml
kubectl apply -f prometheus-federate-service-statefulset.yaml

验证

代码语言:javascript
复制
# pv
kubectl -n kube-system get pvc |grep federate
prometheus-federate-data-prometheus-federate-0   Bound    prometheus-federate-lpv-0   10Gi RWO   prometheus-federate-lpv 4h
kubectl -n kube-system get pod |grep federate
prometheus-federate-0                      2/2     Running   0          2d4h

对此,联邦的配置就完成了,可以在浏览器中访问192.168.1.151:9091查看相应的targets信息,以及配置的rules规则,触发下警报,看看Alertmanager集群已经部署成功了。

k8s-federate

本文参与 腾讯云自媒体分享计划,分享自微信公众号。
原始发表:2021-11-08,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 运维开发故事 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体分享计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 环境
  • 部署 Prometheus联邦集群
相关产品与服务
容器服务
腾讯云容器服务(Tencent Kubernetes Engine, TKE)基于原生 kubernetes 提供以容器为核心的、高度可扩展的高性能容器管理服务,覆盖 Serverless、边缘计算、分布式云等多种业务部署场景,业内首创单个集群兼容多种计算节点的容器资源管理模式。同时产品作为云原生 Finops 领先布道者,主导开源项目Crane,全面助力客户实现资源优化、成本控制。
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档