Prometheus in Practice

LA0WAN9

Published 2021-12-14 09:04:03

Collected in the column: 火丁笔记

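Recently the project I work on started migrating from PHP and Lua to Golang, and I figured this was a good opportunity to put monitoring on a solid footing. When it comes to monitoring in Golang, Prometheus is pretty much the standard choice, and integrating it into a Go program is very simple: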

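package main

import (
        "log"
        "net/http"

        "github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
        // Expose the default registry (Go runtime and process metrics) at /metrics.
        http.Handle("/metrics", promhttp.Handler())
        // ListenAndServe only returns on failure, so surface the error instead of dropping it.
        log.Fatal(http.ListenAndServe(":6060", nil))
}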
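To make it less of a black box, here is a rough sketch of the kind of plain-text payload a /metrics endpoint returns when Prometheus scrapes it. It is a hand-rolled stand-in using only the standard library, not how client_golang is implemented; the metric name myapp_requests_total and its help text are made up for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
)

// requestsTotal is a hypothetical counter; the name used below is illustrative only.
var requestsTotal int64

// renderCounter formats a single counter in the Prometheus text exposition format:
// a HELP line, a TYPE line, then one sample line.
func renderCounter(name, help string, value int64) string {
	return fmt.Sprintf("# HELP %s %s\n# TYPE %s counter\n%s %d\n", name, help, name, name, value)
}

// metricsHandler serves the counter the way a scrape expects it: plain text over HTTP.
func metricsHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprint(w, renderCounter("myapp_requests_total", "Total requests handled.",
		atomic.LoadInt64(&requestsTotal)))
}

func main() {
	// Pretend three requests were handled, then print what a scrape would see.
	atomic.AddInt64(&requestsTotal, 3)
	fmt.Print(renderCounter("myapp_requests_total", "Total requests handled.",
		atomic.LoadInt64(&requestsTotal)))
}
```

In a real service you would keep using promhttp.Handler() as above; the sketch only shows that a scrape is an ordinary HTTP GET returning text.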

To run Prometheus in a local Docker, a single docker run command is enough:

shell> docker run -p 9090:9090 prom/prometheus

Once it is running, open http://localhost:9090/targets in a browser. The page lists the scrape targets and their status; by default, Prometheus only monitors itself.

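Deploying in a local Docker is easy, but deploying on Kubernetes is a different story. On top of that, the official documentation is sketchy, so much so that I nearly gave up several times. Fortunately I stuck with it, and this post records the pitfalls I hit during deployment and how I got around them.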

On configuration

Prometheus reads its configuration from /etc/prometheus/prometheus.yml by default. According to the official documentation, changing the configuration requires a custom Dockerfile:

FROM prom/prometheus
ADD prometheus.yml /etc/prometheus/

and then building a new image:

shell> docker build -t my-prometheus .
shell> docker run -p 9090:9090 my-prometheus

That is rather tedious, I have to say. With a Kubernetes ConfigMap there is no need to rebuild the image at all:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus
  namespace: default  # must be the same namespace as the StatefulSet that mounts this ConfigMap
data:
  prometheus.yml: |
    # my global config
    global:
      scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
      evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
      # scrape_timeout is set to the global default (10s).

    # Alertmanager configuration
    alerting:
      alertmanagers:
      - static_configs:
        - targets:
          # - alertmanager:9093

    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      # - "first_rules.yml"
      # - "second_rules.yml"

    # A scrape configuration containing exactly one endpoint to scrape:
    # Here it's Prometheus itself.
    scrape_configs:
      # The job name is added as a label `job=` to any timeseries scraped from this config.
      - job_name: 'prometheus'

        # metrics_path defaults to '/metrics'
        # scheme defaults to 'http'.

        static_configs:
        - targets: ['localhost:9090']

How to mount the ConfigMap into the container is shown in the StatefulSet configuration later in this post.

On service discovery

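Traditional monitoring software tends to use a push model, whereas Prometheus uses a pull model. The upside is a simple architecture: the monitored nodes do not need an agent process deployed on them. The downside is that Prometheus has to know every node it is supposed to monitor. The default configuration (/etc/prometheus/prometheus.yml), for example, uses static_configs to scrape the Prometheus server itself: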

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets: ['localhost:9090']
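The pull model described above can be sketched in a few lines of Go. This is only an illustration under simplifying assumptions, not Prometheus's actual scrape loop: newFakeTarget is a hypothetical local test server standing in for a monitored service, and the single sample "up 1" is made up.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// scrape performs one pull: an HTTP GET against the target's /metrics path.
func scrape(client *http.Client, baseURL string) (string, error) {
	resp, err := client.Get(baseURL + "/metrics")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("scrape %s: unexpected status %s", baseURL, resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

// newFakeTarget starts a local test server that plays the role of a monitored service.
func newFakeTarget() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "up 1\n")
	})
	return httptest.NewServer(mux)
}

func main() {
	target := newFakeTarget()
	defer target.Close()

	client := &http.Client{Timeout: 10 * time.Second}
	// The collector has to know every target address up front;
	// the targets themselves never call in.
	for _, addr := range []string{target.URL} {
		body, err := scrape(client, addr)
		if err != nil {
			fmt.Println("scrape failed:", err)
			continue
		}
		fmt.Print(body)
	}
}
```

The hard-coded slice of addresses in main plays the role of static_configs, which is exactly the part that becomes a burden once targets come and go.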

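If the set of monitored nodes is fairly fixed, hard-coding them with static_configs does no harm. In Kubernetes, however, the number of containers to monitor for each service can change at any moment, and so can their addresses, so hard-coding them with static_configs is asking for trouble. A more sensible approach is to hook into service discovery with kubernetes_sd_config so that targets are configured automatically. The prerequisite is that RBAC has to be set up in Kubernetes first: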

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/metrics
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  - networking.k8s.io
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics", "/metrics/cadvisor"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default

My Kubernetes account does not have enough privileges, though, so applying the RBAC configuration fails:

clusterrolebindings.rbac.authorization.k8s.io "prometheus" is forbidden: User "…" cannot patch resource "clusterrolebindings" in API group "rbac.authorization.k8s.io" at the cluster scope

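Asking the Kubernetes administrator to set it up would solve the problem, but then I would inevitably depend on someone else whenever a related issue comes up, and I would rather be able to solve it myself. Since kubernetes_sd_config is off the table, another service discovery mechanism is worth considering, for example dns_sd_config, which discovers targets dynamically through DNS resolution: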

scrape_configs:
  - job_name: 'foo'
    dns_sd_configs:
    - names: ['foo.default.svc.cluster.local']
      type: A
      port: 6060

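Of course, the service to be monitored also has to be set up as a Headless Service in Kubernetes, so that its internal domain name resolves to the individual pods rather than to a single virtual IP (a real manifest also needs a selector matching the pods):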

apiVersion: v1
kind: Service
metadata:
  name: foo
  namespace: default
spec:
  clusterIP: None

Note: if Headless Services and internal domain names are unfamiliar, see my earlier post 「手把手教你用ETCD」 (a hands-on etcd tutorial), which covers them in detail.
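What DNS-based discovery boils down to can be sketched as follows: resolve the A records behind a name and attach the configured port to each address. This is a simplified illustration, not Prometheus's own code; the lookup function is injected so the example runs without a cluster, and the IP addresses are made up.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// lookupFunc resolves a DNS name to IP addresses; net.LookupHost has this shape.
type lookupFunc func(name string) ([]string, error)

// discoverTargets turns the addresses behind name into host:port scrape targets.
func discoverTargets(lookup lookupFunc, name string, port int) ([]string, error) {
	addrs, err := lookup(name)
	if err != nil {
		return nil, fmt.Errorf("resolve %s: %w", name, err)
	}
	targets := make([]string, 0, len(addrs))
	for _, a := range addrs {
		targets = append(targets, net.JoinHostPort(a, strconv.Itoa(port)))
	}
	return targets, nil
}

func main() {
	// fakeLookup stands in for cluster DNS; inside a cluster you could pass net.LookupHost.
	fakeLookup := func(string) ([]string, error) {
		return []string{"10.0.0.11", "10.0.0.12"}, nil
	}
	targets, err := discoverTargets(fakeLookup, "foo.default.svc.cluster.local", 6060)
	if err != nil {
		fmt.Println("discovery failed:", err)
		return
	}
	fmt.Println(targets)
}
```

Because a Headless Service returns one A record per pod, repeating the lookup periodically is enough to follow pods as they come and go.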

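As an aside: suppose you do use kubernetes_sd_config for service discovery and the cluster has a great many nodes, but you only want to monitor some of them. How do you configure that? The answer is to mark the nodes you want to monitor with Kubernetes annotations and then relabel in Prometheus; there is a detailed discussion on Stack Overflow.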
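The filtering idea can be modelled in Go as a rough sketch of a relabel rule with the keep action: a target survives only if the chosen label fully matches a regular expression. This is not Prometheus's implementation. The annotation name prometheus.io/scrape is a common convention that I am assuming here, not something the setup above prescribes, and the addresses are invented.

```go
package main

import (
	"fmt"
	"regexp"
)

// scrapeAnnotationLabel is the meta label derived from a pod annotation named
// prometheus.io/scrape (an assumed, conventional annotation name).
const scrapeAnnotationLabel = "__meta_kubernetes_pod_annotation_prometheus_io_scrape"

// keep mimics a relabel rule with action "keep": a target is retained only if
// the value of sourceLabel fully matches pattern.
func keep(targets []map[string]string, sourceLabel, pattern string) ([]map[string]string, error) {
	// Anchor the pattern so that partial matches do not count.
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	if err != nil {
		return nil, err
	}
	var kept []map[string]string
	for _, labels := range targets {
		if re.MatchString(labels[sourceLabel]) {
			kept = append(kept, labels)
		}
	}
	return kept, nil
}

func main() {
	targets := []map[string]string{
		{"__address__": "10.0.0.11:6060", scrapeAnnotationLabel: "true"},
		{"__address__": "10.0.0.12:6060"}, // not annotated, so it is dropped
	}
	kept, err := keep(targets, scrapeAnnotationLabel, "true")
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	for _, t := range kept {
		fmt.Println(t["__address__"])
	}
}
```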

On data persistence

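Generally speaking, what we deploy on Kubernetes are stateless services. Prometheus, however, should be a stateful service (StatefulSet): data persistence has to be taken care of, otherwise the monitoring history is lost on every restart, which is certainly not what we want: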

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
  namespace: default
  labels:
    app: prometheus
spec:
  serviceName: prometheus
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      dnsPolicy: ClusterFirst
      containers:
      - name: prometheus
        image: prom/prometheus:v2.22.0
        imagePullPolicy: Always
        ports:
        - name: http
          containerPort: 9090
          protocol: TCP
        volumeMounts:
        - name: datadir
          mountPath: /prometheus
        - name: configfile
          mountPath: "/etc/prometheus/prometheus.yml"
          subPath: prometheus.yml
        resources:
          limits:
            cpu: "1"
            memory: 512Mi
          requests:
            cpu: "1"
            memory: 512Mi
      volumes:
      - name: configfile
        configMap:
          name: prometheus
      # securityContext:
      #   runAsNonRoot: true
      #   runAsUser: 65534
      #   runAsGroup: 65534
      #   fsGroup: 65534
  volumeClaimTemplates:
  - metadata:
      name: datadir
    spec:
      storageClassName: ...
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi

When I applied the StatefulSet, however, Kubernetes reported an error:

pod has unbound immediate PersistentVolumeClaims

The message gives no direct hint as to what is wrong, but fortunately there are logs to check:

shell> kubectl logs prometheus-0

which revealed the real cause:

open /prometheus/queries.active: permission denied

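Searching GitHub with this message as the keyword turns up a related issue, which confirms that it is a permission problem. Several fixes can be found online; GoogleCloudPlatform, for instance, solves it by running chmod in initContainers. A better way is to use securityContext to run as a non-root user: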

securityContext:
  runAsNonRoot: true
  runAsUser: 65534
  runAsGroup: 65534
  fsGroup: 65534

As for why 65534: run the Docker container locally and check which account it uses:

shell> docker exec $(docker ps -qf ancestor=prom/prometheus) id
uid=65534(nobody) gid=65534(nogroup)

Other issues

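I almost forgot Grafana. How can one talk about Prometheus and leave out Grafana! Grafana supports Prometheus very well and is easy to use: just follow the official documentation, there is nothing more to say. What I do want to talk about is the choice of dashboard. The most popular one right now is Go Metrics (10826), and most of the time it is also the best. It has one drawback, though: it filters by Kubernetes namespace / pod. If you are not using Kubernetes-based service discovery (this post uses DNS-based discovery), the filters stop working. For that reason I made a modified version of Go Metrics (13240) that filters by job / instance. It looks like this: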