Kubernetes监控方案:heapster+influxdb+grafana详解

1. 工具介绍

1.1 heapster

Heapster是容器集群监控和性能分析工具,天然的支持Kubernetes和CoreOS。

Kubernetes有个出名的监控agent—cAdvisor。在每个kubernetes Node上都会运行cAdvisor,它会收集本机以及容器的监控数据(cpu,memory,filesystem,network,uptime)。在较新的版本中,K8S已经将cAdvisor功能集成到kubelet组件中。每个Node节点可以直接进行web访问。

Heapster是一个收集者,Heapster可以收集Node节点上的cAdvisor数据,将每个Node上的cAdvisor的数据进行汇总,还可以按照kubernetes的资源类型来集合资源,比如Pod、Namespace,可以分别获取它们的CPU、内存、网络和磁盘的metric。默认的metric数据聚合时间间隔是1分钟。还可以把数据导入到第三方工具(如InfluxDB)。

Kubernetes原生dashboard的监控图表信息来自heapster。在Horizontal Pod Autoscaling中也用到了Heapster,HPA将Heapster作为Resource Metrics API,向其获取metric。

架构图:

框架图:

Heapster首先从apiserver获取集群中所有Node的信息,然后通过这些Node上的kubelet获取有用数据,而kubelet本身的数据则是从cAdvisor得到。所有获取到的数据都被推到Heapster配置的后端存储中,并还支持数据的可视化。现在后端存储 + 可视化的方法,如InfluxDB + grafana。

说明:本实验采用influxdb作为heapster后端

1.2 influxdb、grafana

influxdb和grafana的介绍请见这篇文章

2.部署

2.1 获取最新版(v1.5.2)heapster+influxdb+grafana安装yaml文件

heapster release 页面下载最新版本的 heapster:

wget https://github.com/kubernetes/heapster/archive/v1.5.2.zip
unzip v1.5.2.zip
cd heapster-1.5.2/deploy/kube-config/influxdb

2.2 修改配置文件

我修改好的文件已上传到github

2.2.1 heapster配置

修改heapster.yml配置文件:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: heapster
  namespace: kube-system
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: heapster
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: heapster
    spec:
      serviceAccountName: heapster
      containers:
      - name: heapster
        image: daocloud.io/liukuan73/heapster-amd64:v1.5.2
        imagePullPolicy: IfNotPresent
        command:
        - /heapster
        - --source=kubernetes:https://kubernetes.default
        - --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
      nodeSelector:
        node-role.kubernetes.io/master: true
      tolerations:
      - key: "node-role.kubernetes.io/master"
        effect: "NoSchedule"
---
apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: Heapster
  name: heapster
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 8082
    nodePort: 30003
  selector:
    k8s-app: heapster

部署好后可以打开kubernetes-dashboard看效果,看是否显示各 Nodes、Pods 的 CPU、内存、负载等利用率曲线图;

2.2.2 influxdb配置

修改influxdb.yaml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-influxdb
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: influxdb
    spec:
      containers:
      - name: influxdb
        image: daocloud.io/liukuan73/heapster-influxdb-amd64:v1.3.3
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - mountPath: /data
          name: influxdb-storage
      volumes:
      - name: influxdb-storage
        emptyDir: {}
      nodeSelector:
        node-role.kubernetes.io/master: true
      tolerations:
      - key: "node-role.kubernetes.io/master"
        effect: "NoSchedule"
---
apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    #kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-influxdb
  name: monitoring-influxdb
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 8086
    targetPort: 8086
    nodePort: 30004
  selector:
    k8s-app: influxdb

2.2.3 grafana配置

修改grafana.yaml配置文件:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-grafana
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: grafana
    spec:
      containers:
      - name: grafana
        image: daocloud.io/liukuan73/heapster-grafana:v4.4.3
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3000
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/ssl/certs
          name: ca-certificates
          readOnly: true
        - mountPath: /var
          name: grafana-storage
        env:
        - name: INFLUXDB_HOST
          value: monitoring-influxdb
        - name: GF_SERVER_HTTP_PORT
          value: "3000"
          # The following env variables are required to make Grafana accessible via
          # the kubernetes api-server proxy. On production clusters, we recommend
          # removing these env variables, setup auth for grafana, and expose the grafana
          # service using a LoadBalancer or a public IP.
        - name: GF_AUTH_BASIC_ENABLED
          value: "false"
        - name: GF_AUTH_ANONYMOUS_ENABLED
          value: "true"
        - name: GF_AUTH_ANONYMOUS_ORG_ROLE
          value: Admin
        - name: GF_SERVER_ROOT_URL
          # If you're only using the API Server proxy, set this value instead:
          # value: /api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
          value: /
      volumes:
      - name: ca-certificates
        hostPath:
          path: /etc/ssl/certs
      - name: grafana-storage
        emptyDir: {}
      nodeSelector:
        node-role.kubernetes.io/master: true
      tolerations:
      - key: "node-role.kubernetes.io/master"
        effect: "NoSchedule"
---
apiVersion: v1
kind: Service
metadata:
  labels:
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-grafana
  name: monitoring-grafana
  namespace: kube-system
spec:
  type: NodePort
  # In a production setup, we recommend accessing Grafana through an external Loadbalancer
  # or through a public IP.
  # type: LoadBalancer
  # You could also use NodePort to expose the service at a randomly-generated port
  # type: NodePort
  ports:
  - port: 80
    targetPort: 3000
    nodePort: 30002
  selector:
    k8s-app: grafana

注意

在安装好 Grafana 之后我们使用的是默认的 template 配置,页面上的 namespace 选择里只有 default 和 kube-system,并不是说其他的 namespace 里的指标没有得到监控,只是我们没有在 Grafana 中开启他它们的显示而已。见 Cannot see other namespaces except, kube-system and default #1279

修改grafana模板

将 Templating 中的 namespace 的 Data source 设置为 influxdb-datasource,Refresh 设置为 on Dashboard Load 保存设置,刷新浏览器,即可看到其他 namespace 选项。

配置influxdb数据源 点击“Add data source”配置数据源:

配置dashboard 1.导入“Kubernetes Node Statistics”dashabord import这个模板的序号3646就可以了:

效果:

2.导入“Kubernetes Pod Statistics”dashabord import这个模板的序号3649就可以了:

效果:

3.Heapster使用

这里主要介绍Heapster的API使用,及可获取的Metrics。 The Heapster Model is a structured representation of metrics for Kubernetes clusters, which is exposed through a set of REST API endpoints. It allows the extraction of up to 15 minutes of historical data for any Container, Pod, Node or Namespace in the cluster, as well as the cluster itself (depending on the metric).

API文档及可用Metrics在官方文档中都介绍的比较齐全。下面用几条测试命令来解释API使用:

  • 获取支持的metrics:
curl -L http://10.142.21.21:30003/api/v1/model/metrics
    [
    "cpu/limit",
    "memory/request",
    "memory/limit",
    "cpu/usage_rate",
    "memory/usage",
    "cpu/request"
    ]
  • 列出某一个Node支持的metrics
curl -L http://10.142.21.21:30003/api/v1/model/nodes/k8smaster01/metrics
  • 查看对应Pod的cpu使用率
curl -L http://10.142.21.21:30003/api/v1/model/namespaces/<namespace-name>/pods/<pod-name>/metrics/cpu-usag

参考:

1.https://jimmysong.io/kubernetes-handbook/practice/heapster-addon-installation.html 2.https://www.kubernetes.org.cn/932.html

本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。

发表于

我来说两句

0 条评论
登录 后参与评论

扫码关注云+社区

领取腾讯云代金券