wget https://raw.githubusercontent.com/zq2599/blog_demos/master/prometheusdemo/files/prometheus.yml && \
wget https://raw.githubusercontent.com/zq2599/blog_demos/master/prometheusdemo/files/docker-compose.yml && \
docker-compose up -d
wget https://raw.githubusercontent.com/zq2599/blog_demos/master/prometheusdemo/files/import_dashboard.sh && \
chmod a+x import_dashboard.sh && \
./import_dashboard.sh 192.168.1.101 xxxxxx
Let's go through these files one by one:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['127.0.0.1:9090','node-exporterhost:9100','cadvisorhost:8080']
  - job_name: 'proemtheusdemo'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    metrics_path: '/prometheus'
    static_configs:
    - targets: ['prometheusdemohost:8080']
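This is an ordinary Prometheus configuration file. Besides scraping its own port 9090, it scrapes different ports on three hosts, node-exporterhost, cadvisorhost, and prometheusdemohost, which correspond to three monitoring data sources: the host machine itself, the Docker service, and the business web service.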
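To make the scrape targets concrete, here is a minimal sketch that builds the URL Prometheus would request for a target and checks whether it answers. The function names `scrape_url` and `check_endpoint` are hypothetical helpers, not part of the original files. It also assumes you run it from a machine that can reach the host ports published in docker-compose.yml (9090, 9100, 8080, 8081), since names such as node-exporterhost only resolve inside the prometheus container.

```shell
#!/bin/bash
# Build the URL that Prometheus scrapes for one target.
# $1 = host:port, $2 = metrics path (defaults to /metrics, as in prometheus.yml)
scrape_url() {
  local target=$1
  local path=${2:-/metrics}
  echo "http://${target}${path}"
}

# Request a scrape URL and report whether it answered with HTTP 200.
check_endpoint() {
  local url=$1
  local code
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
  if [ "$code" = "200" ]; then
    echo "OK   $url"
  else
    echo "FAIL $url (HTTP $code)"
  fi
}
```

For example, with the host IP used earlier, `check_endpoint "$(scrape_url 192.168.1.101:9100)"` would probe node-exporter, and `check_endpoint "$(scrape_url 192.168.1.101:8081 /prometheus)"` would probe the demo web service through its published host port.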
docker-compose.yml records the settings of all the containers and the dependencies between them:
version: '2'
services:
  node-exporter:
    image: prom/node-exporter:v0.17.0-rc.0
    container_name: node-exporter
    restart: unless-stopped
    ports:
      - '9100:9100'
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.ignored-mount-points=^/(sys|proc|dev|host|etc)($$|/)'
      - '--collector.textfile.directory=/node_exporter/prom'
    volumes:
      - /proc:/host/proc
      - /sys:/host/sys
      - /:/rootfs
      - ./etc/node_exporter/prom:/node_exporter/prom
  cadvisor:
    image: google/cadvisor:v0.28.0
    container_name: cadvisor
    depends_on:
      - node-exporter
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    ports:
      - "8080:8080"
    restart: unless-stopped
  prometheusdemo:
    image: bolingcavalry/prometheusdemo:0.0.1-SNAPSHOT
    container_name: prometheusdemo
    ports:
      - "8081:8080"
    restart: unless-stopped
  prometheus:
    image: prom/prometheus:v2.8.0-rc.0
    container_name: prometheus
    depends_on:
      - node-exporter
    links:
      - node-exporter:node-exporterhost
      - cadvisor:cadvisorhost
      - prometheusdemo:prometheusdemohost
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
    restart: unless-stopped
  grafana:
    image: grafana/grafana:5.4.2
    container_name: grafana
    links:
      - prometheus
    environment:
      - GF_SERVER_ROOT_URL=http://grafana.server.name
      - GF_SECURITY_ADMIN_PASSWORD=secret
      - GF_USERS_ALLOW_SIGN_UP=false
    depends_on:
      - prometheus
    ports:
      - "3000:3000"
    restart: unless-stopped
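A few points about docker-compose.yml are worth noting:
a. To collect data from the host, the node-exporter and cadvisor containers use volume mappings to mount host directories into the containers. This must be strictly controlled in production: do not casually expose important directories to unverified containers. For example, a malicious image could be pulled locally and then renamed to node-exporter or cadvisor with the docker tag command.
b. The prometheus container's configuration uses the links parameter, so the node-exporter container can be reached directly by a name such as node-exporterhost.
c. The prometheus container uses a volume mapping to mount the host's prometheus.yml into the container, so configuring the prometheus.yml in the current directory is enough for it to take effect in the prometheus container (if the file is modified after the container has started, restart the container with docker restart prometheus for the change to take effect).
d. prometheusdemo is a web service built with Spring Boot that exposes an HTTP interface; the ports parameter maps container port 8080 to host port 8081.
e. In the grafana container, the environment variable GF_SECURITY_ADMIN_PASSWORD=secret means that the password for logging in to the Grafana web UI with the admin account is secret.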
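Point c can be sketched as a small helper that validates the mounted configuration before restarting the container, so that a typo in prometheus.yml does not take Prometheus down. This is only a sketch: the function name `restart_prometheus_if_valid` is hypothetical, and it assumes the promtool binary is available inside the prom/prometheus image, which is not stated in the original files.

```shell
#!/bin/bash
# Validate the config file as the container sees it, then restart only if it is valid.
# $1 = container name (defaults to "prometheus", the container_name in docker-compose.yml)
restart_prometheus_if_valid() {
  local container=${1:-prometheus}
  if docker exec "$container" promtool check config /etc/prometheus/prometheus.yml; then
    docker restart "$container"
  else
    echo "prometheus.yml is invalid, not restarting" >&2
    return 1
  fi
}
```

After editing prometheus.yml on the host, calling `restart_prometheus_if_valid` would then replace a bare `docker restart prometheus`.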
Now that you have read docker-compose.yml, the containers that make up the environment should be clear. Next, let's see what the import_dashboard.sh script does.
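The script takes an API key as its second argument (the xxxxxx placeholder in the command above). As a minimal sketch, and assuming the Grafana 5.x HTTP API endpoint /api/auth/keys together with the admin password secret from docker-compose.yml, a key could also be created from the command line. The helper names `extract_api_key` and `create_api_key` and the key name import-key are hypothetical.

```shell
#!/bin/bash
# Extract the "key" field from the JSON that Grafana returns when a key is created.
# sed is used instead of jq so the sketch has no extra dependencies.
extract_api_key() {
  sed -n 's/.*"key"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Create an Admin API key using basic auth with the admin account.
# $1 = Grafana host, $2 = admin password, $3 = key name (hypothetical default)
create_api_key() {
  local host=$1
  local password=$2
  local name=${3:-import-key}
  curl -s -X POST "http://${host}:3000/api/auth/keys" \
    -u "admin:${password}" \
    -H 'Content-Type: application/json' \
    -d "{\"name\":\"${name}\",\"role\":\"Admin\"}" | extract_api_key
}
```

With these helpers, `API_KEY=$(create_api_key 192.168.1.101 secret)` followed by `./import_dashboard.sh 192.168.1.101 "$API_KEY"` would avoid copying the key out of the web UI by hand. Newer Grafana releases favor service account tokens, so treat this as specific to the version pinned here.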
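The content of import_dashboard.sh is shown below. It simply uses curl to send HTTP requests to the Grafana server; the key places are commented, so I won't go into more detail: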
#!/bin/bash
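# The first argument is the IP address of the Grafana server
GRAFANA_HOST=$1
# The second argument is the API key used for authentication
API_KEY=$2
echo "grafana host [${GRAFANA_HOST}]"
echo "api key [${API_KEY}]"
echo "start create datasource"
# Use curl to send a POST request that creates the data source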
curl -X POST \
http://${GRAFANA_HOST}:3000/api/datasources \
-H "Content-Type:application/json" \
-H "Authorization: Bearer ${API_KEY}" \
-d '{"name":"Prometheus","type":"prometheus","url":"http://prometheus:9090","access":"proxy","basicAuth":false}'
echo ""
echo "start create host dashboard"
# Use curl to send a POST request that creates a dashboard, namely the monitoring page seen earlier that shows the host's basic status such as CPU and disk
curl -X POST \
http://${GRAFANA_HOST}:3000/api/dashboards/db \
-H 'Accept: application/json' \
-H "Authorization: Bearer ${API_KEY}" \
-H 'Content-Type: application/json' \
-H 'Postman-Token: 2d3c3d60-4c5a-4936-836f-1572d447f473' \
-H 'cache-control: no-cache' \
-d '{
"dashboard": {
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
...
...
...