blackbox_exporter
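blackbox_exporter is an exporter maintained by the Prometheus project. It probes endpoints over HTTP, HTTPS, DNS, TCP, and ICMP and exposes availability and related metrics.
blackbox_exporter official documentation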
https://github.com/prometheus/blackbox_exporter
blackbox_exporter provides the following checks
1. HTTP GET probes
2. TCP port probes
3. ICMP host probes
4. HTTP POST probes
5. SSL certificate expiry
Deploying blackbox_exporter
1. Download blackbox_exporter
wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.22.0/blackbox_exporter-0.22.0.linux-amd64.tar.gz
tar -zvxf blackbox_exporter-0.22.0.linux-amd64.tar.gz -C /usr/local/
cd /usr/local/
mv blackbox_exporter-0.22.0.linux-amd64 blackbox_exporter
2. Check the blackbox_exporter version
cd /usr/local/blackbox_exporter
./blackbox_exporter --version
3. Manage blackbox_exporter with systemctl
vim /usr/lib/systemd/system/blackbox_exporter.service
[Unit]
Description=blackbox_exporter
After=network.target
[Service]
User=root
Type=simple
ExecStart=/usr/local/blackbox_exporter/blackbox_exporter --config.file=/usr/local/blackbox_exporter/blackbox.yml
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
[Install]
WantedBy=multi-user.target
4. Start blackbox_exporter and enable it at boot
systemctl start blackbox_exporter && systemctl enable blackbox_exporter
ps -ef |grep blackbox_exporter
5. Test access over HTTP (blackbox_exporter listens on port 9115 by default)
http://192.168.100.167:9115/
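Besides opening the landing page, the exporter can be exercised by hand through its /probe endpoint, which takes module and target query parameters. The following is a minimal sketch, assuming the default http_2xx module is loaded; the function name probe_succeeded is made up for this example. It reads probe output on standard input and succeeds only when the probe_success metric is 1.

```shell
# Hypothetical helper: exit status 0 only if the probe output on stdin
# reports "probe_success 1".
probe_succeeded() {
  grep -q '^probe_success 1$'
}

# Example use against the exporter from this guide (needs network access):
#   curl -s 'http://192.168.100.167:9115/probe?module=http_2xx&target=https://www.sohu.com/' \
#     | probe_succeeded && echo "probe OK"
```

A failed probe still returns HTTP 200 from the exporter, so checking the metric value is more reliable than checking the curl exit status.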
6. blackbox_exporter configuration file
The default configuration is sufficient unless you have special requirements.
vim /usr/local/blackbox_exporter/blackbox.yml
modules:
  http_2xx:
    prober: http
    http:
      method: GET
      preferred_ip_protocol: "ip4"
  http_post_2xx:
    prober: http
    http:
      method: POST
  tcp_connect:
    prober: tcp
  pop3s_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^+OK"
      tls: true
      tls_config:
        insecure_skip_verify: false
  grpc:
    prober: grpc
    grpc:
      tls: true
      preferred_ip_protocol: "ip4"
  grpc_plain:
    prober: grpc
    grpc:
      tls: false
      service: "service1"
  ssh_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^SSH-2.0-"
      - send: "SSH-2.0-blackbox-ssh-check"
  irc_banner:
    prober: tcp
    tcp:
      query_response:
      - send: "NICK prober"
      - send: "USER prober prober prober :prober"
      - expect: "PING :([^ ]+)"
        send: "PONG ${1}"
      - expect: "^:[^ ]+ 001"
  icmp:
    prober: icmp
  icmp_ttl5:
    prober: icmp
    timeout: 5s
    icmp:
      ttl: 5
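After editing blackbox.yml it is worth validating the file before the running exporter picks it up. The sketch below assumes the --config.check flag of blackbox_exporter (validate the configuration and exit) and the ExecReload line from the unit file above, which sends SIGHUP. The function name safe_reload is hypothetical.

```shell
# Hypothetical helper: validate the config, then run the given reload command
# only if validation passed.
#   $1 = path to the blackbox_exporter binary
#   $2 = path to blackbox.yml
#   remaining arguments = command used to trigger the reload
safe_reload() {
  bin="$1"
  cfg="$2"
  shift 2
  if "$bin" --config.file="$cfg" --config.check; then
    "$@"
  else
    echo "config check failed; not reloading" >&2
    return 1
  fi
}

# Example use with the paths from this guide:
#   safe_reload /usr/local/blackbox_exporter/blackbox_exporter \
#     /usr/local/blackbox_exporter/blackbox.yml \
#     systemctl reload blackbox_exporter
```

Passing the reload command as arguments keeps the helper usable with either systemctl reload or a plain kill -HUP.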
Prometheus configuration for blackbox_exporter
Add the blackbox_exporter jobs to prometheus.yml
vim /usr/local/prometheus/prometheus.yml
1. ICMP host liveness monitoring
  # ICMP ping monitoring
  - job_name: crawler_status
    metrics_path: /probe
    params:
      module: [icmp]
    static_configs:
      - targets: ['223.5.5.5','114.114.114.114']
        labels:
          instance: node_status
          group: 'icmp-node'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9115
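The three relabel rules above copy the original target into the target query parameter, reuse it as the instance label, and then point the scrape address at the exporter itself. As a rough illustration of the resulting request, the sketch below builds the equivalent URL by hand. The function name probe_url is hypothetical, and unlike Prometheus the sketch does not URL-encode the target.

```shell
# Hypothetical helper: print the URL Prometheus ends up scraping after the
# relabel rules are applied.
#   $1 = exporter address (the final __address__)
#   $2 = module name (params.module)
#   $3 = original target (copied into __param_target)
probe_url() {
  printf 'http://%s/probe?module=%s&target=%s\n' "$1" "$2" "$3"
}

probe_url 127.0.0.1:9115 icmp 223.5.5.5
# prints http://127.0.0.1:9115/probe?module=icmp&target=223.5.5.5
```

Fetching that URL with curl on the Prometheus host shows the same metrics that Prometheus stores for the target.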
2. TCP port monitoring
  # TCP port monitoring
  - job_name: tcp_port
    metrics_path: /probe
    params:
      module: [tcp_connect]
    file_sd_configs:
      - files: ['/usr/local/prometheus/conf.d/tcp_port/*.yml']
        refresh_interval: 10s
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9115
2.1. Targets file for TCP monitoring
vim /usr/local/prometheus/conf.d/tcp_port/tcp_port.yml
- targets: ['192.168.100.234:18080','192.168.100.235:22']
  labels:
    group: 'tcp port'
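Because file_sd_configs re-reads the target files, new endpoints can be added without touching prometheus.yml or reloading Prometheus. The sketch below generates a targets file in the same shape as the one above from a list of host:port pairs. The function name tcp_targets_yaml is hypothetical.

```shell
# Hypothetical helper: print a file_sd targets entry.
#   $1 = value for the "group" label
#   remaining arguments = host:port targets
tcp_targets_yaml() {
  group="$1"
  shift
  list=""
  for t in "$@"; do
    # Append each target in single quotes, comma-separated.
    list="${list:+$list,}'$t'"
  done
  printf '%s\n' "- targets: [$list]" "  labels:" "    group: '$group'"
}

# Example use with the path from this guide:
#   tcp_targets_yaml 'tcp port' 192.168.100.234:18080 192.168.100.235:22 \
#     > /usr/local/prometheus/conf.d/tcp_port/tcp_port.yml
```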
3. HTTP GET monitoring
  # HTTP GET monitoring
  - job_name: http_get
    metrics_path: /probe
    params:
      module: [http_2xx]
    file_sd_configs:
      - files: ['/usr/local/prometheus/conf.d/http_get/*.yml']
        refresh_interval: 10s
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.100.167:9115
3.1. Targets file for http_get monitoring
vim /usr/local/prometheus/conf.d/http_get/http_get.yml
- targets:
  - http://192.168.100.234:18080/
  labels:
    name: 'http_get'
- targets:
  - https://www.sohu.com/
  labels:
    name: 'http_get'
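For HTTPS targets such as https://www.sohu.com/, the http prober also exposes probe_ssl_earliest_cert_expiry, the Unix timestamp at which the earliest-expiring certificate in the chain expires. This covers the SSL certificate expiry check listed at the top. The sketch below turns that metric into a number of remaining days. The function name cert_days_left is hypothetical.

```shell
# Hypothetical helper: read /probe output on stdin and print the whole number
# of days left before the certificate expires.
#   $1 = current Unix time in seconds
cert_days_left() {
  awk -v now="$1" '
    $1 == "probe_ssl_earliest_cert_expiry" {
      # The value may be in exponent notation (e.g. 1.7e+09); awk parses it.
      printf "%d\n", ($2 - now) / 86400
    }'
}

# Example use against the exporter from this guide (needs network access):
#   curl -s 'http://192.168.100.167:9115/probe?module=http_2xx&target=https://www.sohu.com/' \
#     | cert_days_left "$(date +%s)"
```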
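4. The blackbox_exporter configuration in prometheus.yml
5. Hot-reload the configuration through the Prometheus HTTP API (this endpoint only works when Prometheus is started with --web.enable-lifecycle)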
curl -X POST http://127.0.0.1:9090/-/reload
6. Open the Prometheus Web UI and check Targets
Prometheus alerting rules
1. Create the rules directory
mkdir -p /usr/local/prometheus/rules/
chown prometheus:prometheus /usr/local/prometheus/rules/
2. Edit the rules file
vim /usr/local/prometheus/rules/rules.yml
groups:
- name: http_status_code
  rules:
  - alert: probe_http_status_code
    expr: probe_http_status_code != 200
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "{{ $labels.instance }} returned an abnormal status code"
      description: "{{ $labels.instance }} is not responding normally (value: {{ $value }})"
- name: icmp_ping_status
  rules:
  - alert: icmp_ping_status
    expr: probe_icmp_duration_seconds{phase="rtt"} == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Host {{ $labels.instance }} ICMP failure"
      description: "{{ $labels.instance }} ICMP failure (value: {{ $value }})"
      value: '{{ $value }}'
# High latency
- name: link_delay_high
  rules:
  - alert: link_delay_high
    expr: probe_icmp_duration_seconds{phase="rtt"} > 0.005
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "{{ $labels.instance }} high latency"
      description: "{{ $labels.instance }} high latency (value: {{ $value }})"
3. Check the syntax of the rules file
/usr/local/prometheus/promtool check rules rules.yml
Checking rules.yml
SUCCESS: 3 rules found
Configuring rule files in Prometheus
1. Add the rules directory to the Prometheus configuration
vim /usr/local/prometheus/prometheus.yml
rule_files: ['/usr/local/prometheus/rules/*.yml']
2. Hot-reload the configuration through the Prometheus HTTP API
curl -X POST http://127.0.0.1:9090/-/reload
3. Open the Prometheus Web UI and check Rules
Testing Prometheus alerts
1. Stop the web service on host 192.168.100.234
systemctl stop nginx
2. Open the Prometheus Web UI and check Alerts
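The same information is available from the Prometheus HTTP API at /api/v1/alerts, which is convenient on a host without a browser. Because the rules use for: 1m, an alert stays in the pending state for one minute before it fires. The sketch below counts firing alerts in the API response with plain grep, assuming the compact JSON that Prometheus returns. The function name count_firing is hypothetical, and a real JSON parser would be more robust.

```shell
# Hypothetical helper: read the /api/v1/alerts JSON on stdin and print the
# number of alerts whose state is "firing".
count_firing() {
  grep -o '"state":"firing"' | wc -l | tr -d ' '
}

# Example use on the Prometheus host (needs a running Prometheus):
#   curl -s http://127.0.0.1:9090/api/v1/alerts | count_firing
```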