前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >Prometheus安装部署+监控+绘图+告警

Prometheus安装部署+监控+绘图+告警

作者头像
DevOps云学堂
发布2019-10-18 17:29:38
1.1K0
发布2019-10-18 17:29:38
举报
文章被收录于专栏:DevOps持续集成

安装部分

在官网下载对应的压缩包文件,解压、添加系统服务器、启动。

Node_exporter

安装命令

代码语言:javascript
复制
tar zxf node_exporter-0.17.0.linux-amd64.tar.gz -C /usr/local
vim /etc/systemd/system/node_exporter.service

[Unit]
Description=node_exporter
After=network.target

[Service]
Restart=on-failure
ExecStart=/usr/local/node_exporter-0.17.0.linux-amd64/node_exporter

[Install]
WantedBy=multi-user.target


systemctl start node_exporter
systemctl status node_exporter
systemctl enable node_exporter

验证

AlertManager

安装命令

代码语言:javascript
复制
tar zxf alertmanager-0.17.0.linux-amd64.tar.gz  -C /usr/local
vim /etc/systemd/system/alertmanager.service

[Unit]
Description=Alertmanager
After=network-online.target

[Service]
Restart=on-failure
ExecStart=/usr/local/alertmanager-0.17.0.linux-amd64/alertmanager --config.file=/usr/local/alertmanager-0.17.0.linux-amd64/alertmanager.yml

[Install]
WantedBy=multi-user.target

systemctl start alertmanager
systemctl status alertmanager
systemctl enable alertmanager

netstat -anlpt | grep 9093

验证

Prometheus

Shell命令

代码语言:javascript
复制
tar zxf prometheus-2.9.2.linux-amd64.tar.gz -C /usr/local
vim /etc/systemd/system/prometheus.service
 
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target

[Service]
Restart=on-failure
ExecStart=/usr/local/prometheus-2.9.2.linux-amd64/prometheus --config.file=/usr/local/prometheus-2.9.2.linux-amd64/prometheus.yml --storage.tsdb.path=/var/lib/prometheus --web.external-url=http://0.0.0.0:9090

[Install]
WantedBy=multi-user.target

验证

Grafana

安装

代码语言:javascript
复制
下载:https://mirrors.tuna.tsinghua.edu.cn/grafana/yum/el7/grafana-5.4.2-1.x86_64.rpm


rpm -ivh grafana-5.4.2-1.x86_64.rpm

systemctl start grafana-server
systemctl status grafana-server
systemctl enable grafana-server


netstat -anlpt | grep 3000

验证

配置部分

AlertManager

配置文件

代码语言:javascript
复制
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_from: 'xxxxx@qq.com'
  smtp_auth_username: 'xxxx@qq.com'
  smtp_auth_password: 'xxxkbpfmygbecg'
  smtp_require_tls: false

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'toemail'
receivers:
- name: 'toemail'
  email_configs:
  - to: 'xxxxx@qq.com'
    send_resolved: true
- name: 'web.hook'
  webhook_configs:
  - url: 'http://127.0.0.1:5001/'
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']

Prometheus

代码语言:javascript
复制
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
       - localhost:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules/host_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']


  - job_name: 'my target'
    static_configs:
    - targets: ['localhost:9100']

验证

查看目标

查看告警配置

查看监控数据(https://grafana.com/dashboards/9276)

告警

模拟node_exporter宕机

systemctl stop node_exporter

查看邮箱收件箱

以上就完成了一个简单的监控告警配置!特别感谢网上的一些文档。参考文档:https://jianshu.com/p/e59cfd15612e

本文参与 腾讯云自媒体同步曝光计划,分享自微信公众号。
原始发表:2019-09-06,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 DevOps持续集成 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 安装部分
    • Node_exporter
      • AlertManager
        • Prometheus
          • Grafana
          • 配置部分
            • AlertManager
              • Prometheus
                • 验证
                • 告警
                  • 模拟node_exporter宕机
                    • 查看邮箱收件箱
                    相关产品与服务
                    Grafana 服务
                    Grafana 服务(TencentCloud Managed Service for Grafana,TCMG)是腾讯云基于社区广受欢迎的开源可视化项目 Grafana ,并与 Grafana Lab 合作开发的托管服务。TCMG 为您提供安全、免运维 Grafana 的能力,内建腾讯云多种数据源插件,如 Prometheus 监控服务、容器服务、日志服务 、Graphite 和 InfluxDB 等,最终实现数据的统一可视化。
                    领券
                    问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档