Reference: https://github.com/cmboss/supervisor-event-to-dingtalk-alert
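In production, some services are managed by supervisor, but by default supervisor sends no alert when a process fails. Fortunately, a third-party extension built on supervisor's event listener mechanism can provide this.
This example sends alerts to DingTalk; other webhook-based integrations work in much the same way.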
Add the following to /etc/supervisord.conf:
[eventlistener:all_alert_to_ding]
command=sh -c /usr/local/bin/supervisor_alert_to_ding.sh
events=PROCESS_STATE_EXITED,PROCESS_STATE_FATAL,PROCESS_STATE_BACKOFF
user=root
autostart=true
stderr_logfile=/tmp/supervisor_to_ding_error.log
stdout_logfile=/tmp/supervisor_to_ding_app.log
This subscribes the listener to three event types, PROCESS_STATE_EXITED, PROCESS_STATE_FATAL, and PROCESS_STATE_BACKOFF, so an alert is sent whenever one of them occurs.
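For context, supervisor delivers each subscribed event to the listener's stdin as a header line of space-separated key:value tokens, followed by a payload whose size is given by the len token. The sketch below parses a sample event and shows one possible refinement: alerting on PROCESS_STATE_EXITED only when the payload does not carry expected:1. The helper names get_field and should_alert, and the sample header and payload values, are assumptions made for illustration and are not part of the referenced project.

```shell
#!/bin/bash
# get_field KEY TEXT: print the value of the first "KEY:value" token in TEXT.
get_field() {
    local key="$1" text="$2" token
    local -a tokens
    read -r -a tokens <<< "$text"
    for token in "${tokens[@]}"; do
        if [[ "$token" == "$key":* ]]; then
            printf '%s\n' "${token#"$key":}"
            return 0
        fi
    done
    return 1
}

# should_alert HEADER PAYLOAD: succeed when the event deserves an alert.
# (Hypothetical policy: skip exits that supervisor considers expected.)
should_alert() {
    local event expected
    event=$(get_field eventname "$1") || return 1
    case "$event" in
        PROCESS_STATE_FATAL|PROCESS_STATE_BACKOFF)
            return 0 ;;
        PROCESS_STATE_EXITED)
            expected=$(get_field expected "$2") || expected=''
            [[ "$expected" != "1" ]] ;;
        *)
            return 1 ;;
    esac
}

# Sample event; the values are made up for this demonstration.
payload='processname:cat groupname:cat from_state:RUNNING expected:0 pid:2766'
header="ver:3.0 server:supervisor serial:21 pool:all_alert_to_ding poolserial:10 eventname:PROCESS_STATE_EXITED len:${#payload}"

if should_alert "$header" "$payload"; then
    echo "alert: $(get_field processname "$payload") ($(get_field eventname "$header"))"
fi
```

Whether to suppress expected exits is a design choice: a service that is supposed to run forever may warrant an alert on any exit, in which case the PROCESS_STATE_EXITED branch can simply return 0.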
# Create the log directory first
mkdir -pv /data/logs/supervisor
The contents of /usr/local/bin/supervisor_alert_to_ding.sh are as follows:
#!/bin/bash
source /etc/profile
LOG_DIR=/data/logs/supervisor
LOG_FILENAME=supervisor_all_alert_to_ding.log
HTTP_HEADER="Content-Type: application/json"
# Replace this with your own DingTalk robot URL. (This robot was created with the keyword "supervisor", so the message body below must contain that keyword.)
DING_HOOK_URL='https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxxxxxxxxxxx'
test -d "${LOG_DIR}" || mkdir -p "${LOG_DIR}"
# Tell supervisor the listener is ready to receive events
echo "READY"
while read -r line ; do
    # Debug copy of the raw header line
    echo "${line}" >> /tmp/bcd.log
    ALERT_TIME=$(date '+%Y-%m-%d %H:%M:%S')
    echo -e "\033[31m[ ${ALERT_TIME} -- HEAD: ${line} ]\033[0m" >> "${LOG_DIR}/${LOG_FILENAME}"
    # The header carries len:<N>, the size of the payload that follows
    body_length=$(echo "$line" | awk -F'len:' '{print $2}' | awk '{print $1}')
    read -r -n "${body_length}" body_line
    echo -e "\033[31m[ ${ALERT_TIME} -- BODY: ${body_line} ]\033[0m" >> "${LOG_DIR}/${LOG_FILENAME}"
    alert_info="${line} ${body_line}"
    # Acknowledge the event ("RESULT 2\nOK"), then send "READY\n" for the next one
    echo "RESULT 2"
    echo "OKREADY"
    alert_name=$(echo "$alert_info" | awk -F'processname:' '{print $2}' | awk '{print $1}')
    # Debug copy of header plus payload
    echo "${alert_info}" >> /tmp/abc.log
    alert_type=$(echo "$alert_info" | awk -F'eventname:' '{print $2}' | awk '{print $1}')
    # Only PROCESS_STATE_EXITED carries expected:; parsed here but not used in the message
    alert_expected=$(echo "$alert_info" | awk -F'expected:' '{print $2}' | awk '{print $1}')
    alert_host=$(hostname)
    alert_details=${body_line}
    alert_summary="${ALERT_TIME} service ${alert_name} is abnormal"
    # @-mention everyone in the group so the alert does not get buried
    curl -s "${DING_HOOK_URL}" -H "${HTTP_HEADER}" -d "{ \"msgtype\": \"markdown\", \"markdown\": { \"title\": \"supervisor alert\", \"text\": \"### supervisor service: ${alert_name}\n ### Alert type: ${alert_type}\n ### Host: ${alert_host}\n ### Summary:\n ${alert_summary}\" },\"at\": { \"isAtAll\": true } }" >> "${LOG_DIR}/${LOG_FILENAME}"
    [[ $? == 0 ]] && echo " ---> dingding Hook Send success..." >> "${LOG_DIR}/${LOG_FILENAME}"
done
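The curl call in the script assembles the JSON body by hand, so a process name containing a double quote or a backslash would produce a malformed request. A minimal sketch of a more defensive approach, using only bash built-ins: escape each value first, then format the body with printf. The function names json_escape and build_ding_payload are hypothetical helpers introduced here for illustration; the message fields mirror the ones used in the script.

```shell
#!/bin/bash
# json_escape STRING: print STRING with the characters that would break a
# JSON string literal escaped (backslash, double quote, newline, CR, tab).
json_escape() {
    local s="$1" out='' c i
    for (( i = 0; i < ${#s}; i++ )); do
        c=${s:i:1}
        case "$c" in
            '\')   out+='\\' ;;
            '"')   out+='\"' ;;
            $'\n') out+='\n' ;;
            $'\r') out+='\r' ;;
            $'\t') out+='\t' ;;
            *)     out+="$c" ;;
        esac
    done
    printf '%s' "$out"
}

# build_ding_payload TITLE TEXT: print a markdown message body that
# @-mentions everyone, matching the structure used in the script.
build_ding_payload() {
    local title text
    title=$(json_escape "$1")
    text=$(json_escape "$2")
    printf '{"msgtype":"markdown","markdown":{"title":"%s","text":"%s"},"at":{"isAtAll":true}}' \
        "$title" "$text"
}

# Example: real newlines in the text become \n in the JSON output.
text=$'### supervisor service: demo\n### Alert type: PROCESS_STATE_FATAL'
build_ding_payload 'supervisor alert' "$text"
echo
```

The result could then be passed to curl as -d "$(build_ding_payload "$title" "$text")". Note that the text must contain real newlines (for example via $'...\n...'), since a literal backslash followed by n would itself be escaped.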
# Make the script executable
chmod +x /usr/local/bin/supervisor_alert_to_ding.sh
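To check the READY / RESULT handshake without a running supervisor, a hand-written event can be piped into a stripped-down loop. This is a sketch under stated assumptions: listener_loop is a hypothetical stand-in for the script's main loop (it logs to stderr instead of calling DingTalk), the sample header is abbreviated, and the bash in use must support read -N.

```shell
#!/bin/bash
# listener_loop: speak the event listener handshake on stdin/stdout.
# Protocol replies go to stdout; diagnostics go to stderr so they do not
# get mixed into the protocol stream.
listener_loop() {
    local header len payload
    printf 'READY\n'
    while read -r header; do
        len=${header##*len:}   # text after the last "len:"
        len=${len%% *}         # up to the next space
        payload=''
        if [[ "$len" =~ ^[0-9]+$ ]] && (( 10#$len > 0 )); then
            # Read exactly $len characters, newlines included
            IFS= read -r -N "$len" payload
        fi
        printf 'event: %s | %s\n' "$header" "$payload" >&2
        # Acknowledge: "RESULT 2\nOK" with no trailing newline, then READY
        printf 'RESULT 2\nOK'
        printf 'READY\n'
    done
}

# Simulated event: header line, newline, then the payload (made-up values).
payload='processname:cat groupname:cat from_state:RUNNING expected:0 pid:2766'
printf 'ver:3.0 eventname:PROCESS_STATE_EXITED len:%d\n%s' "${#payload}" "$payload" \
    | listener_loop
```

Keeping diagnostics on stderr matters because supervisor interprets everything the listener writes to stdout as protocol tokens.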
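Original content statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, contact cloudcommunity@tencent.com for removal.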