[nagios][icinga2] The Pain of Monitoring Sangfor Devices

Author: 用户9314062
Published: 2022-05-20 14:27:49
Column: LINUX开源玩家

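Preface

The company bought a pile of Sangfor (深信服) devices, and recently I set out to bring them into our own monitoring. I assumed that enabling SNMP and checking a few OIDs would be enough, but Sangfor turned out to be full of pitfalls, so here is a summary. We currently have three kinds of Sangfor devices: AC (access control), VPN (virtual private network) and FW (firewall).

There are two big problems:

1. The SNMP OIDs for common metrics are not unified. All three are Sangfor products, yet even a standard OID such as uptime differs between them.

2. The output character encoding is not unified. The same Hex-STRING output is UTF-8 on some devices and GBK on others.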
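
The encoding problem in point 2 can be worked around by trying UTF-8 first and falling back to GBK. A minimal sketch, assuming the hex payload has already been cut out of the snmpwalk line; the name `decode_hex_string` is hypothetical, and the fallback is only a heuristic, since some GBK byte sequences also happen to be valid UTF-8:

```python
def decode_hex_string(hex_text):
    """Decode the payload of an SNMP Hex-STRING, e.g. 'E4 B8 AD E6 96 87'."""
    raw = bytes.fromhex(hex_text)  # spaces between byte pairs are skipped
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        # Not valid UTF-8, so assume the device sent GBK instead
        return raw.decode('gbk')


utf8_text = decode_hex_string('E4 B8 AD E6 96 87')  # UTF-8 bytes
gbk_text = decode_hex_string('D6 D0 CE C4')         # GBK bytes
print(utf8_text == gbk_text)  # the two payloads encode the same characters
```

The two sample payloads are the same two Chinese characters in each encoding, so the comparison prints True.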

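The small problems are more numerous:

Output is arbitrary and inconsistent. On the same VPN device, for example, one entry is CPU usage and returns a bare number (14), while the next is free memory and returns a string (110 MB), whereas AC and FW both report memory usage as a number;

Likewise, AC and FW report the connection count as a number (1324), while VPN reports it as a string (1174 sessions in all);

The output format is sloppy. In the VPN output below, why do the second and sixth values contain a line break?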

iso.3.6.1.2.1.1.1.0 = STRING: "Sangfor AF"
iso.3.6.1.2.1.1.1.0 = STRING: "Linux sslvpn 3.10.0 #3 SMP Tue Dec 17 14:24:33 CST 2019 x86_64 x86_64 x86_64 GNU/Linux
"
iso.3.6.1.2.1.1.2.0 = OID: iso.3.6.1.4.1.35047.2.10
iso.3.6.1.2.1.1.3.0 = Timeticks: (1913141400) 221 days, 10:16:54.00
iso.3.6.1.2.1.1.4.0 = STRING: "support@sangfor.com.cn"
iso.3.6.1.2.1.1.5.0 = STRING: "Linux
"
iso.3.6.1.2.1.1.6.0 = STRING: "China"
iso.3.6.1.2.1.1.7.0 = INTEGER: 72
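
Values such as `14`, `110 MB` and `1174 sessions in all`, as well as strings with a stray trailing newline, can be normalized by stripping quotes and whitespace and taking the leading number. A minimal sketch; `leading_number` is a hypothetical helper and not something the original script does:

```python
import re


def leading_number(value):
    """Return the first number in an SNMP value string, or None."""
    cleaned = value.replace('"', '').strip()  # drop quotes and stray newlines
    m = re.match(r'[-+]?\d+(?:\.\d+)?', cleaned)
    return float(m.group()) if m else None


for sample in ['14', '110 MB', '1174 sessions in all', '"Linux\n"']:
    print(repr(sample), '->', leading_number(sample))
```

Returning None for non-numeric strings leaves it to the caller to decide whether such a value should be graphed at all.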

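Process

I originally wanted to use check_snmp, which ships with the Nagios plugins, and feed the results into Grafana for pretty graphs. All sorts of errors drove me up the wall, so in the end I forced out a script that is painful even for me to look at, scraped a few values with it, and called it a day.

Script

The script is as follows: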

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# huky0924@aliyun.com
# Painful programming brought on by Sangfor devices

import sys
import getopt
import logging
from subprocess import PIPE, run
import codecs


def returnToIcinga(outStr, status, outPerf):
    # Map the status text to a plugin exit code:
    # 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
    out = outStr + ' |' + outPerf
    if status:
        if 'CRITICAL' in status:
            return (out, 2)
        elif 'WARNING' in status:
            return (out, 1)
        elif 'OK' in status:
            return (out, 0)
        else:
            return (out, 3)
    else:
        return (out, 0)


if __name__ == '__main__':

    # Force stdout to UTF-8, because the devices return several encodings
    sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())
    logFile = '/tmp/check_snmp_wrapper.log'
    # Configure logging before the first log call; otherwise logging.info()
    # installs a default stderr handler and this basicConfig() is ignored
    logging.basicConfig(level=logging.INFO,
                        format='%(asctime)s %(levelname)s %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S',
                        filename=logFile,
                        filemode='a')
    logging.info('start')

    hostname = community = oid = None
    argv = sys.argv[1:]
    opts, args = getopt.getopt(argv, "H:C:o:", ["hostname=", "community=", "oid="])  # short and long options
    for opt, arg in opts:
        if opt in ['-H', '--hostname']:
            hostname = arg
        elif opt in ['-C', '--community']:
            community = arg
        elif opt in ['-o', '--oid']:
            oid = arg

    # Without all three options the snmpwalk command cannot be built
    if not (hostname and community and oid):
        print('UNKNOWN: -H, -C and -o are all required')
        sys.exit(3)

    CMD = ['/usr/bin/snmpwalk', '-Os', '-v2c', '-t3', '-c', community, hostname, oid]

    outStr = ''
    outPerf = ''
    status = ''

    try:
        logging.info('checking host: ' + hostname)
        exe = run(CMD, timeout=3, stdout=PIPE, stderr=PIPE)
        if not exe.returncode == 0:
            logging.error(exe.stderr)
            # Report UNKNOWN with a message instead of exiting silently
            print('UNKNOWN: snmpwalk failed')
            sys.exit(3)

        res = exe.stdout.decode('utf-8')

        if len(res) > 20:
            output = res.replace('\n', ' ')
            logging.info('joined into one line: ' + output)

            if 'INTEGER' in output:
                # Numeric result: expose it as performance data
                result = output.split('INTEGER: ')[-1]
                status = 'OK:'
                outPerf = 'alarm=' + result.strip() + ';'
                outStr = 'value: ' + result

            else:
                if 'Hex-STRING' in output:
                    result = output.split('STRING: ')[-1]
                    # Hex-STRING payloads are UTF-8 on some devices, GBK on others
                    try:
                        logStr = bytes.fromhex(result).decode('utf8')
                    except UnicodeDecodeError:
                        logStr = bytes.fromhex(result).decode('gbk')
                else:
                    logStr = output.split('STRING: ')[-1].replace('"', '')

                # The split may yield a single element (no perfdata part)
                try:
                    outList = logStr.split('|')
                    if len(outList) > 1:
                        outStr = outList[0].strip()
                        outPerf = outList[1].strip()
                        outStrL = outStr.split(':')
                        status = outStrL.pop(0)
                        outStr = ' '.join(outStrL)
                    else:
                        outStr = outList[0].strip()
                except Exception as e:
                    logging.error(e)

        else:
            sys.exit(0)

    except Exception as e:
        # For example a timeout: report UNKNOWN rather than a silent OK
        logging.error(e)
        status = 'UNKNOWN'
        outStr = str(e)

    (rev, ren) = returnToIcinga(outStr, status, outPerf)

    print(rev)
    sys.exit(ren)

Save the script above as /usr/lib/nagios/plugins/check_snmp_wrapper.py and create a command for Icinga to call; from then on, use the command snmp_wrapy.
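
The wrapper leans on the Nagios plugin convention that Icinga also follows: text before `|` is the human-readable output, text after it is performance data, and the exit code 0, 1, 2 or 3 means OK, WARNING, CRITICAL or UNKNOWN. The splitting step can be sketched on its own as follows; `split_plugin_line` is a hypothetical name, and the sample string is made up for illustration:

```python
EXIT_CODES = {'OK': 0, 'WARNING': 1, 'CRITICAL': 2, 'UNKNOWN': 3}


def split_plugin_line(line):
    """Split 'STATUS: text | perfdata' into (exit_code, text, perfdata)."""
    text, _, perf = line.partition('|')
    status, sep, rest = text.partition(':')
    if not sep:
        # No 'STATUS:' prefix; assume OK (an assumption of this sketch)
        return 0, text.strip(), perf.strip()
    code = EXIT_CODES.get(status.strip(), 3)  # unrecognized status is UNKNOWN
    return code, rest.strip(), perf.strip()


print(split_plugin_line('WARNING: cpu usage high | cpu=91;'))
# (1, 'cpu usage high', 'cpu=91;')
```

A lookup table keeps the status-to-exit-code mapping in one place, which is easier to audit than a chain of substring tests.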

Configuration

# snmp wrapper (Python)
object CheckCommand "snmp_wrapy" {
  command = [ PluginDir + "/check_snmp_wrapper.py" ]
  arguments = {
    "-H" = "$address$"
    "-C" = "$snmp_community$"
    "-o" = "$snmpoid$"
  }
}
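
With this CheckCommand, Icinga runs the plugin with `-H`, `-C` and `-o`, each followed by the resolved macro value. The sketch below shows how such an argument list is parsed by `getopt`, the same module the script uses; the address, community and OID are sample values taken from elsewhere in this article, and `parse_args` is a hypothetical helper:

```python
import getopt


def parse_args(argv):
    """Parse -H/-C/-o (or their long forms) into a dict."""
    opts, _ = getopt.getopt(argv, 'H:C:o:', ['hostname=', 'community=', 'oid='])
    names = {'-H': 'hostname', '--hostname': 'hostname',
             '-C': 'community', '--community': 'community',
             '-o': 'oid', '--oid': 'oid'}
    return {names[opt]: arg for opt, arg in opts}


print(parse_args(['-H', '192.168.10.66', '-C', 'public',
                  '-o', '.1.3.6.1.4.1.35047.2.1.1.1']))
```

Running the plugin by hand with the same three options is a quick way to debug a check before wiring it into Icinga.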

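Defining the hosts

Note that to identify the Sangfor AC / VPN / FW devices, I defined a custom host variable vars.manufacturer and set it to "sangfor". The same approach can distinguish Huawei, H3C, Cisco and so on.

To tell the three device types apart further, host names start with AC, fw or vpn, so that the services created later can match on the prefix.

vars.client_endpoint appears in the definitions because I set up a satellite server to share the master's load; it is not required.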

object Host "AC-XXXGS" {
  import "generic-switch"
  display_name = "AC-XXX Company"
  address = "192.168.10.66"
  vars.type = "switch"
  vars.manufacturer = "sangfor"
  vars.client_endpoint = "yyyyy"
  vars.snmp_community = "public"
  vars.snmp_version = "2c"
  icon_image = "img/icons/sangfor.png"
}

object Host "fwXXX" {
  import "generic-switch"
  display_name = "XXX Firewall"
  address = "192.168.10.200"
  vars.type = "switch"
  vars.manufacturer = "sangfor"
  vars.client_endpoint = "yyyyy"
  vars.snmp_community = "public"
  vars.snmp_version = "2c"
  icon_image = "img/icons/sangfor.png"
}

object Host "vpnXXX" {
  import "generic-switch"
  display_name = "XXXvpn"
  address = "192.168.10.100"
  vars.type = "switch"
  vars.manufacturer = "sangfor"
  vars.client_endpoint = "yyyyy"
  vars.snmp_community = "public"
  vars.snmp_version = "2c"
  icon_image = "img/icons/sangfor.png"
}

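Defining the services

The key part is the assign rule. Based on the host definitions above, it ANDs three conditions (client_endpoint, manufacturer and the host name prefix), as follows: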

apply Service "memory" {
  display_name = "Memory usage - snmp"
  import "generic-service-sw"
  check_command = "snmp_wrapy"
  vars.check_command = "memory"
  vars.snmpoid = "iso.3.6.1.2.1.1.12"
  assign where (host.vars.client_endpoint == "yyyyy" && host.vars.manufacturer == "sangfor" && match("fw*", host.name))
}

apply Service "memory" {
  display_name = "Free memory - snmp"
  import "generic-service-sw"
  check_command = "snmp_wrapy"
  vars.grafana_graph_disable = 1
  vars.snmpoid = "iso.3.6.1.4.1.35047.1.4.0"
  assign where (host.vars.client_endpoint == "yyyyy" && host.vars.manufacturer == "sangfor" && match("vpn*", host.name))
}

apply Service "users" {
  display_name = "User count - snmp"
  import "generic-service-sw"
  check_command = "snmp_wrapy"
  vars.check_command = "users"
  vars.snmpoid = ".1.3.6.1.4.1.35047.2.1.1.1"
  assign where (host.vars.client_endpoint == "yyyyy" && host.vars.manufacturer == "sangfor" && match("AC*", host.name))
}
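
How the three-way `assign where` condition selects hosts can be approximated in Python. This only illustrates the logic and is not how Icinga evaluates rules internally; `fnmatchcase` stands in for Icinga's wildcard `match()`, the `assigned` helper is hypothetical, and the host dictionaries are simplified copies of the host definitions earlier in this article:

```python
from fnmatch import fnmatchcase

hosts = [
    {'name': 'AC-XXXGS', 'client_endpoint': 'yyyyy', 'manufacturer': 'sangfor'},
    {'name': 'fwXXX', 'client_endpoint': 'yyyyy', 'manufacturer': 'sangfor'},
    {'name': 'vpnXXX', 'client_endpoint': 'yyyyy', 'manufacturer': 'sangfor'},
]


def assigned(host, pattern, endpoint='yyyyy'):
    """True when all three conditions of the assign rule hold."""
    return (host['client_endpoint'] == endpoint
            and host['manufacturer'] == 'sangfor'
            and fnmatchcase(host['name'], pattern))


for pattern in ('fw*', 'vpn*', 'AC*'):
    print(pattern, [h['name'] for h in hosts if assigned(h, pattern)])
```

Each pattern selects exactly one of the three sample hosts, which is why two apply rules can share the service name "memory" as long as they never match the same host.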

Reloading icinga2

$ sudo /etc/init.d/icinga2 reload
[ ok ] Reloading icinga2 configuration (via systemctl): icinga2.service.

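In the end, at least a small number of the metrics can be graphed.

Closing

By the way, Huawei and H3C devices can be checked directly with centreon-plugins, and foreign brands such as Cisco usually can too; just check whether your model is supported.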

Originally published on 2022-01-25 on the WeChat official account LINUX开源玩家.
