前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >【Z投稿】大规模数据库监控的Zabbix玩法详谈

【Z投稿】大规模数据库监控的Zabbix玩法详谈

作者头像
Zabbix
发布2021-02-03 14:47:12
6280
发布2021-02-03 14:47:12
举报
王伟

Oracle ACE for MySQL

10多年MySQL大规模数据库运维经验+8年Zabbix使用经验。本次峰会演讲和workshop《大规模数据库监控的Zabbix玩法》,讲述海量数据库实例的监控,介绍zabbix的安装、部署、优化,以及数据库自动化运维。

作者的话

从zabbix的1.6版本开始测试,1.8的版本开始线上使用,线上使用过1.9、2.0、2.2、3.0、4.0的版本,使用或是测试过zabbix1.6之后的所有版本。个人也有之前的SA转变为DBA,就zabbix的运维走数据库层面有一些自己的心得,希望对读者有所帮助。

点击获取原文

1

MySQL版本推荐:MySQL5.7及以上版本,便捷的在线DDL方便zabbix的快速升级

链接方式:zabbix的server、proxy、MySQL数据库尽量使用域名方式连接,方便进行故障切换。

2

zabbix数据库的授权:

读写权限,用作zabbix自身访问:

grant all privileges on zabbix.* to 'zabbix'@'1.1.1.1' identified by 'zabbix';

只读权限,用作二次开发只读zabbix数据库:

grant SELECT on zabbix.* to 'zabbix_ro'@'1.1.1.1' identified by 'zabbixro';

3

MySQL配置文件需要调整的几个重要参数:

innodb_log_files_in_group = 16

innodb_log_file_size = 1G

innodb_file_per_table = 1

max_allowed_packet = 64M

back_log = 1024

max-connections = 2000

sync_binlog = 0

innodb_flush_log_at_trx_commit = 0

binlog_format = row

expire_logs_days = 3

4

使用SQL语句更新监控项或是触发器,提升变更效率,减少对数据库影响:

/*

update zabbix.items set delay=5 where key_='icmpping';

update zabbix.items t,zabbix.functions f set f.parameter='30' where f.itemid=t.itemid and t.key_='agent.ping'

*/

5

在做二次开发时,为避免对线上监控server端的影响,尽量使用独立的数据库从库读取数据(可能有大量通过API读取历史数据库等只读操作),需要一个只读的api接口,此时需要做zabbix数据库读写分离,可以在从库上过滤掉几张表,打开从库的写权限,这几张表主要用作用户登录的审计、session的记录,否则可能登录失败:

replicate-wild-ignore-table =zabbix.auditlog

replicate-wild-ignore-table =zabbix.sessions

replicate-wild-ignore-table =zabbix.ids

同时zabbix web_api做负载均衡的时候,请使用IP hash的方式,因为涉及到session的问题。

6

做zabbix的升级,其实是做zabbix数据库库表的变更,可能包含表结构的表更、索引的变更、数据的重组等操作,此时需要做一套新的数据库集群作为线上集群的存库存在,等数据同步完成,断掉同步,使用新的zabbix server程序连接新的数据库集群完成升级,升级完成后切换线上zabbix server、数据库域名到新的server和新的数据库上,这样能最低限度影响正常的监控。

7

数据库备份和数据情能力

备份主要信息即可,监控数据和审计日志、events不需要备份,数据量太大。

历史数据的删除:包含监控历史详情数据和趋势数据,使用分区的方式进行删除;另外events表也要定时的清理,使用主键进行清理,保留5天左右的即可,并需要定期的进行碎片整理,否则监控大屏显示会逐步变慢。

8

zabbix server历史数据和趋势数据分区

https://www.zabbix.org/wiki/Docs/howto/mysql_partition

/* 为历史表分区做准备。zabbix2.0/3.0 only,3.2以上的版本不需要执行 */

ALTER TABLE history_text DROP PRIMARY KEY, ADD INDEX (id), DROP INDEX history_text_2, ADD INDEX history_text_2 (itemid, id);

ALTER TABLE history_log DROP PRIMARY KEY, ADD INDEX (id), DROP INDEX history_log_2, ADD INDEX history_log_2 (itemid, id);

/* 在server的配置文件中设置HousekeepingFrequency=0,禁用zabbix自带的清理历史数据任务 */

/*设置历史监控数据表中的时间字段为索引,方便后续按照时间取值二次开发*/

ALTER TABLE history ADD INDEX clock (clock);

ALTER TABLE history_log ADD INDEX clock (clock);

ALTER TABLE history_str ADD INDEX clock (clock);

ALTER TABLE history_str_sync ADD INDEX clock (clock);

ALTER TABLE history_sync ADD INDEX clock (clock);

ALTER TABLE history_text ADD INDEX clock (clock);

ALTER TABLE history_uint ADD INDEX clock (clock);

ALTER TABLE history_uint_sync ADD INDEX clock (clock);

ALTER TABLE trends ADD INDEX clock (clock);

ALTER TABLE trends_uint ADD INDEX clock (clock);

ALTER TABLE EVENTS ADD INDEX objectid (objectid,VALUE,clock);

/* 如果存在这五个存储过程则删除 */

DROP PROCEDURE IF EXISTS partition_create;

DROP PROCEDURE IF EXISTS partition_drop;

DROP PROCEDURE IF EXISTS partition_maintenance;

DROP PROCEDURE IF EXISTS partition_maintenance_all;

DROP PROCEDURE IF EXISTS partition_verify;

/*第一次执行存储过程*/

CALL partition_maintenance_all('zabbix');

/*

inser to crontab 每天执行一次存储过程,生成新的分区和删除老的分区

#!/bin/sh

/xxxxx/mysql/bin/mysql -uzabbix -pzabbixmonitor --socket=/xxxx/mysql/tmp/mysql.sock --database=zabbix -e "CALL partition_maintenance_all('zabbix');"

*/

存储过程详情如下:

/* partition_create */

DELIMITER $$

CREATE PROCEDURE `partition_create`(SCHEMANAME VARCHAR(64), TABLENAME VARCHAR(64), PARTITIONNAME VARCHAR(64), CLOCK INT)

BEGIN

/*

SCHEMANAME = The DB schema in which to make changes

TABLENAME = The table with partitions to potentially delete

PARTITIONNAME = The name of the partition to create

*/

/*

Verify that the partition does not already exist

*/

DECLARE RETROWS INT;

SELECT COUNT(1) INTO RETROWS

FROM information_schema.partitions

WHERE table_schema = SCHEMANAME AND TABLE_NAME = TABLENAME AND partition_description >= CLOCK;

IF RETROWS = 0 THEN

/*

1. Print a message indicating that a partition was created.

2. Create the SQL to create the partition.

3. Execute the SQL from #2.

*/

SELECT CONCAT( "partition_create(", SCHEMANAME, ",", TABLENAME, ",", PARTITIONNAME, ",", CLOCK, ")" ) AS msg;

SET @SQL = CONCAT( 'ALTER TABLE ', SCHEMANAME, '.', TABLENAME, ' ADD PARTITION (PARTITION ', PARTITIONNAME, ' VALUES LESS THAN (', CLOCK, '));' );

PREPARE STMT FROM @SQL;

EXECUTE STMT;

DEALLOCATE PREPARE STMT;

END IF;

END$$

DELIMITER ;

/* partition_drop */

DELIMITER $$

CREATE PROCEDURE `partition_drop`(SCHEMANAME VARCHAR(64), TABLENAME VARCHAR(64), DELETE_BELOW_PARTITION_DATE BIGINT)

BEGIN

/*

SCHEMANAME = The DB schema in which to make changes

TABLENAME = The table with partitions to potentially delete

DELETE_BELOW_PARTITION_DATE = Delete any partitions with names that are dates older than this one (yyyy-mm-dd)

*/

DECLARE done INT DEFAULT FALSE;

DECLARE drop_part_name VARCHAR(16);

/*

Get a list of all the partitions that are older than the date

in DELETE_BELOW_PARTITION_DATE. All partitions are prefixed with

a "p", so use SUBSTRING TO get rid of that character.

*/

DECLARE myCursor CURSOR FOR

SELECT partition_name

FROM information_schema.partitions

WHERE table_schema = SCHEMANAME AND TABLE_NAME = TABLENAME AND CAST(SUBSTRING(partition_name FROM 2) AS UNSIGNED) < DELETE_BELOW_PARTITION_DATE;

DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

/*

Create the basics for when we need to drop the partition. Also, create

@drop_partitions to hold a comma-delimited list of all partitions that

should be deleted.

*/

SET @alter_header = CONCAT("ALTER TABLE ", SCHEMANAME, ".", TABLENAME, " DROP PARTITION ");

SET @drop_partitions = "";

/*

Start looping through all the partitions that are too old.

*/

OPEN myCursor;

read_loop: LOOP

FETCH myCursor INTO drop_part_name;

IF done THEN

LEAVE read_loop;

END IF;

SET @drop_partitions = IF(@drop_partitions = "", drop_part_name, CONCAT(@drop_partitions, ",", drop_part_name));

END LOOP;

IF @drop_partitions != "" THEN

/*

1. Build the SQL to drop all the necessary partitions.

2. Run the SQL to drop the partitions.

3. Print out the table partitions that were deleted.

*/

SET @full_sql = CONCAT(@alter_header, @drop_partitions, ";");

PREPARE STMT FROM @full_sql;

EXECUTE STMT;

DEALLOCATE PREPARE STMT;

SELECT CONCAT(SCHEMANAME, ".", TABLENAME) AS `table`, @drop_partitions AS `partitions_deleted`;

ELSE

/*

No partitions are being deleted, so print out "N/A" (Not applicable) to indicate

that no changes were made.

*/

SELECT CONCAT(SCHEMANAME, ".", TABLENAME) AS `table`, "N/A" AS `partitions_deleted`;

END IF;

END$$

DELIMITER ;

/* partition_maintenance */

DELIMITER $$

CREATE PROCEDURE `partition_maintenance`(SCHEMA_NAME VARCHAR(32), TABLE_NAME VARCHAR(32), KEEP_DATA_DAYS INT, HOURLY_INTERVAL INT, CREATE_NEXT_INTERVALS INT)

BEGIN

DECLARE OLDER_THAN_PARTITION_DATE VARCHAR(16);

DECLARE PARTITION_NAME VARCHAR(16);

DECLARE LESS_THAN_TIMESTAMP INT;

DECLARE CUR_TIME INT;

CALL partition_verify(SCHEMA_NAME, TABLE_NAME, HOURLY_INTERVAL);

SET CUR_TIME = UNIX_TIMESTAMP(DATE_FORMAT(NOW(), '%Y-%m-%d 00:00:00'));

SET @__interval = 1;

create_loop: LOOP

IF @__interval > CREATE_NEXT_INTERVALS THEN

LEAVE create_loop;

END IF;

SET LESS_THAN_TIMESTAMP = CUR_TIME + (HOURLY_INTERVAL * @__interval * 3600);

SET PARTITION_NAME = FROM_UNIXTIME(CUR_TIME + HOURLY_INTERVAL * (@__interval - 1) * 3600, 'p%Y%m%d%H00');

CALL partition_create(SCHEMA_NAME, TABLE_NAME, PARTITION_NAME, LESS_THAN_TIMESTAMP);

SET @__interval=@__interval+1;

END LOOP;

SET OLDER_THAN_PARTITION_DATE=DATE_FORMAT(DATE_SUB(NOW(), INTERVAL KEEP_DATA_DAYS DAY), '%Y%m%d0000');

CALL partition_drop(SCHEMA_NAME, TABLE_NAME, OLDER_THAN_PARTITION_DATE);

END$$

DELIMITER ;

/* partition_verify */

DELIMITER $$

CREATE PROCEDURE `partition_verify`(SCHEMANAME VARCHAR(64), TABLENAME VARCHAR(64), HOURLYINTERVAL INT(11))

BEGIN

DECLARE PARTITION_NAME VARCHAR(16);

DECLARE RETROWS INT(11);

DECLARE FUTURE_TIMESTAMP TIMESTAMP;

/*

* Check if any partitions exist for the given SCHEMANAME.TABLENAME.

*/

SELECT COUNT(1) INTO RETROWS

FROM information_schema.partitions

WHERE table_schema = SCHEMANAME AND TABLE_NAME = TABLENAME AND partition_name IS NULL;

/*

* If partitions do not exist, go ahead and partition the table

*/

IF RETROWS = 1 THEN

/*

* Take the current date at 00:00:00 and add HOURLYINTERVAL to it. This is the timestamp below which we will store values.

* We begin partitioning based on the beginning of a day. This is because we don't want to generate a random partition

* that won't necessarily fall in line with the desired partition naming (ie: if the hour interval is 24 hours, we could

* end up creating a partition now named "p201403270600" when all other partitions will be like "p201403280000").

*/

SET FUTURE_TIMESTAMP = TIMESTAMPADD(HOUR, HOURLYINTERVAL, CONCAT(CURDATE(), " ", '00:00:00'));

SET PARTITION_NAME = DATE_FORMAT(CURDATE(), 'p%Y%m%d%H00');

-- Create the partitioning query

SET @__PARTITION_SQL = CONCAT("ALTER TABLE ", SCHEMANAME, ".", TABLENAME, " PARTITION BY RANGE(`clock`)");

SET @__PARTITION_SQL = CONCAT(@__PARTITION_SQL, "(PARTITION ", PARTITION_NAME, " VALUES LESS THAN (", UNIX_TIMESTAMP(FUTURE_TIMESTAMP), "));");

-- Run the partitioning query

PREPARE STMT FROM @__PARTITION_SQL;

EXECUTE STMT;

DEALLOCATE PREPARE STMT;

END IF;

END$$

DELIMITER ;

/* partition_maintenance_all */

DELIMITER $$

CREATE PROCEDURE `partition_maintenance_all`(SCHEMA_NAME VARCHAR(32))

BEGIN

CALL partition_maintenance(SCHEMA_NAME, 'history', 28, 24, 14);

CALL partition_maintenance(SCHEMA_NAME, 'history_log', 28, 24, 14);

CALL partition_maintenance(SCHEMA_NAME, 'history_str', 28, 24, 14);

CALL partition_maintenance(SCHEMA_NAME, 'history_text', 28, 24, 14);

CALL partition_maintenance(SCHEMA_NAME, 'history_uint', 28, 24, 14);

CALL partition_maintenance(SCHEMA_NAME, 'trends', 730, 24, 14);

CALL partition_maintenance(SCHEMA_NAME, 'trends_uint', 730, 24, 14);

END$$

DELIMITER ;

本文参与 腾讯云自媒体分享计划,分享自微信公众号。
原始发表:2019-12-24,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 Zabbix开源社区 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体分享计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
相关产品与服务
云数据库 SQL Server
腾讯云数据库 SQL Server (TencentDB for SQL Server)是业界最常用的商用数据库之一,对基于 Windows 架构的应用程序具有完美的支持。TencentDB for SQL Server 拥有微软正版授权,可持续为用户提供最新的功能,避免未授权使用软件的风险。具有即开即用、稳定可靠、安全运行、弹性扩缩等特点。
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档