前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >Mysql MHA 架构介绍及实践

Mysql MHA 架构介绍及实践

作者头像
mikelLam
发布2022-10-31 14:48:13
1.1K0
发布2022-10-31 14:48:13
举报
文章被收录于专栏:Kubernetes 与 Devops 干货分享

MHA 简介

MHA(Master High Availability)目前在MySQL高可用方面是一个相对成熟的解决方案,由日本DeNA公司youshimaton(现就职于Facebook公司)开发,是一套优秀的作为MySQL高可用性环境下故障切换和主从提升的高可用软件。

在MySQL故障切换过程中,MHA能做到在0~30秒之内自动完成数据库的故障切换操作,并且在进行故障切换的过程中,MHA能在最大程度上保证数据的一致 性,以达到真正意义上的高可用。

组件介绍

MHA是构建MySQL高可用架构的一种选择方案

该软件由两部分组成:MHA Manager(管理节点)和MHA Node(数据节点)。

MHA Manager

MHA Manager可以单独部署在一台独立的机器上管理多个master-slave集群,也可以部署在一台slave节点上。MHA Node运行在每台MySQL服务器上,MHA Manager会定时探测集群中的master节点,当master出现故障时,它可以自动将最新数据的slave提升为新的master,然后将所有其他的slave重新指向新的master。

MHA Node

MHA Node 部署在所有运行MySQL的服务器上,无论是master还是slave。主要作用有三个。

  • 保存二进制日志 如果能够访问故障master,会拷贝master的二进制日志
  • 应用差异中继日志 从拥有最新数据的slave上生成差异中继日志,然后应用差异日志。
  • 清除中继日志 在不停止SQL线程的情况下删除中继日志

为了尽可能的减少主库硬件损坏宕机造成的数据丢失,因此在配置MHA的同时建议配置成MySQL半同步复制

故障转移过程

在MHA自动故障切换过程中,MHA试图从宕机的主服务器上最大限度的保存二进制日志,最大程度的保证数据的不丢失,但这并不总是可行的。例如,主服务器硬件故障或无法通过ssh访问,MHA没法保存二进制日志,只进行故障转移而丢失了最新的数据。

MHA工作原理

当master出现故障时,通过对比slave之间I/O线程读取masterbinlog的位置,选取最接近的slave做为latestslave。 其它slave通过与latest slave对比生成差异中继日志。在latest slave上应用从master保存的binlog,同时将latest slave提升为master。最后在其它slave上应用相应的差异中继日志并开始从新的master开始复制。

(1)从宕机崩溃的master保存二进制日志事件(binlog events); (2)识别最新更新的slave; (3)应用差异的中继日志(relay log) 到其他slave; (4)应用从master保存的二进制日志事件(binlog events); (5)提升一个slave为新master; (6)使用其他的slave连接新的master进行复制。

使用MySQL 5.5开始找支持的半同步复制,可以大大降低数据丢失的风险。MHA可以与半同步复制结合起来。如果只有一个slave已经收到了最新的二进制日志,MHA可以将最新的二进制日志应用于其他所有的slave服务器上,因此可以保证所有节点的数据一致性。

MHA 工具介绍

MHA软件由两部分组成,Manager工具包和Node工具包,具体的说明如下:

Manager工具包主要包括以下几个工具

  • masterha_check_ssh 检查MHA的SSH配置状况
  • masterha_check_repl 检查MySQL复制状况
  • masterha_manger 启动MHA
  • masterha_check_status 检测当前MHA运行状态
  • masterha_master_monitor 检测master是否宕机
  • masterha_master_switch 控制故障转移(自动或者手动)
  • masterha_conf_host 添加或删除配置的server信息

Node工具包包括以下几个该工具

  • save_binary_logs 保存和复制master的二进制日志
  • apply_diff_relay_logs 识别差异的中继日志事件并将其差异的事件应用于其他的slave
  • filter_mysqlbinlog 去除不必要的ROLLBACK事件(MHA已不再使用这个工具)
  • purge_relay_logs 清除中继日志(不会阻塞SQL线程)

另有如下几个脚本需自定义:

  1. master_ip_failover:管理VIP
  2. master_ip_online_change:
  3. masterha_secondary_check:当MHA manager检测到master不可用时,通过masterha_secondary_check脚本来进一步确认,减低误切的风险。
  4. send_report:当发生故障切换时,可通过send_report脚本发送告警信息。

MHA架构

目前MHA主要支持一主多从的架构,要搭建MHA,要求一个复制集群中必须最少有三台数据库服务器 ,一主二从,即一台master,一台充当备用master,另外一台充当从库,因为至少需要三台服务器。

https://tc.ctq6.cn/tc/20220420164046.png
https://tc.ctq6.cn/tc/20220420164046.png

MHA 实践

角色规划

角色

ip地址

主机名

server_id

类型

Master

172.28.81.2

mha01

2

写入

Candicate master

172.28.81.3

mha02

3

Slave/Manager

172.28.81.4

mha03

4

基础环境配置

关闭防火请和selinux
代码语言:shell
复制
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
systemctl disable firewalld
systemctl stop firewalld
配置源和epel源
代码语言:shell
复制
# 配置阿里云镜像源
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
# 配置epel源
cat<<'EOF'>/etc/yum.repos.d/epel.repo 
[epel]
name=Extra Packages for Enterprise Linux 7 - $basearch
baseurl=http://mirrors.aliyun.com/epel/7/$basearch
failovermethod=priority
enabled=1
gpgcheck=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7

[epel-debuginfo]
name=Extra Packages for Enterprise Linux 7 - $basearch - Debug
baseurl=http://mirrors.aliyun.com/epel/7/$basearch/debug
failovermethod=priority
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
gpgcheck=0

[epel-source]
name=Extra Packages for Enterprise Linux 7 - $basearch - Source
baseurl=http://mirrors.aliyun.com/epel/7/SRPMS
failovermethod=priority
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
gpgcheck=0
EOF
# 安装epel源
yum install -y epel-release
配置时间同步
代码语言:shell
复制
yum install ntpdate -y
ntpdate ntp4.aliyun.com
echo "*/5 * * * * /usr/sbin/ntpdate ntp4.aliyun.com 1>/dev/null 2>&1">>/var/spool/cron/root

Mysql 实例安装(master,slave,slave)

三台机器全部安装Mysql实例,版本为5.7最新版本,一主两从的架构

下载
代码语言:shell
复制
# 安装
wget https://cdn.mysql.com//Downloads/MySQL-5.7/mysql-5.7.37-linux-glibc2.12-x86_64.tar.gz -P /usr/local/src
tar zxf /usr/local/src/mysql-5.7.37-linux-glibc2.12-x86_64.tar.gz -C /usr/local/src
mv /usr/local/src/mysql-5.7.37-linux-glibc2.12-x86_64 /usr/local/mysql
# 配置环境变量
echo "export PATH=$PATH:/usr/local/mysql/bin" >> /etc/profile
source /etc/profile
# 安装依赖
yum install -y libaio
创建mysql系统用户
代码语言:shell
复制
groupadd mysql
useradd -r -g mysql -s /bin/false mysql
创建数据目录
代码语言:text
复制
mkdir -p /data/mysql3306/{mysql,binlog,slowlog,tmp,log,run,conf}
chown -R mysql:mysql /data/mysql3306
chown -R mysql:mysql /usr/local/mysql
创建配置文件

自定义配置文件,mysql配置文件如下所示(/data/mysql3306/conf/my.cnf),需要注意的是两个库一定要限制为只读,因为在后面MHA随时会将从库提升为主库:

代码语言:text
复制
[client]
port            = 3306
socket          =/data/mysql3306/run/mysql.sock

[mysql]
prompt="\u@node01\R:\m:\s [\d]> "   #  XXXX主机名
no-auto-rehash
socket=/data/mysql3306/run/mysql.sock

[mysqld]
####: for global
user                                =mysql                          #	mysql
basedir                             =/usr/local/mysql/              #	/usr/local/mysql/
datadir                             =/data/mysql3306/mysql    #	/usr/local/mysql/data
server_id                           =1                       #	0
socket                              =/data/mysql3306/run/mysql.sock
pid-file                            =/data/mysql3306/run/mysq.pid
port                                =3306                          #	3375
character_set_server                =utf8mb4                           #	latin1
explicit_defaults_for_timestamp     =off                            #    off
log_timestamps                      =system                         #	utc
#read_only                           =0                              #	off
skip_name_resolve                   =1                              #   0
max_allowed_packet = 32M
lower_case_table_names              =1                              #	0
secure_file_priv                    =                               #	null
open_files_limit                    =65536                          #   1024
thread_stack = 512K
external-locking = FALSE
max_allowed_packet = 32M
sort_buffer_size = 4M
join_buffer_size = 4M
thread_cache_size = 768
max_connections                     =2000                           #   151
thread_cache_size                   =64                             #   9
table_open_cache                    =81920                          #   2000
table_definition_cache              =4096                           #   1400
table_open_cache_instances          =64                             #   16
max_prepared_stmt_count             =1048576                        #

####: for binlog
binlog_format                       =row                          #	row
log-bin                             =/data/mysql3306/binlog/mysql-bin
log-bin-index                       =/data/mysql3306/binlog/mysql-bin.index                      #	off
binlog_rows_query_log_events        =on                             #	off
log_slave_updates                   =on                             #	off
expire_logs_days                    =7                              #	0
binlog_cache_size                   =65536                          #	65536(64k)
#binlog_checksum                    =none                           #	CRC32
sync_binlog                         =0                              #	1
slave-preserve-commit-order         =ON                             #

####: for relaylog
relay_log = /data/mysql3306/binlog/mysql-relay-bin


####: for error-log
log_error                           =/data/mysql3306/log/error.log                        #	/usr/local/mysql/data/localhost.localdomain.err

general_log                         =off                            #   off
general_log_file                    =/data/mysql3306/log/general.log                    #   hostname.log

####: for slow query log
slow_query_log                      =on                             #    off
slow_query_log_file                 =/data/mysql3306/slowlog/slow.log                       #    hostname.log
log_queries_not_using_indexes       =on                             #    off
long_query_time                     =5                       #    10.000000
log_throttle_queries_not_using_indexes = 60
min_examined_row_limit = 100
log_slow_admin_statements = 1
log_slow_slave_statements = 1
####: for gtid
gtid_mode                           =on                            #	off
enforce_gtid_consistency            =on                            #	off


####: for replication
skip_slave_start                    =1                              #
master_info_repository              =table                         #	file
relay_log_info_repository           =table                         #	file
#relay_log_recovery                 =1
slave_parallel_type                 =logical_clock                 #    database | LOGICAL_CLOCK
slave_parallel_workers              =4                             #    0
rpl_semi_sync_master_enabled       =1                             #    0
rpl_semi_sync_slave_enabled        =1                             #    0
rpl_semi_sync_master_timeout       =1000                          #    1000(1 second)
plugin_load_add                    =semisync_master.so            #
plugin_load_add                    =semisync_slave.so             #
#binlog_group_commit_sync_delay     =100                           #    500(0.05%秒)、默认值0
#binlog_group_commit_sync_no_delay_count = 10                       #    0

replicate_wild_ignore_table=mysql.%
replicate_wild_ignore_table=test.%
replicate_wild_ignore_table=information_schema.%

explicit_defaults_for_timestamp = 1
innodb_thread_concurrency = 0
innodb_sync_spin_loops = 100
innodb_spin_wait_delay = 30

transaction_isolation = REPEATABLE-READ
#innodb_additional_mem_pool_size = 16M
innodb_buffer_pool_size = 2000M
innodb_buffer_pool_instances = 4
innodb_buffer_pool_load_at_startup = 1
innodb_buffer_pool_dump_at_shutdown = 1
innodb_data_file_path = ibdata1:1G:autoextend
innodb_flush_log_at_trx_commit = 1
innodb_log_buffer_size = 32M
innodb_log_file_size = 2G
innodb_log_files_in_group = 2
innodb_max_undo_log_size = 4G
innodb_undo_directory = /data/mysql3306/undolog
innodb_undo_tablespaces = 95

# 根据您的服务器IOPS能力适当调整
# 一般配普通SSD盘的话,可以调整到 10000 - 20000
# 配置高端PCIe SSD卡的话,则可以调整的更高,比如 50000 - 80000
innodb_io_capacity = 4000
innodb_io_capacity_max = 8000
innodb_flush_neighbors = 0
innodb_write_io_threads = 8
innodb_read_io_threads = 8
innodb_purge_threads = 4
innodb_page_cleaners = 4
innodb_open_files = 65535
innodb_max_dirty_pages_pct = 50
innodb_flush_method = O_DIRECT
innodb_lru_scan_depth = 4000
innodb_checksum_algorithm = crc32
innodb_lock_wait_timeout = 10
innodb_rollback_on_timeout = 1
innodb_print_all_deadlocks = 1
innodb_file_per_table = 1
innodb_online_alter_log_max_size = 4G
internal_tmp_disk_storage_engine = InnoDB
innodb_stats_on_metadata = 0

# some var for MySQL 5.7
innodb_checksums = 1
#innodb_file_format = Barracuda
#innodb_file_format_max = Barracuda
query_cache_size = 0
query_cache_type = 0
innodb_undo_logs = 128

innodb_status_file = 1
# 注意: 开启 innodb_status_output & innodb_status_output_locks 后, 可能会导致log-error文件增长较快
innodb_status_output = 0
innodb_status_output_locks = 0
key_buffer_size = 32M
read_buffer_size = 8M
read_rnd_buffer_size = 4M
bulk_insert_buffer_size = 64M
lock_wait_timeout = 3600

#performance_schema
performance_schema = 1
performance_schema_instrument = '%=on'

#innodb monitor
innodb_monitor_enable="module_innodb"
innodb_monitor_enable="module_server"
innodb_monitor_enable="module_dml"
innodb_monitor_enable="module_ddl"
innodb_monitor_enable="module_trx"
innodb_monitor_enable="module_os"
innodb_monitor_enable="module_purge"
innodb_monitor_enable="module_log"
innodb_monitor_enable="module_lock"
innodb_monitor_enable="module_buffer"
innodb_monitor_enable="module_index"
innodb_monitor_enable="module_ibuf_system"
innodb_monitor_enable="module_buffer_page"
innodb_monitor_enable="module_adaptive_hash"

[mysqldump]
quick
max_allowed_packet = 32M
配置自动启动system文件

创建systemd自动启动文件,方便维护,自启动配置文件如下(mysqld.service),创建完成后,放到/usr/lib/systemd/system/目录下:

代码语言:shell
复制
# Copyright (c) 2015, 2016, Oracle and/or its affiliates. All rights reserved.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; version 2 of the License.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301 USA
#
# systemd service file for MySQL forking server
#

[Unit]
Description=MySQL Server
Documentation=man:mysqld(7)
Documentation=http://dev.mysql.com/doc/refman/en/using-systemd.html
After=network.target
After=syslog.target

[Install]
WantedBy=multi-user.target

[Service]
User=mysql
Group=mysql

Type=forking
#Type=sample

#PIDFile=/data/mysql3306/run/mysqld.pid

# Disable service start and stop timeout logic of systemd for mysqld service.
TimeoutSec=0

# Execute pre and post scripts as root
PermissionsStartOnly=true

# Needed to create system tables
#ExecStartPre=/usr/bin/mysqld_pre_systemd

# Start main service
#ExecStart=/usr/local/mysql/bin/mysqld --daemonize --pid-file=/data/mysql3306/run/mysqld.pid $MYSQLD_OPTS
ExecStart=/usr/local/mysql/bin/mysqld --defaults-file=/data/mysql3306/conf/my.cnf --daemonize

# Use this to switch malloc implementation
EnvironmentFile=-/etc/sysconfig/mysql

# Sets open_files_limit
LimitNOFILE = 65535

Restart=on-failure

RestartPreventExitStatus=1

PrivateTmp=false
初始化mysql
代码语言:shell
复制
# 初始化
mysqld --defaults-file=/data/mysql3306/conf/my.cnf --initialize --user=mysql --basedir=/usr/local/mysql --datadir=/data/mysql3306/mysql
# 配置ssl
mysql_ssl_rsa_setup --basedir=/usr/local/mysql --datadir=/data/mysql3306/mysql
启动Mysql实例
代码语言:shell
复制
systemctl enable mysqld
systemctl start mysqld
# 配置密码
mysqladmin -S /data/mysql3306/run/mysql.sock -uroot password 'o6vTAkyDLqAyVyuA3W9J' -p'r*tig_;fu5%A'
# 修改root远程登陆
mysql> update mysql.user set Host='172.28.81.%' where User='root';
mysql> flush privileges;

Mysql 复制配置(master,slave,slave)

三个库上创建复制用户,mha管理用户
代码语言:shell
复制
mysql -h172.28.81.2 -uroot -p'o6vTAkyDLqAyVyuA3W9J'
create user 'repl'@'172.28.81.%' identified by 'NlfG53idFgykMrlw1w4d';
create user 'mha'@'172.28.81.%' identified by 'NlfG53idFgykMrlw1w4d';
grant replication slave,replication client,super on *.* to 'repl'@'172.28.81.%';
grant all privileges on *.* to 'mha'@'172.28.81.%';
flush privileges;
主库导出数据(新搭建集群可忽略)
代码语言:shell
复制
mysqldump --single-transaction -uroot -proot123 --opt --master-data=2 --flush-logs --events --triggers --routines -A > all.sql
从库导入数据
代码语言:shell
复制
# 清空gtid_executed
reset master
# 导入数据
mysql -uroot -p < all.sql
在master查看状态
代码语言:shell
复制
show master status
https://tc.ctq6.cn/tc/20220423085935.png
https://tc.ctq6.cn/tc/20220423085935.png
两个从库配置只读
代码语言:shell
复制
set global read_only=1;
两个从库配置主从
代码语言:shell
复制
# 配置主从
CHANGE MASTER TO
MASTER_HOST='172.28.81.2',
MASTER_USER='repl',
MASTER_PASSWORD='NlfG53idFgykMrlw1w4d',
MASTER_PORT=3306,
MASTER_AUTO_POSITION = 1;
# 开启主从
start slave
检查主库从库主从状态
代码语言:shell
复制
show slave status;
https://tc.ctq6.cn/tc/20220423090907.png
https://tc.ctq6.cn/tc/20220423090907.png
测试

主库上创建数据s

代码语言:shell
复制
create database tests DEFAULT CHARACTER SET utf8mb4;

从库上查看,发现数据已经同步过来,再次查看主从状态,发现已经同步更新

https://tc.ctq6.cn/tc/20220423092001.png
https://tc.ctq6.cn/tc/20220423092001.png
https://tc.ctq6.cn/tc/20220423091541.png
https://tc.ctq6.cn/tc/20220423091541.png

到这里,整个Mysql一主两从高可用集群架构已经搭建完毕,剩下即为MHA搭建配置

部署 MHA软件

配置免密登陆
代码语言:shell
复制
ssh-keygen -t rsa  -f /root/.ssh/id_rsa -P ''
ssh-copy-id 172.28.81.2 
ssh-copy-id 172.28.81.3
ssh-copy-id 172.28.81.4

scp -r  /root/.ssh/ 172.28.81.3:/root/
scp -r  /root/.ssh/ 172.28.81.4:/root/
安装依赖(三个节点)
代码语言:shell
复制
yum install perl-DBD-MySQL -y
yum install -y perl-Module-Install
安装MHA Node(三个节点)

1、下载安装包

代码语言:shell
复制
wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58.tar.gz -P /usr/local/src

2、编译安装

代码语言:shell
复制
tar zxvf /usr/local/src/mha4mysql-node-0.58.tar.gz -C /usr/local/
cd /usr/local/mha4mysql-node-0.58
perl Makefile.PL
make && make install
安装MHA Manager(仅在Manger管理结点安装)

1、下载安装包

代码语言:shell
复制
wget https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58.tar.gz -P /usr/local/src

2、编译安装

代码语言:shell
复制
tar zxvf /usr/local/src/mha4mysql-manager-0.58.tar.gz -C /usr/local/
cd /usr/local/mha4mysql-manager-0.58
perl Makefile.PL
make && make install
yum install mailx -y
yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager  perl-Time-HiRes perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker -y
配置MHA
创建目录
代码语言:shell
复制
mkdir -pv /etc/mha/scripts
配置主配置文件

在/etc/mha目录下创建主配置文件(/etc/mha/app1.conf)

代码语言:shell
复制
[server default]
manager_workdir=/var/log/mha/app1
manager_log=/var/log/mha/app1/manager.log
# mha管理用户
user=mha
# mha管理密码
password=NlfG53idFgykMrlw1w4d 
#基于ssh的密钥认证 
ssh_user=root         
#数据库用户名
repl_user=repl
#数据库密码
repl_password=NlfG53idFgykMrlw1w4d
# ping间隔时长
ping_interval=1
master_binlog_dir=/data/mysql3306/binlog
secondary_check_script=masterha_secondary_check -s 172.28.81.2 -s 172.28.81.3 -s 172.28.81.4
# 自动切换时vip管理的脚本
master_ip_failover_script="/etc/mha/scripts/master_ip_failover"
# 在线切换时vip的管理脚本
master_ip_online_change_script="/etc/mha/scripts/master_ip_online_change"
# 因故障切换后发送报警的脚本
report_script="/etc/mha/scripts/send_report"

[server1]
hostname=172.28.81.2
# master候选节点
candidate_master=1
master_binlog_dir="/data/mysql3306/binlog"
#查看方式 find / -name mysql-bin*

[server2]
hostname=172.28.81.3
candidate_master=1
master_binlog_dir="/data/mysql3306/binlog"

[server3]
hostname=172.28.81.4
master_binlog_dir="/data/mysql3306/binlog"
#表示没有机会成为master
no_master=1
薪增自动切换脚本

在/etc/mha/scripts目录下创建自动切换脚本(/etc/mha/scripts/master_ip_failover)

代码语言:perl
复制
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;

my (
$command,   $ssh_user,  $orig_master_host,
$orig_master_ip,$orig_master_port, $new_master_host, $new_master_ip,$new_master_port
);

######## 定义VIP变量 , 需要修改其中的vip 和 网口名称  ########
my $vip = '172.28.81.100/20';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";

GetOptions(
'command=s'     => \$command,
'ssh_user=s'        => \$ssh_user,
'orig_master_host=s'    => \$orig_master_host,
'orig_master_ip=s'  => \$orig_master_ip,
'orig_master_port=i'    => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s'   => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);

exit &main();

sub main {
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
    my $exit_code = 1;
    eval {
        print "Disabling the VIP on old master: $orig_master_host \n";
        &stop_vip();
        $exit_code = 0;
    };
    if ($@) {
        warn "Got Error: $@\n";
        exit $exit_code;
    }
    exit $exit_code;
}

elsif ( $command eq "start" ) {
my $exit_code = 10;
eval {
    print "Enabling the VIP - $vip on the new master - $new_master_host \n";
    &start_vip();
    $exit_code = 0;
};

if ($@) {
    warn $@;
    exit $exit_code;
    }
exit $exit_code;
}

elsif ( $command eq "status" ) {
    print "Checking the Status of the script.. OK \n";
    exit 0;
}
else {
    &usage();
    exit 1;
}
}

sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
return 0 unless ($ssh_user);
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
薪增邮件报警脚本

在/etc/mha/scripts目录下创建邮件报警脚本(/etc/mha/scripts/send_report)

代码语言:shell
复制
# mail邮件发送程序,需要先配置好发送这信息
vim /etc/mail.rc
set from=plyx_46204@126.com
set smtp=smtps://smtp.126.com:465
set smtp-auth-user=plyx_46204@126.com
set smtp-auth-password=xxxxxxx
set smtp-auth=login
set nss-config-dir=/etc/pki/nssdb
set ssl-verify=ignore

# 这是具体的邮件发送脚本
vim /etc/mha/scripts/send_report
#!/bin/bash
source /root/.bash_profile
# 解析变量
orig_master_host=`echo "$1" | awk -F = '{print $2}'`
new_master_host=`echo "$2" | awk -F = '{print $2}'`
new_slave_hosts=`echo "$3" | awk -F = '{print $2}'`
subject=`echo "$4" | awk -F = '{print $2}'`
body=`echo "$5" | awk -F = '{print $2}'`
########定义收件人地址########
email="plyx_46204@126.com"

tac /var/log/mha/app1/manager.log | sed -n 2p | grep 'successfully' > /dev/null
if [ $? -eq 0 ];then
    messages=`echo -e "MHA $subject 主从切换成功\n master:$orig_master_host --> $new_master_host \n $body \n 当前从库:$new_slave_hosts"`
    echo "$messages" | mail -s "Mysql 实例宕掉,MHA $subject 切换成功" $email >>/tmp/mailx.log 2>&1
else
    messages=`echo -e "MHA $subject 主从切换失败\n master:$orig_master_host --> $new_master_host \n $body" `
    echo "$messages" | mail -s ""Mysql 实例宕掉,MHA $subject 切换失败"" $email >>/tmp/mailx.log 2>&1
fi
薪增在线切换master脚本

在/etc/mha/scripts目录下在线切换master脚本(/etc/mha/scripts/master_ip_online_change)

代码语言:shell
复制
#!/bin/bash
source /root/.bash_profile

vip=`echo '172.28.81.100/20'`  #######设置VIP#####
key=`echo '1'`

command=`echo "$1" | awk -F = '{print $2}'`
orig_master_host=`echo "$2" | awk -F = '{print $2}'`
new_master_host=`echo "$7" | awk -F = '{print $2}'`
orig_master_ssh_user=`echo "${12}" | awk -F = '{print $2}'`
new_master_ssh_user=`echo "${13}" | awk -F = '{print $2}'`

#######要求服务的网卡识别名一样,都为ens32(这里是) #######
stop_vip=`echo "ssh root@$orig_master_host /usr/sbin/ifconfig eth0:$key down"`
start_vip=`echo "ssh root@$new_master_host /usr/sbin/ifconfig eth0:$key $vip"`

if [ $command = 'stop' ];then
    echo -e "\n\n\n****************************\n"
    echo -e "Disabled thi VIP - $vip on old master: $orig_master_host \n"
    $stop_vip

    if [ $? -eq 0 ];then
        echo "Disabled the VIP successfully"
    else
        echo "Disabled the VIP failed"
    fi

    echo -e "***************************\n\n\n"
fi    

if [ $command = 'start' -o $command = 'status' ];then
    echo -e "\n\n\n*************************\n"
    echo -e "Enabling the VIP - $vip on new master: $new_master_host \n"
    $start_vip

    if [ $? -eq 0 ];then
        echo "Enabled the VIP successfully"
    else  
        echo "Enabled the VIP failed"
    fi  

    echo -e "***************************\n\n\n"
fi
脚本添加可执行权限
代码语言:shell
复制
chmod +x /etc/mha/scripts/*
验证ssh登陆
代码语言:shell
复制
masterha_check_ssh --conf=/etc/mha/app1.cnf
Sat Apr 23 10:21:09 2022 - [debug]  Connecting via SSH from root@172.28.81.2(172.28.81.2:22) to root@172.28.81.3(172.28.81.3:22)..
Sat Apr 23 10:21:10 2022 - [debug]   ok.
Sat Apr 23 10:21:10 2022 - [debug]  Connecting via SSH from root@172.28.81.2(172.28.81.2:22) to root@172.28.81.4(172.28.81.4:22)..
Sat Apr 23 10:21:10 2022 - [debug]   ok.
Sat Apr 23 10:21:11 2022 - [debug] 
Sat Apr 23 10:21:10 2022 - [debug]  Connecting via SSH from root@172.28.81.3(172.28.81.3:22) to root@172.28.81.2(172.28.81.2:22)..
Sat Apr 23 10:21:10 2022 - [debug]   ok.
Sat Apr 23 10:21:10 2022 - [debug]  Connecting via SSH from root@172.28.81.3(172.28.81.3:22) to root@172.28.81.4(172.28.81.4:22)..
Sat Apr 23 10:21:10 2022 - [debug]   ok.
Sat Apr 23 10:21:12 2022 - [debug] 
Sat Apr 23 10:21:10 2022 - [debug]  Connecting via SSH from root@172.28.81.4(172.28.81.4:22) to root@172.28.81.2(172.28.81.2:22)..
Sat Apr 23 10:21:11 2022 - [debug]   ok.
Sat Apr 23 10:21:11 2022 - [debug]  Connecting via SSH from root@172.28.81.4(172.28.81.4:22) to root@172.28.81.3(172.28.81.3:22)..
Sat Apr 23 10:21:11 2022 - [debug]   ok.
Sat Apr 23 10:21:12 2022 - [info] All SSH connection tests passed successfully.
验证主从是否成功
代码语言:shell
复制
masterha_check_repl --conf=/etc/mha/app1.cnf
IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 172.28.81.100/24===

Checking the Status of the script.. OK 
Sat Apr 23 10:33:59 2022 - [info]  OK.
Sat Apr 23 10:33:59 2022 - [warning] shutdown_script is not defined.
Sat Apr 23 10:33:59 2022 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

启动MHA

绑定VIP
代码语言:shell
复制
/sbin/ifconfig eth0:1 172.28.81.100/20
启动MHA
代码语言:shell
复制
mkdir /var/log/mha/app1 -p
touch /var/log/mha/app1/manager.log
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &
Sat Apr 23 11:55:43 2022 - [warning] shutdown_script is not defined.
Sat Apr 23 11:55:43 2022 - [info] Set master ping interval 1 seconds.
Sat Apr 23 11:55:43 2022 - [info] Set secondary check script: masterha_secondary_check -s 172.28.81.2 -s 172.28.81.3 -s 172.28.81.4
Sat Apr 23 11:55:43 2022 - [info] Starting ping health check on 172.28.81.2(172.28.81.2:3306)..
Sat Apr 23 11:55:43 2022 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
检查集群状态
代码语言:shell
复制
masterha_check_status --conf=/etc/mha/app1.cnf
[root@mha03 mha4mysql-manager-0.58]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:13740) is running(0:PING_OK), master:172.28.81.2

自动切换测试

停止主库mysqld服务
代码语言:shell
复制
systemctl stop mysqld
在manager中查看日志

172.28.81.3成功成为master,172.28.81.2成为172.28.81.3的slave

代码语言:shell
复制
app1: MySQL Master failover 172.28.81.2(172.28.81.2:3306) to 172.28.81.3(172.28.81.3:3306) succeeded

Master 172.28.81.2(172.28.81.2:3306) is down!

Check MHA Manager logs at mha03:/var/log/mha/app1/manager.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on 172.28.81.2(172.28.81.2:3306)
Selected 172.28.81.3(172.28.81.3:3306) as a new master.
172.28.81.3(172.28.81.3:3306): OK: Applying all logs succeeded.
172.28.81.3(172.28.81.3:3306): OK: Activated master IP address.
172.28.81.4(172.28.81.4:3306): OK: Slave started, replicating from 172.28.81.3(172.28.81.3:3306)
172.28.81.3(172.28.81.3:3306): Resetting slave info succeeded.
Master failover to 172.28.81.3(172.28.81.3:3306) completed successfully.
Sat Apr 23 12:13:47 2022 - [info] Sending mail..

https://tc.ctq6.cn/tc/20220423121431.png
https://tc.ctq6.cn/tc/20220423121431.png
查看主从信息

在slave中查看主从信息,master指向了Candicate master,并且vip也指向了Candicate master

https://tc.ctq6.cn/tc/20220423122733.png
https://tc.ctq6.cn/tc/20220423122733.png
https://tc.ctq6.cn/tc/20220423122839.png
https://tc.ctq6.cn/tc/20220423122839.png

注意:(实验测试当物理故障时,没有指定shutdown_script是没用的,不切换)

修复老的master,指向新的master

1、修改manager配置文件(只针在线切换的,自动切换不会删除配置)

代码语言:shell
复制
将如下内容添加到/etc/mha/app1.conf 中
[server1]
candidate_master=1
hostname=172.28.81.2
master_binlog_dir="/data/mysql3306/binlog"

2、修改老的master,然后设置为slave

代码语言:shell
复制
systemctl restart mysqld
reset master;
CHANGE MASTER TO
MASTER_HOST='172.28.81.3',
MASTER_USER='repl',
MASTER_PASSWORD='NlfG53idFgykMrlw1w4d',
MASTER_PORT=3306,
MASTER_AUTO_POSITION = 1;
start slave;
set global read_only=1;

3、在master结点上重启监控进程

代码语言:shell
复制
# 三个结点都执行
ln -sf /usr/local/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog
ln -sf /usr/local/mysql/bin/mysql /usr/bin/mysql
# 启动manager 监控进程
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &

IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 172.28.81.100/20===

Checking the Status of the script.. OK 
Sat Apr 23 13:47:08 2022 - [info]  OK.
Sat Apr 23 13:47:08 2022 - [warning] shutdown_script is not defined.
Sat Apr 23 13:47:08 2022 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

手动切换测试

注意点:前提,mha监控没有运行的情况下,才能进行

手动切换时,mha不能运行,需要停掉mha
代码语言:shell
复制
masterha_stop --conf=/etc/mha/app1.cnf
MHA Manager is not running on app1(2:NOT_RUNNING).
执行切换
代码语言:shell
复制
#需要填写新的master的IP
masterha_master_switch --conf=/etc/mha/app1.cnf --master_state=alive --new_master_host=172.28.81.3 --orig_master_is_new_slave --running_updates_limit=10000 --interactive=0
Sat Apr 23 13:52:35 2022 - [info]  ok.
Sat Apr 23 13:52:35 2022 - [info] Setting read_only=0 on 172.28.81.2(172.28.81.2:3306)..
Sat Apr 23 13:52:35 2022 - [info]  ok.
Sat Apr 23 13:52:35 2022 - [info] 
Sat Apr 23 13:52:35 2022 - [info] * Switching slaves in parallel..
Sat Apr 23 13:52:35 2022 - [info] 
Sat Apr 23 13:52:35 2022 - [info] -- Slave switch on host 172.28.81.4(172.28.81.4:3306) started, pid: 16483
Sat Apr 23 13:52:35 2022 - [info] 
Sat Apr 23 13:52:36 2022 - [info] Log messages from 172.28.81.4 ...
Sat Apr 23 13:52:36 2022 - [info] 
Sat Apr 23 13:52:35 2022 - [info]  Waiting to execute all relay logs on 172.28.81.4(172.28.81.4:3306)..
Sat Apr 23 13:52:35 2022 - [info]  master_pos_wait(mysql-bin.000001:2296) completed on 172.28.81.4(172.28.81.4:3306). Executed 0 events.
Sat Apr 23 13:52:35 2022 - [info]   done.
Sat Apr 23 13:52:35 2022 - [info]  Resetting slave 172.28.81.4(172.28.81.4:3306) and starting replication from the new master 172.28.81.2(172.28.81.2:3306)..
Sat Apr 23 13:52:35 2022 - [info]  Executed CHANGE MASTER.
Sat Apr 23 13:52:35 2022 - [info]  Slave started.
Sat Apr 23 13:52:36 2022 - [info] End of log messages from 172.28.81.4 ...
Sat Apr 23 13:52:36 2022 - [info] 
Sat Apr 23 13:52:36 2022 - [info] -- Slave switch on host 172.28.81.4(172.28.81.4:3306) succeeded.
Sat Apr 23 13:52:36 2022 - [info] Unlocking all tables on the orig master:
Sat Apr 23 13:52:36 2022 - [info] Executing UNLOCK TABLES..
Sat Apr 23 13:52:36 2022 - [info]  ok.
Sat Apr 23 13:52:36 2022 - [info] Starting orig master as a new slave..
Sat Apr 23 13:52:36 2022 - [info]  Resetting slave 172.28.81.3(172.28.81.3:3306) and starting replication from the new master 172.28.81.2(172.28.81.2:3306)..
Sat Apr 23 13:52:36 2022 - [info]  Executed CHANGE MASTER.
Sat Apr 23 13:52:36 2022 - [info]  Slave started.
Sat Apr 23 13:52:36 2022 - [info] All new slave servers switched successfully.
Sat Apr 23 13:52:36 2022 - [info] 
Sat Apr 23 13:52:36 2022 - [info] * Phase 5: New master cleanup phase..
Sat Apr 23 13:52:36 2022 - [info] 
Sat Apr 23 13:52:36 2022 - [info]  172.28.81.2: Resetting slave info succeeded.
Sat Apr 23 13:52:36 2022 - [info] Switching master to 172.28.81.2(172.28.81.2:3306) completed successfully.
检查

vip已经成功转到了mysql master机器上,slave机器的都变成了mysql master的从了

https://tc.ctq6.cn/tc/20220423135449.png
https://tc.ctq6.cn/tc/20220423135449.png
本文参与 腾讯云自媒体同步曝光计划,分享自作者个人站点/博客。
原始发表:2022-04-20,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 作者个人站点/博客 前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • MHA 简介
    • 组件介绍
      • MHA Manager
      • MHA Node
    • 故障转移过程
      • MHA工作原理
        • MHA 工具介绍
          • MHA架构
          • MHA 实践
            • 角色规划
              • 基础环境配置
                • 关闭防火请和selinux
                • 配置源和epel源
                • 配置时间同步
              • Mysql 实例安装(master,slave,slave)
                • 下载
                • 创建mysql系统用户
                • 创建数据目录
                • 创建配置文件
                • 配置自动启动system文件
                • 初始化mysql
                • 启动Mysql实例
              • Mysql 复制配置(master,slave,slave)
                • 三个库上创建复制用户,mha管理用户
                • 主库导出数据(新搭建集群可忽略)
                • 从库导入数据
                • 在master查看状态
                • 两个从库配置只读
                • 两个从库配置主从
                • 检查主库从库主从状态
                • 测试
              • 部署 MHA软件
                • 配置免密登陆
                • 安装依赖(三个节点)
                • 安装MHA Node(三个节点)
                • 安装MHA Manager(仅在Manger管理结点安装)
                • 配置MHA
              • 启动MHA
                • 绑定VIP
                • 启动MHA
                • 检查集群状态
              • 自动切换测试
                • 停止主库mysqld服务
                • 在manager中查看日志
                • 查看主从信息
                • 修复老的master,指向新的master
              • 手动切换测试
                • 手动切换时,mha不能运行,需要停掉mha
                • 执行切换
                • 检查
            相关产品与服务
            云数据库 SQL Server
            腾讯云数据库 SQL Server (TencentDB for SQL Server)是业界最常用的商用数据库之一,对基于 Windows 架构的应用程序具有完美的支持。TencentDB for SQL Server 拥有微软正版授权,可持续为用户提供最新的功能,避免未授权使用软件的风险。具有即开即用、稳定可靠、安全运行、弹性扩缩等特点。
            领券
            问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档