基于腾讯云CVM构建MySQL MHA Keepalived集群

1 环境介绍

1.1 MHA简单介绍

MHA是由perl语言编写的一款开源的MySQL的高可用程序,为MySQL主从复制架构提供了automating master failover功能。MHA可以自动检测mysql是否宕机,如果宕机,在10-30s内完成new master的选举,应用所有差异的binlog日志到所有slave,将所有的slave切换到新的master上来。

由于MHA本身只负责数据库主从的切换,但是应用程序并不知道数据库的master变了。针对这种情况,可以使用MHA预留的几个脚本接口,通过虚拟IP或者修改全局配置文件的方法通知应用程序,master数据库已经改变。

MHA服务有两种角色,MHA Manager(管理节点)和HMA Node(数据节点):

(1)MHA Manager:通常部署在一台独立机器上管理多个master/slave集群,每个master/slave集群称作一个application;

(2)MHA node:运行在每台MySQL服务器上(master/slave/manager),他通过监控解析和清理logs功能的脚本来加快故障转移。

1.2 环境部署

如下图所示,使用5台腾讯云centos 7.5 CVM搭建MHA集群环境。准备集群环境,安装依赖包,授权用户,配置ssh密钥对认证登陆,所有节点之间互相以root秘钥对认证登录,管理主机以root密钥对认证登录所有数据节点主机,配置mha集群。

IP规划如下:

角色

IP地址

主机名

安装软件包

MHA-master主节点服务器

10.10.2.7

master2-7

Mysql,MHA node,keepalived

MHA-备份master1服务器

10.10.2.12

master2-12

Mysql,MHA node,keepalived

MHA-备份master2服务器

10.10.2.11

master2-11

Mysql,MHA node,keepalived

MHA-slave从节点服务器

10.10.2.9

slave2-9

Mysql,MHA node

MHA-manager管理节点服务器

10.10.2.17

mgm2-17

MHA manager

VIP地址

10.10.2.100

2 MHA集群环境搭建

2.1 环境初始化配置

(1)修改hostname,退出重新登录生效(所有的数据节点和MHA管理节点同样操作,以10.10.2.7为例)

[root@VM_2_7_centos ~]# hostnamectl set-hostname master2-7
[root@VM_2_7_centos ~]# exit

(2)centos 7默认安装mariadb,查看并卸载,否则安装MySQL时会冲突

[root@master2-7 ~]# rpm -qa | grep mariadb
mariadb-libs-5.5.56-2.el7.x86_64
[root@master2-7 ~]# rpm -e --nodeps mariadb-libs

2.2 MySQL主从搭建

步骤一:安装MySQL软件包

(1)安装mysql时可能会缺少某些依赖包,需提前单独安装

[root@master2-7 ~]# yum -y install perl-Data-Dumper  perl-JSON  perl-Time-HiRes numactl

(2)创建MySQL目录

[root@master2-7 ~]# mkdir  -p /usr/mysql
[root@master2-7 ~]# cd /usr/mysql/

(3)wget下载mysql-5.7.25.tar 整合包

[root@master2-7 mysql]# wget wget https://dev.mysql.com/get/Downloads/MySQL-5.7/mysql-5.7.25-1.el7.x86_64.rpm-bundle.tar

(4)解压mysql-5.7.25.tar 整合包到/usr/mysql下

[root@master2-7 mysql]# tar xvf mysql-5.7.25-1.el7.x86_64.rpm-bundle.tar
mysql-community-libs-5.7.25-1.el7.x86_64.rpm
mysql-community-libs-compat-5.7.25-1.el7.x86_64.rpm
mysql-community-embedded-devel-5.7.25-1.el7.x86_64.rpm
mysql-community-embedded-5.7.25-1.el7.x86_64.rpm
mysql-community-client-5.7.25-1.el7.x86_64.rpm
mysql-community-server-5.7.25-1.el7.x86_64.rpm
mysql-community-embedded-compat-5.7.25-1.el7.x86_64.rpm
mysql-community-test-5.7.25-1.el7.x86_64.rpm
mysql-community-devel-5.7.25-1.el7.x86_64.rpm
mysql-community-common-5.7.25-1.el7.x86_64.rpm

(5)安装MySQL

#按照以下顺序进行安装,因为它们之间存在依赖关系
common --> libs --> clients --> server --> libs-compat
[root@master2-7 mysql]# rpm -ivh mysql-community-common-5.7.25-1.el7.x86_64.rpm
warning: mysql-community-common-5.7.25-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-common-5.7.25-1.e################################# [100%]

[root@master2-7 mysql]# rpm -ivh mysql-community-libs-5.7.25-1.el7.x86_64.rpm
warning: mysql-community-libs-5.7.25-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-libs-5.7.25-1.el7################################# [100%]
 
[root@master2-7 mysql]# rpm -ivh mysql-community-client-5.7.25-1.el7.x86_64.rpm
warning: mysql-community-client-5.7.25-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-client-5.7.25-1.e################################# [100%]
   
[root@master2-7 mysql]# rpm -ivh mysql-community-server-5.7.25-1.el7.x86_64.rpm
warning: mysql-community-server-5.7.25-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-server-5.7.25-1.e################################# [100%]
   
[root@master2-7 mysql]# rpm -ivh mysql-community-libs-compat-5.7.25-1.el7.x86_64.rpm
warning: mysql-community-libs-compat-5.7.25-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-libs-compat-5.7.2################################# [100%]

(6)安装好后,可以去检查是否存在MySQL的配置文件或者MySQL的命令在/usr/bin下

[root@master2-7 mysql]# ls /etc/my.cnf
/etc/my.cnf
[root@master2-7 mysql]# ls /usr/bin/ | grep mysql
mysql
mysqladmin
mysqlbinlog
mysqlcheck
mysql_config_editor
mysqld_pre_systemd
mysqldump
mysqldumpslow
mysqlimport
mysql_install_db
mysql_plugin
mysqlpump
mysql_secure_installation
mysqlshow
mysqlslap
mysql_ssl_rsa_setup
mysql_tzinfo_to_sql
mysql_upgrade

步骤二:启动MySQL数据库服务并设置开机自启

(1)启动MySQL数据库服务并设置开机自启

[root@master2-7 mysql]# systemctl start mysqld
[root@master2-7 mysql]# systemctl enable mysqld

(2)查看MySQL服务状态

[root@master2-7 mysql]# systemctl status mysqld
● mysqld.service - MySQL Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2019-04-03 11:09:23 CST; 38s ago
     Docs: man:mysqld(8)
           http://dev.mysql.com/doc/refman/en/using-systemd.html
 Main PID: 3965 (mysqld)
   CGroup: /system.slice/mysqld.service
           └─3965 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid

Apr 03 11:09:17 master2-7 systemd[1]: Starting MySQL Server...
Apr 03 11:09:23 master2-7 systemd[1]: Started MySQL Server.
[root@master2-7 mysql]# ps -ef | grep mysqld
mysql     3965     1  0 11:09 ?        00:00:00 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid
root      4119  1399  0 11:10 pts/0    00:00:00 grep --color=auto mysqld

步骤三:连接MySQL服务器,修改密码

(1)mysql 5.7有默认密码,需要查看随机生成的root管理密码

[root@master2-7 mysql]# cat /var/log/mysqld.log  | grep password
2019-04-03T03:09:18.427243Z 1 [Note] A temporary password is generated for root@localhost: S?!SPerVi4or
2019-04-03T03:12:09.621739Z 2 [Note] Access denied for user 'root'@'localhost' (using password: NO)

(2)使用客户端命令mysql连接到MySQL服务器

[root@master2-7 mysql]# mysql -uroot -p'S?!SPerVi4or'

(3)用该密码登录到服务端后,必须马上修改密码,不然会报如下错误:

mysql> show databases;
ERROR 1820 (HY000): You must reset your password using ALTER USER statement before executing this statement.

(4)执行SET PASSWORD命令修改密码

这个其实与validate_password_policy的值有关,默认为1,所以刚开始设置的密码必须符合长度,且必须含有数字,小写或大写字母,特殊字符。如果不希望密码设置的那么复杂,需要修改两个全局参数:validate_password_policy与validate_password_length。validate_password_length默认值为8,最小值为4,如果你显性指定validate_password_length的值小于4,尽管不会报错,但validate_password_length的值将设为4。

可参考下列指令(命令行操作重启失效):

mysql> set global validate_password_policy=0;   #只验证长度
Query OK, 0 rows affected (0.00 sec)

mysql> set global validate_password_length=6;   #修改密码长度,默认值是8个字符
Query OK, 0 rows affected (0.00 sec)

mysql> alter user user() identified by "123456";    #修改登陆密码
Query OK, 0 rows affected (0.00 sec)

(5)退出重新登录

mysql> exit
Bye
[root@master2-7 mysql]# mysql -uroot -p123456
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.7.25 MySQL Community Server (GPL)

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

步骤四:配置主从同步

(1)MHA-master(10.10.2.7)主节点数据库服务器配置文件

[root@master2-7 ~]# vim /etc/my.cnf
[mysqld]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
validate_password_policy=0
validate_password_length=6
log-bin=master2-7
binlog-format='mixed'
server-id=7  #确保每台不相同
[root@master2-7 mysql]# systemctl restart mysqld
[root@master2-7 ~]# mysql -uroot -p123456
mysql> grant  replication slave  on  *.*  to repluser@"%"  identified by "123456";   #添加主从同步授权用户,10.10.2.12和2.11也需要有此授权用户,否则在MHA将备切换成主后,主从同步失败
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> flush privileges;

mysql> show master status;  #查看master当前使用的binlog文件和position
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| master2-7.000002 |      441 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

(2)MHA-备份master1数据库服务器(10.10.2.12)配置文件

[root@master2-12 ~]# vim /etc/my.cnf
[mysqld]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
validate_password_policy=0
validate_password_length=6
log-bin=master2-12
binlog-format='mixed'
server-id=12
read_only=1
[root@master2-12 mysql]# systemctl restart mysqld
[root@master2-12 ~]# mysql -uroot -p123456
mysql> change master to
    -> master_host='10.10.2.7',
    -> master_user='repluser',
    -> master_password='123456',
    -> master_log_file='master2-7.000002',
    -> master_log_pos=441;
Query OK, 0 rows affected, 2 warnings (0.04 sec)

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

mysql> show slave status\G;
...
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
...

(3)MHA-备份master2数据库服务器(10.10.2.11)配置文件

[root@master2-11 ~]# vim /etc/my.cnf
[mysqld]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
validate_password_policy=0
validate_password_length=6
log-bin=master2-11
binlog-format='mixed'
server-id=11
read_only=1
[root@master2-11 mysql]# systemctl restart mysqld
[root@master2-11 ~]# mysql -uroot -p123456
mysql> change master to
    -> master_host='10.10.2.7',
    -> master_user='repluser',
    -> master_password='123456',
    -> master_log_file='master2-7.000002',
    -> master_log_pos=441;
Query OK, 0 rows affected, 2 warnings (0.04 sec)

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

mysql> show slave status\G;
...
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
...

(4)MHA-slave从节点数据库服务器(10.10.2.9)配置文件

[root@master2-9 ~]# vim /etc/my.cnf
[mysqld]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
validate_password_policy=0
validate_password_length=6
log-bin=master2-9
binlog-format='mixed'
server-id=9
read_only=1
[root@master2-9 mysql]# systemctl restart mysqld
[root@master2-9 ~]# mysql -uroot -p123456
mysql> change master to
    -> master_host='10.10.2.7',
    -> master_user='repluser',
    -> master_password='123456',
    -> master_log_file='master2-7.000002',
    -> master_log_pos=441;
Query OK, 0 rows affected, 2 warnings (0.04 sec)

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

mysql> show slave status\G;
...
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
...

3 配置MHA的集群环境

3.1 给MHA管理节点授权管理的账号和密码

(1)在的master2-7主机(10.10.2.7)上做root的授权,其他的会同步(如果不做,MHA管理节点在验证数据节点的主从同步配置时会出错)

[root@master2-7 ~]# mysql -uroot -p123456

mysql> grant all on *.* to mha@'10.10.2.%' identified by 'mhapass';
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> flush privileges;

(2)在其他数据节点验证账号和密码是否同步,可以看到主从同步账号repluser和MHA管理账号mha,这就就OK了

mysql> select user,host from mysql.user;
+---------------+-----------+
| user          | host      |
+---------------+-----------+
| repluser      | %         |
| mha           | 10.10.2.% |
| mysql.session | localhost |
| mysql.sys     | localhost |
| root          | localhost |
+---------------+-----------+
5 rows in set (0.00 sec)

3.2 MHA安装

(1)管理节点(在管理主机10.10.2.17上安装mha_node 和 mha-manager包)

步骤一:下载MHA软件包到本地

[root@mgm2-17 ~]# wget https://teamblog-1251829167.cos.ap-beijing.myqcloud.com/%E5%9F%BA%E4%BA%8E%E8%85%BE%E8%AE%AF%E4%BA%91CVM%E6%9E%84%E5%BB%BAMHA%20MySQL%E9%9B%86%E7%BE%A4/mha-soft.zip

--2019-04-03 19:39:52--  https://teamblog-1251829167.cos.ap-beijing.myqcloud.com/%E5%9F%BA%E4%BA%8E%E8%85%BE%E8%AE%AF%E4%BA%91CVM%E6%9E%84%E5%BB%BAMHA%20MySQL%E9%9B%86%E7%BE%A4/mha-soft.zip
Resolving teamblog-1251829167.cos.ap-beijing.myqcloud.com (teamblog-1251829167.cos.ap-beijing.myqcloud.com)... 140.143.114.236, 140.143.117.199, 154.8.145.20, ...
Connecting to teamblog-1251829167.cos.ap-beijing.myqcloud.com (teamblog-1251829167.cos.ap-beijing.myqcloud.com)|140.143.114.236|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 499351 (488K) [application/zip]
Saving to: ‘mha-soft.zip’

100%[==================================================================================================================================================================>] 499,351     1.30MB/s   in 0.4s

2019-04-03 19:39:53 (1.30 MB/s) - ‘mha-soft.zip’ saved [499351/499351]

步骤二:安装mha_node

[root@mgm2-17 ~]# unzip mha-soft.zip
[root@mgm2-17 ~]# cd mha-soft/
[root@mgm2-17 mha-soft]# yum -y install  perl-*.rpm     #Perl依赖包
[root@mgm2-17 mha-soft]# yum -y  install perl-DBD-mysql   perl-DBI

[root@mgm2-17 mha-soft]# rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:mha4mysql-node-0.56-0.el6        ################################# [100%]
   
[root@mgm2-17 mha4mysql-manager-0.56]# yum -y  install perl-ExtUtils-*   perl-CPAN-*

步骤三:安装mha_manager

[root@mgm2-17 mha-soft]# cd mha4mysql-manager-0.56/
[root@mgm2-17 mha4mysql-manager-0.56]# yum -y  install perl-ExtUtils-*   perl-CPAN-*
[root@mgm2-17 mha4mysql-manager-0.56]# perl  Makefile.PL
*** Module::AutoInstall version 1.03
*** Checking for Perl dependencies...
[Core Features]
- DBI                   ...loaded. (1.627)
- DBD::mysql            ...loaded. (4.023)
- Time::HiRes           ...loaded. (1.9725)
- Config::Tiny          ...loaded. (2.14)
- Log::Dispatch         ...missing.
- Parallel::ForkManager ...missing.
- MHA::NodeConst        ...loaded. (0.56)
==> Auto-install the 2 mandatory module(s) from CPAN? [y] y
*** Dependencies will be installed the next time you type 'make'.
*** Module::AutoInstall configuration finished.     #配置完成
Checking if your kit is complete...
Looks good
Warning: prerequisite Log::Dispatch 0 not found.
Warning: prerequisite Parallel::ForkManager 0 not found.
Writing Makefile for mha4mysql::manager
Writing MYMETA.yml and MYMETA.json

[root@mgm2-17 mha4mysql-manager-0.56]# make
[root@mgm2-17 mha4mysql-manager-0.56]# make install

步骤四:配置管理主机

[root@mgm2-17 mha4mysql-manager-0.56]# cp bin/* /usr/local/bin/     #提示覆盖,说明安装的时候有,没有可以拷贝过来

[root@mgm2-17 mha4mysql-manager-0.56]# mkdir /etc/mha_manager   #创建工作目录

[root@mgm2-17 mha4mysql-manager-0.56]# cp samples/conf/app1.cnf  /etc/mha_manager    #拷贝样板文件

[root@mgm2-17 mha4mysql-manager-0.56]# cp samples/scripts/master_ip_failover /etc/mha_manager/   #拷贝监控脚本
#编辑主配置文件etc/mha_manager/app1.cnf
[root@mgm2-17 mha4mysql-manager-0.56]# vim /etc/mha_manager/app1.cnf    

[server default]
manager_workdir=/etc/mha_manager
manager_log=/etc/mha_manager/manager.log
master_ip_failover_script=/etc/mha_manager/master_ip_failover

ssh_user=mha   #指定ssh的用户名和端口,用普通用户mha,同时授权免秘钥互相登录的也是该普通用户
ssh_port=22
repl_user=repluser  #主从同步用户名和密码
repl_password=123456
user=mha    #数据库用户名和密码
password=mhapass

[server1]
hostname=10.10.2.7

[server2]
hostname=10.10.2.12
candidate_master=1

[server3]
hostname=10.10.2.11
candidate_master=1

[server4]
hostname=10.10.2.9
no_master=1

(2)数据节点(在所有数据节点服务器上安装mha-node包,以10.10.2.7为例,10.10.2.12和10.10.2.11和10.10.2.9一样操作)

[root@master2-7 ~]# unzip mha-soft.zip
[root@master2-7 ~]# cd mha-soft/
[root@master2-7 mha-soft]# yum -y install  perl-*.rpm     #Perl依赖包
[root@master2-7 mysql]# yum -y  install perl-DBD-mysql   perl-DBI

[root@master2-7 mha-soft]# rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:mha4mysql-node-0.56-0.el6        ################################# [100%]

3.3 配置SSH免秘钥登录

(1)所有数据节点之间可以以普通账号mha互相ssh免秘钥登录(以10.10.2.7为例,其他数据节点配置类似)

[root@master2-7 ~]# useradd mha 
[root@master2-7 ~]# passwd mha
[root@master2-7 ~]# usermod -g mysql  mha   #!!!添加mha为MySQL同组用户,使其有读取MySQL binary/relay log文件和relay_log.info文件的权限,以及日志目录的写权限(用于MHA manager远程连接访问)

[root@master2-7 ~]# su - mha
[mha@master2-7 ~]$ mkdir /home/mha/.ssh
[mha@master2-7 ~]$ chmod 700 /home/mha/.ssh/    #!!!注意查看.ssh目录的权限,是否是 700 (drwx --- ---), 不是的话一定要修改为700
[mha@master2-7 ~]$ cd /home/mha/.ssh
[mha@master2-7 ~]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/mha/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/mha/.ssh/id_rsa.
Your public key has been saved in /home/mha/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:8MarJqr+jiOhTKZ5Cv/OiQuvZb3bqmTkBTtiOIAlO7I mha@master2-7
The key's randomart image is:
+---[RSA 2048]----+
|. .              |
|.+               |
|* .   .          |
|+o o   +         |
|E.+ .   S        |
|oB +   . .       |
|Oo* .   .        |
|*@oo.+..         |
|*B@OO=o          |
+----[SHA256]-----+
#除了传给10.10.2.12之外,2.11和2.9也需要传
[mha@master2-7 ~]$ ssh-copy-id mha@10.10.2.12
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/mha/.ssh/id_rsa.pub"
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
mha@10.10.2.12's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'mha@10.10.2.12'"
and check to make sure that only the key(s) you wanted were added.

(2)配置管理节点(10.10.2.17)通过普通账号免秘钥ssh登录所有数据节点主机

[root@mgm2-17 ~]# useradd mha
[root@mgm2-17 ~]# passwd mha
[root@mgm2-17 ~]# chown -R mha:mha /etc/mha_manager/    #!!!增加manager_workdir(运行MySQL实例服务时的工作目录权限),其中会生成日志文件,将该目录拥有者设为mha

[root@mgm2-17 ~]# su - mha
[mha@mgm2-17 ~]$ cd /home/mha/
[mha@mgm2-17 ~]$ mkdir .ssh
[mha@mgm2-17 ~]$ chmod 700 .ssh/
[mha@mgm2-17 ~]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/mha/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/mha/.ssh/id_rsa.
Your public key has been saved in /home/mha/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:6Y7IrBiXmI/ycbpMimPLxi8BPg+Nx2bqcUKI/t6LKZY mha@mgm2-17
The key's randomart image is:
+---[RSA 2048]----+
|                 |
|                 |
|                 |
|+        .       |
|=.+     S        |
|oO B   .         |
|=+/..   .        |
|=E=B+. o         |
|@*&O+oo .        |
+----[SHA256]-----+
#除传给10.10.2.7之外,还要传给2.12、2.11和2.9
[mha@mgm2-17 ~]$ ssh-copy-id mha@10.10.2.7
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/mha/.ssh/id_rsa.pub"
The authenticity of host '10.10.2.7 (10.10.2.7)' can't be established.
ECDSA key fingerprint is SHA256:TB99bnsOvYcDO1h6Pdtr0+/pTHVCr4BJOODV17fzd+0.
ECDSA key fingerprint is MD5:98:9f:f5:3c:25:bc:14:e4:7f:c5:fd:74:08:da:aa:5d.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
mha@10.10.2.7's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'mha@10.10.2.7'"
and check to make sure that only the key(s) you wanted were added.

3.4 在mha_manager上配置failover切换脚本

failover时,控制VIP转移的脚本。通常有两种方式实现:

方案一:通过命令 "/sbin/ifconfig eth0:1 10.10.2.100"

在master_ip_failover中的关键配置如下:

my $vip = "10.10.2.100/24";
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";

方案二:结合keepalived 来实现(推荐方案,下面以这种方案来演示)

/etc/mha_manager/master_ip_failover添加或者修改的内容意思是当主库数据库发生故障时,会触发MHA切换,MHA Manager会停掉主库上的keepalived服务,触发虚拟ip漂移到备选从库,从而完成切换。当然可以在keepalived里面引入脚本,这个脚本监控mysql是否正常运行,如果不正常,则调用该脚本杀掉keepalived进程。

failover切换脚如下:

[root@mgm2-17 mha_manager]# vim /etc/mha_manager/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);

my $vip = '10.10.2.100';
my $ssh_start_vip = "/etc/init.d/keepalived start";
my $ssh_stop_vip = "/etc/init.d/keepalived stop";

GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {

    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {

        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {

        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
#A simple system call that disable the VIP on the old_master
sub stop_vip() {
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
    "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

3.5 配置keepalived的配置文件

(1)在腾讯云控制台申请HAVIP

image

(2)在master上配置(master2-7 10.10.2.7)操作如下:

[root@master2-7 keepalived-1.2.19]# cat /etc/keepalived/keepalived.conf

global_defs {
   notification_email {
   root@localhost
   }
   notification_email_from keepalived@localhost
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id keepalivedha_1
}

vrrp_instance VI_1 {
    #state MASTER
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
    10.10.2.100/24
    }
}

(3)在候选master1上配置(master2-12 10.10.2.12)操作如下:

[root@master2-12 keepalived-1.2.19]# cat /etc/keepalived/keepalived.conf

global_defs {
   notification_email {
   root@localhost
   }
   notification_email_from keepalived@localhost
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id keepalivedha_2
}

vrrp_instance VI_1 {
    #state MASTER
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 120
    advert_int 1
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
    10.10.2.100/24
    }
}

(4)在候选master2上配置(master2-11 10.10.2.11)操作如下:

[root@master2-11 keepalived-1.2.19]# cat /etc/keepalived/keepalived.conf

global_defs {
   notification_email {
   root@localhost
   }
   notification_email_from keepalived@localhost
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id keepalivedha_3
}

vrrp_instance VI_1 {
    #state MASTER
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 120
    advert_int 1
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
    10.10.2.100/24
    }
}

3.6 启动keepalived并查看IP绑定情况

(1)启动keepalived服务

!!!注意:只需要在master上手动启动,后续MHA管理节点在切换主从的时候会通过failover脚本在对应的主机上开启和关闭keepalived服务

[root@master2-7 keepalived-1.2.19]# /etc/init.d/keepalived start
Starting keepalived (via systemctl):                       [  OK  ]

(2)查看master VIP是否绑定成功

#发现已经将虚拟IP 10.10.2.100绑定了master2-7 10.10.2.7的网卡eth0上了,确认keepalived已经配置成功
[root@master2-7 keepalived-1.2.19]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 52:54:00:57:d0:be brd ff:ff:ff:ff:ff:ff
    inet 10.10.2.7/24 brd 10.10.2.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 10.10.2.100/24 scope global secondary eth0
       valid_lft forever preferred_lft forever

4 测试MHA集群故障转移

(1)启动服务

--remove_dead_master_conf //删除宕机主库配置

--ignore_last_failover //忽略xxx.health文件

[mha@mgm2-17 ~]$ masterha_manager --conf=/etc/mha_manager/app1.cnf

Fri Apr  5 23:04:15 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Apr  5 23:04:15 2019 - [info] Reading application default configuration from /etc/mha_manager/app1.cnf..
Fri Apr  5 23:04:15 2019 - [info] Reading server configuration from /etc/mha_manager/app1.cnf..

(2)查看状态(另开一个终端),看到master为10.10.2.7

[mha@mgm2-17 ~]$ masterha_check_status  --conf=/etc/mha_manager/app1.cnf
app1 (pid:12973) is running(0:PING_OK), master:10.10.2.7

(3)测试vip是否可用

步骤一:在当前主库master2-7上创建表并授权用户

[mha@master2-7 ~]$ mysql -uroot -p123456

mysql> create database mhatest1;
Query OK, 1 row affected (0.01 sec)

mysql> grant all on mhatest1.* to webuser@'%' identified by "123456";
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysql> use mhatest1;
Database changed
mysql>
mysql> create table t1 (id int);
Query OK, 0 rows affected (0.05 sec)

mysql> insert into t1 values(1111);
Query OK, 1 row affected (0.02 sec)

mysql> insert into t1 values(2222);
Query OK, 1 row affected (0.00 sec)

mysql>

步骤二:在管理节点mgm2-17上用VIP连mysql并插入数据,验证

[mha@mgm2-17 ~]$ mysql -h10.10.2.100 -uwebuser -p123456

MySQL [(none)]> select @@hostname;
+------------+
| @@hostname |
+------------+
| master2-7  |
+------------+
1 row in set (0.00 sec)

MySQL [(none)]> select * from mhatest1.t1;
+------+
| id   |
+------+
| 1111 |
| 2222 |
+------+
2 rows in set (0.00 sec)

MySQL [(none)]> use mhatest1
MySQL [mhatest1]>  insert into t1 values(3333);
Query OK, 1 row affected (0.01 sec)

MySQL [mhatest1]> select * from mhatest1.t1;
+------+
| id   |
+------+
| 1111 |
| 2222 |
| 3333 |
+------+
3 rows in set (0.00 sec)

MySQL [mhatest1]>

步骤三:在从库上验证

#在主库备份1上验证
[mha@master2-12 ~]$ mysql -uroot -p123456

mysql> select * from mhatest1.t1;
+------+
| id   |
+------+
| 1111 |
| 2222 |
| 3333 |
+------+
3 rows in set (0.00 sec)

mysql>
#在主库备份2上验证
[mha@master2-11 ~]$ mysql -uroot -p123456

mysql> select * from mhatest1.t1;
+------+
| id   |
+------+
| 1111 |
| 2222 |
| 3333 |
+------+
3 rows in set (0.00 sec)
#在从库上验证
[mha@master2-9 ~]$ mysql -uroot -p123456

mysql> select * from mhatest1.t1;
+------+
| id   |
+------+
| 1111 |
| 2222 |
| 3333 |
+------+
3 rows in set (0.00 sec)

(4)测试MHA高可用(故障切换)

步骤一:手动停止当前主master2-7的mysql服务

[root@master2-7 ~]# systemctl stop mysqld

步骤二:查看监控节点服务状态

[mha@mgm2-17 ~]$ masterha_manager --conf=/etc/mha_manager/app1.cnf --remove_dead_master_conf --ignore_last_failover

Thu Apr  4 16:46:19 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Apr  4 16:46:19 2019 - [info] Reading application default configuration from /etc/mha_manager/app1.cnf..
Thu Apr  4 16:46:19 2019 - [info] Reading server configuration from /etc/mha_manager/app1.cnf..
  Creating /var/tmp if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /var/lib/mysql, up to master2-7.000002
Thu Apr  4 16:46:34 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Apr  4 16:46:34 2019 - [info] Reading application default configuration from /etc/mha_manager/app1.cnf..
Thu Apr  4 16:46:34 2019 - [info] Reading server configuration from /etc/mha_manager/app1.cnf..
#查看日志,看到已经做了failover,master切换到10.10.2.12上
[mha@mgm2-17 ~]$ tail -5 /etc/mha_manager/manager.log

Generating relay diff files from the latest slave succeeded.
10.10.2.9(10.10.2.9:3306): OK: Applying all logs succeeded. Slave started, replicating from 10.10.2.12(10.10.2.12:3306)
10.10.2.11(10.10.2.11:3306): OK: Applying all logs succeeded. Slave started, replicating from 10.10.2.12(10.10.2.12:3306)
10.10.2.12(10.10.2.12:3306): Resetting slave info succeeded.
Master failover to 10.10.2.12(10.10.2.12:3306) completed successfully.

步骤三:查看vip是否漂移

#VIP已经漂移到master2-12上
[mha@master2-12 ~]$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 52:54:00:2a:96:c7 brd ff:ff:ff:ff:ff:ff
    inet 10.10.2.12/24 brd 10.10.2.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 10.10.2.100/24 scope global secondary eth0
       valid_lft forever preferred_lft forever

步骤四:测试客户端连接

[mha@mgm2-17 ~]$ mysql -h10.10.2.100 -uwebuser -p123456

MySQL [(none)]> select @@hostname;
+------------+
| @@hostname |
+------------+
| master2-12 |
+------------+
1 row in set (0.00 sec)

MySQL [(none)]>

步骤五:测试新的主库读写

failover将master2-12切换成主之后,自动将数据库设置成可读写read_only=0

MySQL [(none)]> select @@hostname;
+------------+
| @@hostname |
+------------+
| master2-12 |
+------------+
1 row in set (0.00 sec)

MySQL [mhatest1]> show variables like '%read_only%';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_read_only      | OFF   |
| read_only             | OFF   |
| super_read_only       | OFF   |
| transaction_read_only | OFF   |
| tx_read_only          | OFF   |
+-----------------------+-------+
5 rows in set (0.00 sec)

MySQL [(none)]> use mhatest1

MySQL [mhatest1]> show tables;
+--------------------+
| Tables_in_mhatest1 |
+--------------------+
| t1                 |
+--------------------+
1 row in set (0.00 sec)

MySQL [mhatest1]> insert into t1 values(4444);
Query OK, 1 row affected (0.01 sec)

MySQL [mhatest1]> select * from t1;
+------+
| id   |
+------+
| 1111 |
| 2222 |
| 3333 |
| 4444 |
+------+
4 rows in set (0.00 sec)

步骤六:测试从库的主从同步状态

#查看新主库master2-12的状态
[mha@master2-12 ~]$ mysql -uroot -p123456

mysql> show master status;
+-------------------+----------+--------------+------------------+-------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| master2-12.000002 |      446 |              |                  |                   |
+-------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
#查看备份主库master2-11的主从同步状态,正常
[mha@master2-11 ~]$ mysql -uroot -p123456

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.10.2.12
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: master2-12.000002
          Read_Master_Log_Pos: 446
               Relay_Log_File: master2-11-relay-bin.000002
                Relay_Log_Pos: 613
        Relay_Master_Log_File: master2-12.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 446
              Relay_Log_Space: 825
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 12
                  Master_UUID: 73251ba0-55d8-11e9-a67c-5254002a96c7
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)

ERROR:
No query specified
#查看从库slave2-9的主从同步状态,正常
[mha@master2-9 ~]$ mysql -uroot -p123456

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.10.2.12
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: master2-12.000002
          Read_Master_Log_Pos: 446
               Relay_Log_File: slave2-9-relay-bin.000002
                Relay_Log_Pos: 613
        Relay_Master_Log_File: master2-12.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 446
              Relay_Log_Space: 823
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 12
                  Master_UUID: 73251ba0-55d8-11e9-a67c-5254002a96c7
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)

ERROR:
No query specified

步骤七:将修复的服务器重新添加到集群中(这一步需要人为干预)

!!!master_log_pos这个值很重要,在生产环境中,在宕机的这段时间一定有新数据的写入,如果仅仅是取当前master的position的话,会导致主从数据不一致,一定要取宕机时刻对应的master上的position值!!!

[root@master2-7 ~]# systemctl start mysqld
[root@master2-7 ~]# mysql -uroot -p123456

mysql>  reset slave;
Query OK, 0 rows affected (0.02 sec)


mysql> change master to master_host='10.10.2.12',
    -> master_user='repluser',
    -> master_password="123456",
    ->  master_log_file='master2-12.000002',
    -> master_log_pos=446;
Query OK, 0 rows affected, 2 warnings (0.03 sec)

mysql> start slave;
Query OK, 0 rows affected (0.00 sec)


mysql> show slave status \G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.10.2.12
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: master2-12.000002
          Read_Master_Log_Pos: 446
               Relay_Log_File: master2-7-relay-bin.000002
                Relay_Log_Pos: 321
        Relay_Master_Log_File: master2-12.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 446
              Relay_Log_Space: 532
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 12
                  Master_UUID: 73251ba0-55d8-11e9-a67c-5254002a96c7
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)

ERROR:
No query specified

mysql>

步骤八:将修复的master2-7添加回集群配置(这一步需要人为干预)

(1)需要在配置文件app1.cnf里重新将master2-7的信息添加进来

image

(2)重新测试主从是否健康

[mha@mgm2-17 ~]$ masterha_check_repl --conf=/etc/mha_manager/app1.cnf
Fri Apr  5 23:19:38 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Apr  5 23:19:38 2019 - [info] Reading application default configuration from /etc/mha_manager/app1.cnf..
Fri Apr  5 23:19:38 2019 - [info] Reading server configuration from /etc/mha_manager/app1.cnf..
Fri Apr  5 23:19:38 2019 - [info] MHA::MasterMonitor version 0.56.
Fri Apr  5 23:19:39 2019 - [info] GTID failover mode = 0
Fri Apr  5 23:19:39 2019 - [info] Dead Servers:
Fri Apr  5 23:19:39 2019 - [info] Alive Servers:
Fri Apr  5 23:19:39 2019 - [info]   10.10.2.7(10.10.2.7:3306)
Fri Apr  5 23:19:39 2019 - [info]   10.10.2.12(10.10.2.12:3306)
Fri Apr  5 23:19:39 2019 - [info]   10.10.2.11(10.10.2.11:3306)
Fri Apr  5 23:19:39 2019 - [info]   10.10.2.9(10.10.2.9:3306)
Fri Apr  5 23:19:39 2019 - [info] Alive Slaves:
Fri Apr  5 23:19:39 2019 - [info]   10.10.2.7(10.10.2.7:3306)  Version=5.7.25-log (oldest major version between slaves) log-bin:enabled
Fri Apr  5 23:19:39 2019 - [info]     Replicating from 10.10.2.12(10.10.2.12:3306)
Fri Apr  5 23:19:39 2019 - [info]     Primary candidate for the new Master (candidate_master is set)
Fri Apr  5 23:19:39 2019 - [info]   10.10.2.11(10.10.2.11:3306)  Version=5.7.25-log (oldest major version between slaves) log-bin:enabled
Fri Apr  5 23:19:39 2019 - [info]     Replicating from 10.10.2.12(10.10.2.12:3306)
Fri Apr  5 23:19:39 2019 - [info]     Primary candidate for the new Master (candidate_master is set)
Fri Apr  5 23:19:39 2019 - [info]   10.10.2.9(10.10.2.9:3306)  Version=5.7.25-log (oldest major version between slaves) log-bin:enabled
Fri Apr  5 23:19:39 2019 - [info]     Replicating from 10.10.2.12(10.10.2.12:3306)
Fri Apr  5 23:19:39 2019 - [info]     Not candidate for the new Master (no_master is set)
Fri Apr  5 23:19:39 2019 - [info] Current Alive Master: 10.10.2.12(10.10.2.12:3306)
Fri Apr  5 23:19:39 2019 - [info] Checking slave configurations..
Fri Apr  5 23:19:39 2019 - [info]  read_only=1 is not set on slave 10.10.2.7(10.10.2.7:3306).
Fri Apr  5 23:19:39 2019 - [warning]  relay_log_purge=0 is not set on slave 10.10.2.7(10.10.2.7:3306).
Fri Apr  5 23:19:39 2019 - [warning]  relay_log_purge=0 is not set on slave 10.10.2.11(10.10.2.11:3306).
Fri Apr  5 23:19:39 2019 - [warning]  relay_log_purge=0 is not set on slave 10.10.2.9(10.10.2.9:3306).
Fri Apr  5 23:19:39 2019 - [info] Checking replication filtering settings..
Fri Apr  5 23:19:39 2019 - [info]  binlog_do_db= , binlog_ignore_db=
Fri Apr  5 23:19:39 2019 - [info]  Replication filtering check ok.
Fri Apr  5 23:19:39 2019 - [info] GTID (with auto-pos) is not supported
Fri Apr  5 23:19:39 2019 - [info] Starting SSH connection tests..
Fri Apr  5 23:19:42 2019 - [info] All SSH connection tests passed successfully.
Fri Apr  5 23:19:42 2019 - [info] Checking MHA Node version..
Fri Apr  5 23:19:43 2019 - [info]  Version check ok.
Fri Apr  5 23:19:43 2019 - [info] Checking SSH publickey authentication settings on the current master..
Fri Apr  5 23:19:43 2019 - [info] HealthCheck: SSH to 10.10.2.12 is reachable.
Fri Apr  5 23:19:43 2019 - [info] Master MHA Node version is 0.56.
Fri Apr  5 23:19:43 2019 - [info] Checking recovery script configurations on 10.10.2.12(10.10.2.12:3306)..
Fri Apr  5 23:19:43 2019 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql,/var/log/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --start_file=master2-12.000005
Fri Apr  5 23:19:43 2019 - [info]   Connecting to mha@10.10.2.12(10.10.2.12:22)..
  Creating /var/tmp if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /var/lib/mysql, up to master2-12.000005
Fri Apr  5 23:19:43 2019 - [info] Binlog setting check done.
Fri Apr  5 23:19:43 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Fri Apr  5 23:19:43 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=10.10.2.7 --slave_ip=10.10.2.7 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.25-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Fri Apr  5 23:19:43 2019 - [info]   Connecting to mha@10.10.2.7(10.10.2.7:22)..
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to master2-7-relay-bin.000004
    Temporary relay log file is /var/lib/mysql/master2-7-relay-bin.000004
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Fri Apr  5 23:19:44 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=10.10.2.11 --slave_ip=10.10.2.11 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.25-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Fri Apr  5 23:19:44 2019 - [info]   Connecting to mha@10.10.2.11(10.10.2.11:22)..
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to master2-11-relay-bin.000004
    Temporary relay log file is /var/lib/mysql/master2-11-relay-bin.000004
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Fri Apr  5 23:19:44 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=10.10.2.9 --slave_ip=10.10.2.9 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.25-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Fri Apr  5 23:19:44 2019 - [info]   Connecting to mha@10.10.2.9(10.10.2.9:22)..
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to slave2-9-relay-bin.000004
    Temporary relay log file is /var/lib/mysql/slave2-9-relay-bin.000004
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Fri Apr  5 23:19:44 2019 - [info] Slaves settings check done.
Fri Apr  5 23:19:44 2019 - [info]
10.10.2.12(10.10.2.12:3306) (current master)
 +--10.10.2.7(10.10.2.7:3306)
 +--10.10.2.11(10.10.2.11:3306)
 +--10.10.2.9(10.10.2.9:3306)

Fri Apr  5 23:19:44 2019 - [info] Checking replication health on 10.10.2.7..
Fri Apr  5 23:19:44 2019 - [info]  ok.
Fri Apr  5 23:19:44 2019 - [info] Checking replication health on 10.10.2.11..
Fri Apr  5 23:19:44 2019 - [info]  ok.
Fri Apr  5 23:19:44 2019 - [info] Checking replication health on 10.10.2.9..
Fri Apr  5 23:19:44 2019 - [info]  ok.
Fri Apr  5 23:19:44 2019 - [info] Checking master_ip_failover_script status:
Fri Apr  5 23:19:44 2019 - [info]   /etc/mha_manager/master_ip_failover --command=status --ssh_user=mha --orig_master_host=10.10.2.12 --orig_master_ip=10.10.2.12 --orig_master_port=3306


IN SCRIPT TEST====/etc/init.d/keepalived stop==/etc/init.d/keepalived start===

Checking the Status of the script.. OK
Fri Apr  5 23:19:44 2019 - [info]  OK.
Fri Apr  5 23:19:44 2019 - [warning] shutdown_script is not defined.
Fri Apr  5 23:19:44 2019 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

(3)重新加载MHA配置文件(很关键!!!)

一旦发生切换管理进程(Manager)将会退出,无法进行再次测试,需将故障数据库解决掉之后,重新change加入到MHA环境中来,并且要保证app1.failover.complete不存在或则加上--ignore_last_failover参数忽略,才能再次开启管理进程。

[mha@mgm2-17 ~]$ masterha_manager --conf=/etc/mha_manager/app1.cnf
Fri Apr  5 23:21:03 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Apr  5 23:21:03 2019 - [info] Reading application default configuration from /etc/mha_manager/app1.cnf..
Fri Apr  5 23:21:03 2019 - [info] Reading server configuration from /etc/mha_manager/app1.cnf..
#验证MHA状态
[mha@mgm2-17 ~]$ masterha_check_status --conf=/etc/mha_manager/app1.cnf
app1 (pid:14901) is running(0:PING_OK), master:10.10.2.12

5 总结

通过以上操作,可以成功在腾讯云通过CVM搭建MySQL MHA Keepalived高可用方案,这种自建MHA集群方案可控性强,但是带来的运维成本和复杂度会很高。腾讯云CDB高可用集群也是基于类似方案来实现,但是切换速度更快、数据一致性有保障,建议使用腾讯云CDB产品(腾讯云CDB详细介绍)。

原创声明,本文系作者授权云+社区发表,未经许可,不得转载。

如有侵权,请联系 yunjia_community@tencent.com 删除。

发表于

我来说两句

0 条评论
登录 后参与评论

扫码关注云+社区