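There are several MySQL HA solutions available, such as heartbeat, DRBD, MMM, and shared storage, and each has its own trade-offs. heartbeat and DRBD are fairly complex to configure and need custom scripts before MySQL can switch over automatically, which is a real headache for anyone who is not comfortable with scripting. MMM is rarely used in production, and its manager has to run on a separate server; making the manager itself highly available means yet more hardware cost. As for shared storage, I personally feel MySQL data is safer on local disks, since the storage device is still a single point of failure.
MySQL dual master plus keepalived is a very good alternative. In this setup the two MySQL servers replicate from each other, which keeps their data in sync. keepalived provides a virtual IP (VIP), and its built-in service monitoring moves the VIP automatically when MySQL fails.
Below I share the architecture of a production environment that is about to go live, to show how MySQL HA is implemented in it. The topology is as follows: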
Role | IP address |
---|---|
MySQL-VIP | 192.168.230.200 |
MySQL-master1 | 192.168.230.130 |
MySQL-master2 | 192.168.230.152 |
OS version: CentOS 7.3
MySQL version: 5.6
Keepalived version: 1.2.7
Before starting, disable the firewalld service and SELinux on both servers!
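I. MySQL master-master configuration
1. Edit the MySQL configuration files
Both MySQL servers need binary logging enabled: add log-bin=mysql-bin to the [mysqld] section of the MySQL configuration file. The two servers must also have different server-id values, so set one to 1 and the other to 2.
Master1 configuration:
vim /etc/my.cnf
[mysqld]
server-id=1
log-bin=mysql-bin    # enable binary logging
auto-increment-increment=2
auto-increment-offset=1
log-slave-updates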
Master2 configuration:
vim /etc/my.cnf
[mysqld]
server-id=2
log-bin=mysql-bin    # enable binary logging
auto-increment-increment=2
auto-increment-offset=2
log-slave-updates
Restart the MySQL service on both servers.
/etc/init.d/mysqld restart
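The auto-increment-increment and auto-increment-offset settings are what keep the two masters from generating the same primary key. The following is a minimal sketch of the arithmetic only, assuming an empty table; the function name auto_increment_ids is made up for illustration and is not part of MySQL.

```shell
#!/bin/sh
# Print the first COUNT auto-increment values a server would hand out,
# given auto-increment-increment (step) and auto-increment-offset (start).
# Hypothetical helper for illustration; it does not talk to MySQL.
auto_increment_ids() {
    local increment offset count i ids
    increment=$1; offset=$2; count=$3
    i=0; ids=""
    while [ "$i" -lt "$count" ]; do
        ids="$ids $((offset + i * increment))"
        i=$((i + 1))
    done
    echo "${ids# }"   # drop the leading space
}

echo "Master1: $(auto_increment_ids 2 1 5)"   # increment=2, offset=1
echo "Master2: $(auto_increment_ids 2 2 5)"   # increment=2, offset=2
```

Master1 gets the odd values and Master2 the even ones, so inserts on both sides do not collide on an auto-increment key.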
2. Make 192.168.230.130 the master of 192.168.230.152
On MySQL-Master1:
# export PATH=$PATH:/usr/local/mysql/bin/
# vim /etc/profile
# source /etc/profile
# mysql -uroot
mysql> grant replication slave on *.* to 'repl'@'192.168.230.152' identified by 'zhangduanya';
mysql> flush privileges;
Make 192.168.230.152 the master of 192.168.230.130
On MySQL-Master2:
# export PATH=$PATH:/usr/local/mysql/bin/
# vim /etc/profile
# source /etc/profile
# mysql -uroot
mysql> grant replication slave on *.* to 'repl'@'192.168.230.130' identified by 'zhangduanya';
mysql> flush privileges;
On MySQL-Master1 (master_log_file and master_log_pos below come from the show master status output on MySQL-Master2):
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000003 | 421 | | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
mysql> change master to master_host='192.168.230.152',master_port=3306,master_user='repl',master_password='zhangduanya',master_log_file='mysql-bin.000002',master_log_pos=120;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
On MySQL-Master2 (master_log_file and master_log_pos below come from the show master status output on MySQL-Master1):
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000002 | 120 | | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
mysql> change master to master_host='192.168.230.130',master_port=3306,master_user='repl',master_password='zhangduanya',master_log_file='mysql-bin.000003',master_log_pos=421;
Query OK, 0 rows affected, 2 warnings (0.06 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
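Each server must be pointed at the binlog file and position reported by the other server, and it is easy to mistype these by hand. As a rough sketch, the statement can be generated from the peer's coordinates; build_change_master is a hypothetical helper name, and the values in the example call are the ones from this walkthrough.

```shell
#!/bin/sh
# Build a CHANGE MASTER TO statement from the peer's coordinates.
# Arguments: peer host, replication user, password, binlog file, binlog position.
# Hypothetical helper for illustration; it only prints the statement.
build_change_master() {
    local host user password log_file log_pos
    host=$1; user=$2; password=$3; log_file=$4; log_pos=$5
    # The position must be a plain non-negative integer.
    case $log_pos in
        ''|*[!0-9]*)
            echo "invalid binlog position: $log_pos" >&2
            return 1
            ;;
    esac
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=3306, MASTER_USER='%s', MASTER_PASSWORD='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" \
        "$host" "$user" "$password" "$log_file" "$log_pos"
}

# Statement to run on Master1, using Master2's file and position.
build_change_master 192.168.230.152 repl zhangduanya mysql-bin.000002 120
```

The printed statement can be reviewed and then pasted into the mysql client on the server that should replicate from that peer.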
Check the replication status on MySQL-Master1:
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.230.152
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 120
Relay_Log_File: zhdy-02-relay-bin.000002
Relay_Log_Pos: 283
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Check the replication status on MySQL-Master2:
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.230.130
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 421
Relay_Log_File: zhdy-03-relay-bin.000002
Relay_Log_Pos: 283
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
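If everything above is configured correctly, data written on either MySQL server is now replicated to the other. The replication itself is not demonstrated here.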
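The two fields that matter in the show slave status output are Slave_IO_Running and Slave_SQL_Running; both must be Yes. Below is a small sketch that checks this from the text output. The function name slave_threads_ok is an assumption made for this example, and the sample input is trimmed from the output format shown earlier.

```shell
#!/bin/sh
# Read `show slave status\G` output on stdin and succeed only if both
# the I/O thread and the SQL thread report "Yes".
# Hypothetical helper for illustration.
slave_threads_ok() {
    awk -F': *' '
        { gsub(/^[ \t]+/, "", $1) }               # strip leading indentation
        $1 == "Slave_IO_Running"  { io  = $2 }
        $1 == "Slave_SQL_Running" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Sample input; on a live server the text would come from the mysql client.
sample='Slave_IO_State: Waiting for master to send event
Slave_IO_Running: Yes
Slave_SQL_Running: Yes'

if printf '%s\n' "$sample" | slave_threads_ok; then
    echo "replication threads running"
else
    echo "replication broken"
fi
```

On a real server the input could come from something like mysql -uroot -e 'show slave status\G', run on each master in turn.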
II. Installing and configuring keepalived
2.1 Installing and configuring keepalived on 192.168.230.130
Install keepalived:
yum install -y pcre-devel openssl-devel popt-devel    # install build dependencies
cd /usr/local/src
wget http://www.keepalived.org/software/keepalived-1.2.7.tar.gz
tar -zxvf keepalived-1.2.7.tar.gz
cd ./keepalived-1.2.7
./configure --prefix=/usr/local/keepalived
make && make install
echo $?
cp /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/init.d/
chmod +x /etc/init.d/keepalived
chkconfig --add keepalived
chkconfig keepalived on
cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
mkdir /etc/keepalived/
cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
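2.2 Configure keepalived
We write a new configuration file from scratch. By default, keepalived looks for its configuration file in /etc/keepalived when it starts.
> /etc/keepalived/keepalived.conf
vi /etc/keepalived/keepalived.conf
# write the following content
! Configuration File for keepalived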
global_defs {
notification_email {
[email protected]
}
notification_email_from [email protected]
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id MYSQL_HA #identifier, same on both masters
}
vrrp_instance VI_1 {
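state BACKUP #set to BACKUP on both servers
interface ens33
virtual_router_id 51 #same on both servers
priority 100 #priority; use 90 on the other server
advert_int 1
nopreempt #no preemption; set only on the higher-priority server, not on the lower-priority one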
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.230.200
}
}
virtual_server 192.168.230.200 3306 {
delay_loop 2 #check the real_server state every 2 seconds
lb_algo wrr #LVS scheduling algorithm
lb_kind DR #LVS forwarding mode
persistence_timeout 60 #session persistence time
protocol TCP
real_server 192.168.230.130 3306 {
weight 3
notify_down /usr/local/keepalived/mysql.sh #script to run when the service is detected as down
TCP_CHECK {
connect_timeout 10 #connection timeout
nb_get_retry 3 #number of retries
delay_before_retry 3 #delay between retries
connect_port 3306 #health check port
}
}
}
Write the script to run when the service is detected as down (create /usr/local/keepalived/mysql.sh):
vim /usr/local/keepalived/mysql.sh
#!/bin/bash
pkill keepalived
sleep 10
/etc/init.d/keepalived start >/dev/null
----------
# chmod +x /usr/local/keepalived/mysql.sh
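Note: this is the script referenced by the notify_down option in the configuration above. keepalived health-checks the real_server and, when the check finds that the service has failed, runs the notify_down script. The key command in the script is pkill keepalived, which kills the keepalived process so that the VIP is released and MySQL fails over automatically; after 10 seconds the script starts keepalived again. There is also no need to worry about both MySQL servers accepting writes at the same time, because the keepalived configuration on each server contains only the local MySQL IP plus the VIP, not both MySQL IPs plus the VIP.
Start keepalived:
/etc/init.d/keepalived start
Test:
From a PC on the LAN, ping the MySQL VIP; it should respond (the VIP cannot be pinged before keepalived has been started). Then stop the MySQL service and check whether the keepalived health check triggers our script.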
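2.3 Installing and configuring keepalived on 192.168.230.152
Install keepalived the same way as on 192.168.230.130.
Configure keepalived: the configuration is almost identical to the one above, with three differences: the priority is 90, nopreempt is not set, and real_server is this server's own IP.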
> /etc/keepalived/keepalived.conf
vi /etc/keepalived/keepalived.conf
# write the following content
! Configuration File for keepalived
global_defs {
notification_email {
[email protected]
}
notification_email_from [email protected]
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id MYSQL_HA #identifier, same on both masters
}
vrrp_instance VI_1 {
state BACKUP #set to BACKUP on both servers
interface ens33
virtual_router_id 51 #same on both servers
priority 90 #priority; 90 on this server
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.230.200
}
}
virtual_server 192.168.230.200 3306 {
delay_loop 2 #check the real_server state every 2 seconds
lb_algo wrr #LVS scheduling algorithm
lb_kind DR #LVS forwarding mode
persistence_timeout 60 #session persistence time
protocol TCP
real_server 192.168.230.152 3306 {
weight 3
notify_down /usr/local/keepalived/mysql.sh #script to run when the service is detected as down
TCP_CHECK {
connect_timeout 10 #connection timeout
nb_get_retry 3 #number of retries
delay_before_retry 3 #delay between retries
connect_port 3306 #health check port
}
}
}
Write the script to run when the service is detected as down (create /usr/local/keepalived/mysql.sh):
vim /usr/local/keepalived/mysql.sh
#!/bin/bash
pkill keepalived
sleep 10
/etc/init.d/keepalived start >/dev/null
----------
# chmod +x /usr/local/keepalived/mysql.sh
Start keepalived:
/etc/init.d/keepalived start
2.4 MySQL failover test
Stop the MySQL service and check whether the keepalived health check triggers our script.
The VIP is currently on MySQL-Master1.
Now stop the mysql service on MySQL-Master1!
# ps -aux | grep keepalived
root 7276 0.0 0.1 115244 1436 ? S 23:03 0:00 /bin/bash /usr/local/keepalived/mysql.sh
root 7292 0.0 0.0 112652 964 pts/1 R+ 23:03 0:00 grep --color=auto keepalived
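The script ran as expected, and the VIP moved over to the other server.
Because the script simply kills keepalived, we also need to start keepalived again when we bring the mysql service back.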
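III. Testing again
3.1 This test connects from a separate client, so we first grant root remote login on the MySQL servers of both the master and the backup.
mysql> grant all on *.* to 'root'@'192.168.230.%' identified by 'zhangduanya';
mysql> flush privileges;
With the client connected to the MySQL VIP, I ran a query during the switchover. From issuing show databases to getting the result took 3-5 seconds. The client reports an error during the switch, but this is nothing to worry about: the keepalived switchover takes about 3 seconds, and during that time the VIP belongs to neither server.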
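3.2 keepalived failover test
※ From a Windows client, ping the VIP continuously, then stop keepalived on 192.168.230.130. The VIP should move to 192.168.230.152.
※ Start keepalived on 192.168.230.130 again and stop keepalived on 192.168.230.152 to check that the VIP switches automatically. It should return to 192.168.230.130.
Note: keepalived switches over very quickly; the whole process takes only 1-3 seconds.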
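The expected result of this test follows from the priority and nopreempt settings. Here is a deliberately simplified model of who holds the VIP, assuming that both nodes start as BACKUP, that the current holder keeps the VIP while its keepalived is running, and that otherwise the running node with the highest priority takes over. The names vip_holder, master1 and master2 are made up for this sketch, and real VRRP elections involve timers that are ignored here.

```shell
#!/bin/sh
# Simplified model of the VIP holder.
# Arguments: current holder (master1, master2 or none),
#            keepalived state on master1 (up/down),
#            keepalived state on master2 (up/down).
# Hypothetical helper for illustration; it does not query keepalived.
vip_holder() {
    local current m1 m2
    current=$1; m1=$2; m2=$3
    # The current holder keeps the VIP while its keepalived is running.
    if [ "$current" = master1 ] && [ "$m1" = up ]; then echo master1; return; fi
    if [ "$current" = master2 ] && [ "$m2" = up ]; then echo master2; return; fi
    # Otherwise the highest-priority running node takes over (100 beats 90).
    if [ "$m1" = up ]; then
        echo master1
    elif [ "$m2" = up ]; then
        echo master2
    else
        echo none
    fi
}

holder=master1
holder=$(vip_holder "$holder" down up); echo "stop keepalived on master1  -> $holder"
holder=$(vip_holder "$holder" up up);   echo "start keepalived on master1 -> $holder"
holder=$(vip_holder "$holder" up down); echo "stop keepalived on master2  -> $holder"
```

In this model the VIP stays on master2 when master1 comes back and only returns to master1 once keepalived on master2 is stopped, which is the sequence the test above walks through.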
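IV. Summary
Nothing is perfect, and this MySQL HA setup is no exception. keepalived only health-checks port 3306; it cannot check, for example, the Slave_SQL and Slave_IO threads of MySQL replication. More fine-grained health checks need an additional monitoring tool such as Nagios or Zabbix, which can then send SMS or email alerts so that problems are dealt with promptly.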
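As one possible starting point for such a check, the sketch below maps the two replication thread states to a Nagios-style status line and exit code (0 for OK, 2 for CRITICAL). The function name replication_check is an assumption for this example, and how the two values are obtained from MySQL is left out.

```shell
#!/bin/sh
# Map the two replication thread states to a Nagios-style status line
# and exit code (0 = OK, 2 = CRITICAL).
# Arguments: value of Slave_IO_Running, value of Slave_SQL_Running.
# Hypothetical helper for illustration.
replication_check() {
    local io sql
    io=$1; sql=$2
    if [ "$io" = Yes ] && [ "$sql" = Yes ]; then
        echo "OK - Slave_IO_Running=$io Slave_SQL_Running=$sql"
        return 0
    fi
    echo "CRITICAL - Slave_IO_Running=$io Slave_SQL_Running=$sql"
    return 2
}

replication_check Yes Yes
replication_check Yes No || echo "exit code: $?"
```

A monitoring system that follows the Nagios plugin convention can alert on the non-zero exit code, which covers the replication failures that a plain TCP check on port 3306 does not see.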