MHA Source Code Reading, Part 03: MasterRotate

数据库交流

Published 2022-04-25 08:35:30

Included in the column: 悦专栏

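About the author

无为, formerly a MySQL DBA at Ele.me (饿了么), now works at a well-known internet company. The author is familiar with mainstream databases such as MySQL, Redis, and PostgreSQL, and has extensive hands-on operations experience.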

The previous installment covered in detail how MHA manager's MasterFailover works. This installment continues with MasterRotate, another core MHA feature.

Let's start with non-GTID mode:

1 Call chain

MHA::MasterRotate::main --> do_master_online_switch() 
        --> identify_orig_master()
        --> identify_new_master() 
        --> reject_update()  
        --> read_slave_status() 
        --> switch_master()
        --> switch_slaves() 
        --> release_failover_advisory_lock()

2 Code walkthrough

MHA::MasterRotate::identify_orig_master();

MHA::MasterRotate::read_config() reads the configuration file.
MHA::MasterRotate::connect_all_and_read_server_status() verifies that MySQL on every node is reachable. If any server is down, the rotate exits. It also checks whether the master is alive; if the master is dead, the rotate exits.
MHA::MasterRotate::check_repl_priv() checks whether the user has replication privileges.
MHA::MasterRotate::get_monitor_advisory_lock() acquires monitor_advisory_lock to ensure that no other monitor process is currently running against the master.
MHA::MasterRotate::get_failover_advisory_lock() acquires failover_advisory_lock to ensure that no other failover process is currently running against the slaves.
MHA::MasterRotate::check_replication_health() uses SHOW SLAVE STATUS to determine current_slave_position and has_replication_problem. has_replication_problem checks the IO thread, the SQL thread, and Seconds_Behind_Master (1s).
MHA::MasterRotate::get_running_update_threads() uses SHOW PROCESSLIST to check whether any thread is currently executing an update. If one is found, the switch exits.
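
The has_replication_problem check can be sketched as follows. This is a minimal Python illustration of the three conditions listed above, not MHA's Perl code. The function signature, the dict-shaped input row, and the strict "greater than 1 second" comparison are assumptions made for the example; the dict keys are the standard SHOW SLAVE STATUS column names.

```python
from typing import Mapping, Optional


def has_replication_problem(status: Mapping[str, Optional[object]],
                            max_lag_seconds: int = 1) -> bool:
    """Return True if one SHOW SLAVE STATUS row looks unhealthy.

    `status` is a single result row as a dict (hypothetical input shape).
    """
    # Both replication threads must be running.
    if status.get("Slave_IO_Running") != "Yes":
        return True
    if status.get("Slave_SQL_Running") != "Yes":
        return True
    # Seconds_Behind_Master is NULL (None) when the lag is unknown;
    # this sketch treats an unknown lag as a problem.
    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        return True
    return int(lag) > max_lag_seconds


healthy = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes",
           "Seconds_Behind_Master": 0}
lagging = dict(healthy, Seconds_Behind_Master=5)
```

With these inputs, `has_replication_problem(healthy)` is false and `has_replication_problem(lagging)` is true, so an online switch would only proceed in the first case.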

MHA::MasterRotate::identify_new_master();

MHA::MasterRotate::set_latest_slaves() treats every current slave as a latest slave.
MHA::MasterRotate::select_new_master() elects the new master.

MHA::MasterRotate::reject_update()  /* take locks to stop binlog writes */

release_monitor_advisory_lock() releases the lock with SELECT RELEASE_LOCK('MHA_Master_High_Availability_Monitor') As Value. If a master_ip_online_change_script is configured, it is invoked at this point.
lock_all_tables() executes FLUSH TABLES WITH READ LOCK to lock all tables.
check_binlog_stop() runs SHOW MASTER STATUS twice in a row to determine whether binlog writes have stopped.
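
The idea behind check_binlog_stop() can be illustrated with a small sketch: sample the binlog coordinates twice and compare them. This is Python rather than MHA's Perl, and the function name, the `(File, Position)` tuple type, and the injected `show_master_status` callable are all hypothetical, chosen so the example runs without a database.

```python
from typing import Callable, Tuple

# (File, Position) as reported by SHOW MASTER STATUS.
BinlogPosition = Tuple[str, int]


def binlog_writes_stopped(show_master_status: Callable[[], BinlogPosition]) -> bool:
    """Sample the master's binlog position twice and compare the two samples.

    `show_master_status` is a caller-supplied function that runs
    SHOW MASTER STATUS and returns (File, Position).
    """
    first = show_master_status()
    second = show_master_status()
    # Identical coordinates mean nothing was written between the samples.
    return first == second


# Fake samplers standing in for a real connection.
quiet = iter([("mysql-bin.000003", 120), ("mysql-bin.000003", 120)])
busy = iter([("mysql-bin.000003", 120), ("mysql-bin.000003", 154)])
quiet_result = binlog_writes_stopped(lambda: next(quiet))
busy_result = binlog_writes_stopped(lambda: next(busy))
```

Here `quiet_result` is true and `busy_result` is false. Two identical samples only show that nothing was written in between, which is why the check comes after FLUSH TABLES WITH READ LOCK.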

MHA::MasterRotate::read_slave_status()

check_slave_status() reads each slave's state with SHOW SLAVE STATUS.

MHA::MasterRotate::switch_master()

switch_master_internal()
    master_pos_wait: calls the MASTER_POS_WAIT function via SELECT and waits until replication has caught up.
    get_new_master_binlog_position: obtained with SHOW MASTER STATUS.
Allow write access on the new master: invokes master_ip_online_change_script --command=start ... to point the VIP at the new master.
disable_read_only() executes SET GLOBAL read_only=0 on the new master.
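
The master_pos_wait step relies on MySQL's MASTER_POS_WAIT() function. The sketch below shows how such a query could be built and how its result is interpreted. It is an illustration in Python, not MHA's implementation; both helper names are hypothetical. The return-value semantics follow the MySQL documentation: NULL if the SQL thread is not running or the arguments are invalid, -1 on timeout, and a non-negative event count on success.

```python
from typing import Optional


def master_pos_wait_sql(binlog_file: str, binlog_pos: int, timeout_s: int = 0) -> str:
    """Build a MASTER_POS_WAIT query for the given master binlog coordinates.

    A timeout of 0 means "wait without a timeout" in MySQL.
    """
    escaped = binlog_file.replace("'", "''")  # minimal quoting for the sketch
    return f"SELECT MASTER_POS_WAIT('{escaped}', {int(binlog_pos)}, {int(timeout_s)})"


def caught_up(result: Optional[int]) -> bool:
    """Interpret the single value returned by MASTER_POS_WAIT.

    None (SQL NULL): SQL thread not running or invalid arguments.
    -1: the timeout was exceeded.
    >= 0: number of events waited for; the target position was reached.
    """
    return result is not None and result >= 0


query = master_pos_wait_sql("mysql-bin.000003", 154, 30)
```

The file name and position in the usage line are made-up values; in the flow described above they would be the coordinates read from the original master after writes have stopped.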

MHA::MasterRotate::switch_slaves();

switch_slaves_internal() runs CHANGE MASTER and START SLAVE on each slave.
unlock_tables() executes UNLOCK TABLES on the original master, where FLUSH TABLES WITH READ LOCK was taken.
reset_slave_on_new_master() executes RESET SLAVE on the new master.

MHA::MasterRotate::release_failover_advisory_lock()

Calls release_failover_advisory_lock() to release the failover lock.
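
The monitor and failover advisory locks are MySQL user-level locks, taken with GET_LOCK() and released with RELEASE_LOCK(). The following Python sketch shows the statements and how the GET_LOCK() result is read. The helper names are hypothetical, and the only lock name used is the monitor lock name quoted earlier in this article; the failover lock name is not given here, so it is left as a parameter.

```python
from typing import Optional


def get_lock_sql(name: str, timeout_s: int = 0) -> str:
    """Build a GET_LOCK statement for a named advisory lock."""
    escaped = name.replace("'", "''")  # minimal quoting for the sketch
    return f"SELECT GET_LOCK('{escaped}', {int(timeout_s)}) AS Value"


def release_lock_sql(name: str) -> str:
    """Build the matching RELEASE_LOCK statement."""
    escaped = name.replace("'", "''")
    return f"SELECT RELEASE_LOCK('{escaped}') AS Value"


def lock_acquired(value: Optional[int]) -> bool:
    """GET_LOCK returns 1 when obtained, 0 on timeout, NULL (None) on error."""
    return value == 1


MONITOR_LOCK = "MHA_Master_High_Availability_Monitor"
acquire_stmt = get_lock_sql(MONITOR_LOCK)
release_stmt = release_lock_sql(MONITOR_LOCK)
```

Because a user-level lock belongs to the session that took it, the release has to be issued on the same connection that acquired it.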

3 GTID mode

The online switch in GTID mode follows the same flow as in non-GTID mode, except that change_master_and_start_slave does not need to obtain specific binlog position information.
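
The difference between the two modes shows up in the CHANGE MASTER statement itself. The sketch below is a hypothetical Python helper, not MHA's change_master_and_start_slave. It builds the statement with explicit binlog coordinates in non-GTID mode and with MASTER_AUTO_POSITION=1 in GTID mode. The host, port, and user values are placeholders, and the password clause is omitted to keep the example short.

```python
from typing import Optional


def change_master_sql(host: str, port: int, user: str,
                      binlog_file: Optional[str] = None,
                      binlog_pos: Optional[int] = None,
                      gtid: bool = False) -> str:
    """Build a CHANGE MASTER TO statement for either replication mode."""
    parts = [f"MASTER_HOST='{host}'", f"MASTER_PORT={int(port)}",
             f"MASTER_USER='{user}'"]
    if gtid:
        # GTID mode: the replica works out where to resume on its own.
        parts.append("MASTER_AUTO_POSITION=1")
    else:
        # Non-GTID mode: the new master's binlog coordinates are required.
        if binlog_file is None or binlog_pos is None:
            raise ValueError("non-GTID mode needs a binlog file and position")
        parts.append(f"MASTER_LOG_FILE='{binlog_file}'")
        parts.append(f"MASTER_LOG_POS={int(binlog_pos)}")
    return "CHANGE MASTER TO " + ", ".join(parts)


gtid_stmt = change_master_sql("10.0.0.2", 3306, "repl", gtid=True)
pos_stmt = change_master_sql("10.0.0.2", 3306, "repl",
                             binlog_file="mysql-bin.000003", binlog_pos=154)
```

In the non-GTID flow, the file and position would come from the get_new_master_binlog_position step; in GTID mode that lookup is unnecessary.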

Originally published: 2020-10-28

This article is shared from the 悦专栏 WeChat official account.
