MySQL 5.7 Group Replication错误总结(r11笔记第84天)

今天来总结下MySQL 5.7中的一些问题处理,相对来说常规一些。搭建的过程我就不用多说了,昨天的文章分分钟搭建MySQL Group Replication测试环境(r11笔记第82天)里面可以看到一个基本的方式,在测试环境很容易模拟,如果在多台物理机环境中搭建是不是也一样呢,答案是肯定的,我自己都一一试过了。

因为搭建的环境官方建议也是single_primary的方式,即一主写入,其它做读,也就是读写分离,当然支持multi_primary理论上也是可行的,但是还是有点小问题,我们就以single_primary来举例。

问题1:

读节点加入组的时候,start group_replication抛出了下面的错误。基本碰到这个错误,你离搭建成功就不远了。

2017-02-20T07:56:30.064556Z 0 [ERROR] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: 89328c79-f730-11e6-ab63-782bcb377193:1-2 > Group transactions: 7c744904-f730-11e6-a72d-782bcb377193:1-4' 2017-02-20T07:56:30.064580Z 0 [ERROR] Plugin group_replication reported: 'The member contains transactions not present in the group. The member will now exit the group.' 2017-02-20T07:56:30.064587Z 0 [Note] Plugin group_replication reported: 'To force this member into the group you can use the group_replication_allow_local_disjoint_gtids_join option' 可以很明显看到日志中已经提示了,需要设置参数,也就是兼容加入组。group_replication_allow_local_disjoint_gtids_join设置完成后运行start group_replication即可。

问题2:

如果碰到这个错误,也不用太担心,可以从日志看到是因为参数的不兼容性导致的。比如主写设置为multi-primary,读节点设置为single-primary,统一一下即可。

2017-02-21T10:20:56.324890+08:00 0 [ERROR] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: 87b9c8fe-f352-11e6-bb33-0026b935eb76:1-5, b79d42f4-f351-11e6-9891-0026b935eb76:1, f7c7b9f8-f352-11e6-b1de-a4badb1b524e:1 > Group transactions: 87b9c8fe-f352-11e6-bb33-0026b935eb76:1-5, b79d42f4-f351-11e6-9891-0026b935eb76:1' 2017-02-21T10:20:56.324971+08:00 0 [ERROR] Plugin group_replication reported: 'The member configuration is not compatible with the group configuration. Variables such as single_primary_mode or enforce_update_everywhere_checks must have the same value on every server in the group. (member configuration option: [], group configuration option: [group_replication_single_primary_mode]).' 2017-02-21T10:20:56.325052+08:00 19 [Note] Plugin group_replication reported: 'Going to wait for view modification' 2017-02-21T10:20:56.325594+08:00 0 [Note] Plugin group_replication reported: 'getstart group_id 53d187f2'

问题3:

这个问题困扰了我很久,其实本质上就是节点的设置,里面有一个group_name, 这个名字可以不能设置为每个节点的uuid,比如节点1,2,3这几个节点,group_replication_group_name是需要一致的。之前每次失败都会认认真真拷贝uuid,发现适得其反。

2017-02-22T14:46:35.819072Z 0 [Warning] Plugin group_replication reported: 'read failed' 2017-02-22T14:46:35.851829Z 0 [ERROR] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 24902' 2017-02-22T14:47:05.814080Z 30 [ERROR] Plugin group_replication reported: 'Timeout on wait for view after joining group' 2017-02-22T14:47:05.814183Z 30 [Note] Plugin group_replication reported: 'Requesting to leave the group despite of not being a member' 2017-02-22T14:47:05.814213Z 30 [ERROR] Plugin group_replication reported: '[GCS] The member is leaving a group without being on one.' 2017-02-22T14:47:05.814567Z 30 [Note] Plugin group_replication reported: 'auto_increment_increment is reset to 1' 2017-02-22T14:47:05.814583Z 30 [Note] Plugin group_replication reported: 'auto_increment_offset is reset to 1' 2017-02-22T14:47:05.814859Z 36 [Note] Error reading relay log event for channel 'group_replication_applier': slave SQL thread was killed 2017-02-22T14:47:05.815720Z 33 [Note] Plugin group_replication reported: 'The group replication applier thread was killed'统一之后,启动的过程其实很快。

mysql> start group_replication; Query OK, 0 rows affected (1.52 sec)

基本上搭建过程就这几类问题,还有主机名类的问题,这方面还有一些小的bug,如果需要特别设置,还可以指定report_host来完成。

问题4:

环境搭建好之后,我们来创建一个普通的表,有时候好的习惯和规范在这种时候就尤其重要。

创建表test_tab

create table test_tab (id int,name varchar(30));然后插入一条数据,看起来这是一个再正常不过的操作,但是在MGR里面就会有错误,因为一个基本要求就是表中含有主键。

mysql> insert into test_tab values(1,'a'); ERROR 3098 (HY000): The table does not comply with the requirements by an external plugin.修复的方式就是添加主键:

mysql> alter table test_tab add primary key(id); Query OK, 0 rows affected (0.01 sec) Records: 0 Duplicates: 0 Warnings: 0

问题5(模拟灾难):

我们目前搭建的是single-primary的模式。如果主写节点发生故障,整个group该怎么处理呢,就会优先把第二个节点S2省纪委主写。

要测试的话还是很简单的。我们把节点1的服务直接kill掉。看看主节点会漂移到哪里。

首先是组复制的基本情况,目前存在5个节点,我们直接kill节点1,即端口为24801的。

+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 52d26194-f90a-11e6-a247-782bcb377193 | grtest      |       24801 | ONLINE       |
| group_replication_applier | 5abaaf89-f90a-11e6-b4de-782bcb377193 | grtest      |       24802 | ONLINE       |
| group_replication_applier | 655248b9-f90a-11e6-86b4-782bcb377193 | grtest      |       24803 | ONLINE       |
| group_replication_applier | 6defc92c-f90a-11e6-990c-782bcb377193 | grtest      |       24804 | ONLINE       |
| group_replication_applier | 76bc07a1-f90a-11e6-ab0a-782bcb377193 | grtest      |       24805 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+

节点2会输出下面的日志,意味值这个节点正式上岗了。

2017-02-22T14:59:45.157989Z 0 [Note] Plugin group_replication reported: 'getstart group_id 98e4de29'
2017-02-22T14:59:45.434062Z 0 [Note] Plugin group_replication reported: 'Unsetting super_read_only.'
2017-02-22T14:59:45.434130Z  40 [Note] Plugin group_replication reported: 'A new primary was elected, enabled conflict detection until the new primary applies all  relay logs'

然后就会看到组复制的情况成了下面的局面,毫无疑问,第一个节点被剔除了。

+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 5abaaf89-f90a-11e6-b4de-782bcb377193 | grtest      |       24802 | ONLINE       |
| group_replication_applier | 655248b9-f90a-11e6-86b4-782bcb377193 | grtest      |       24803 | ONLINE       |
| group_replication_applier | 6defc92c-f90a-11e6-990c-782bcb377193 | grtest      |       24804 | ONLINE       |
| group_replication_applier | 76bc07a1-f90a-11e6-ab0a-782bcb377193 | grtest      |       24805 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+

从日志我们可以看到是第二个节点升为主写了,那么问题来了。

问题6:

怎么判断一个复制组中哪个是主节点,不能完全靠猜或者翻看日志来判断吧。

我们用下面的语句来过滤得到。

mysql> select *from  performance_schema.replication_group_members  where member_id =(select variable_value from  performance_schema.global_status WHERE VARIABLE_NAME=  'group_replication_primary_member');
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 5abaaf89-f90a-11e6-b4de-782bcb377193 | grtest      |       24802 | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
1 row in set (0.00 sec)

原文发布于微信公众号 - 杨建荣的学习笔记(jianrong-notes)

原文发表时间:2017-02-22

本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。

发表于

我来说两句

0 条评论
登录 后参与评论

相关文章

来自专栏一只程序汪的自我修养

每周开源项目分享-年轻人的第一个OAuth2.0 Server:hydra

文档地址:https://www.ory.sh/docs/guides/master/hydra/

2792
来自专栏实战docker

pinpoint插件开发之一:牛刀小试,调整gson插件

从本章开始我们一起来实战pinpoint插件开发,做一些实用的pinpoint插件,本着先易后难的原则,我们从修改现有插件开始吧; 准备工作 本次实战的操作环境...

3365
来自专栏从零开始的linux

分布式文件系统 fastdfs

fastdfs FastDFS是一款开源的轻量级分布式文件系统纯C实现,支持Linux、FreeBSD等UNIX系统类google FS,不是通用的文件系统,只...

6345
来自专栏IT笔记

SpringBoot开发案例之微信小程序文件上传

最近在做一个口语测评的小程序服务端,小程序涉及到了音频文件的上传,按理说应该统一封装一个第三方上传接口服务提供给前段调用,但是开发没有那么多道理,暂且为了省事就...

7157
来自专栏乐沙弥的世界

基于Linux (RHEL 5.5) 安装Oracle 10g RAC

    本文所描述的是在Red Hat 5.5下使用vmware server 来安装Oracle 10g RAC(OCFS + ASM),本文假定你的RHEL...

1463
来自专栏Hadoop实操

如何在CDH集群安装Anaconda&搭建Python私有源

Anaconda是一个用于科学计算的Python发行版,支持 Linux, Mac, Windows系统,提供了包管理与环境管理的功能,可以很方便地解决多版本p...

1.5K8
来自专栏快乐八哥

Angular企业级开发(4)-ngResource和REST介绍

一、RESTful介绍 ? RESTful维基百科 REST(表征性状态传输,Representational State Transfer)是Roy Fie...

1937
来自专栏王磊的博客

Spring Boot 最佳实践(五)Spring Data JPA 操作 MySQL 8

JPA(Java Persistence API)Java持久化API,是 Java 持久化的标准规范,Hibernate是持久化规范的技术实现,而Spring...

2592
来自专栏运维小白

10.25 systemd管理服务

Linux系统服务管理-systemd systemctl list-units --all --type=service 几个常用的服务相关的命令 syste...

2057
来自专栏向治洪

android studio JNI使用

Step: 1. 添加native接口注意写好native接口和System.loadLibrary()即可了,并无特别之处。 Step: 2.执行Bu...

2128

扫码关注云+社区

领取腾讯云代金券