MySQL Replication是MySQL非常重要的特性。用好了,可以发挥很大的作用,做负载均衡,做读写分离,做备份等等,能在关键时刻救DBA一命;用不好,那就是给DBA自己找麻烦了,处理不尽的故障。所以我这边给大家分享两个关于复制的案例。
mysql> use test;Database changedmysql> create table tt( -> id int auto_increment primary key, -> int_a int, -> int_b int);Query OK, 0 rows affected (0.13 sec)mysql> insert into tt(`int_a`,`int_b`) values(1,1);Query OK, 1 row affected (0.14 sec)mysql> select * from tt;+----+-------+-------+| id | int_a | int_b |+----+-------+-------+| 1 | 1 | 1 |+----+-------+-------+1 row in set (0.00 sec)
mysql> select * from tt;+----+-------+-------+| id | int_a | int_b |+----+-------+-------+| 1 | 1 | 1 |+----+-------+-------+1 row in set (0.00 sec)mysql> show slave status\G*************************** 1. row *************************** Master_Log_File: mysql-bin.000001 Read_Master_Log_Pos: 526 Relay_Log_File: mysql-relay-bin.000002 Relay_Log_Pos: 689 Relay_Master_Log_File: mysql-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes .......................................部分信息省略 Exec_Master_Log_Pos: 526 Relay_Log_Space: 862 Master_SSL_Allowed: No Seconds_Behind_Master: 0 Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 3656 Master_UUID: b9e1f845-5d32-11e7-b7eb-fa163ea44438 Master_Info_File: mysql.slave_master_info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 01 row in set (0.00 sec)
mysql> update tt set int_b = 2 where int_a = 1;Query OK, 1 row affected (0.02 sec)Rows matched: 1 Changed: 1 Warnings: 0mysql> select * from tt;+----+-------+-------+| id | int_a | int_b |+----+-------+-------+| 1 | 1 | 2 |+----+-------+-------+1 row in set (0.00 sec)
mysql> update tt set int_a = 2 where int_b = 1;Query OK, 1 row affected (0.02 sec)Rows matched: 1 Changed: 1 Warnings: 0mysql> select * from tt;+----+-------+-------+| id | int_a | int_b |+----+-------+-------+| 1 | 2 | 1 |+----+-------+-------+1 row in set (0.00 sec)
mysql> select * from tt;+----+-------+-------+| id | int_a | int_b |+----+-------+-------+| 1 | 1 | 2 |+----+-------+-------+1 row in set (0.00 sec)mysql> show slave status\G*************************** 1. row *************************** Master_Log_File: mysql-bin.000001 Read_Master_Log_Pos: 749 Relay_Log_File: mysql-relay-bin.000002 Relay_Log_Pos: 912 Relay_Master_Log_File: mysql-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes .......................................部分信息省略 Skip_Counter: 0 Exec_Master_Log_Pos: 749 Relay_Log_Space: 1085 Seconds_Behind_Master: 0Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 3656 Master_UUID: b9e1f845-5d32-11e7-b7eb-fa163ea44438 Master_Info_File: mysql.slave_master_info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 01 row in set (0.00 sec)
在测试步骤中我们可以看到,在从库更新数据之后,主库上的更新操作在从库上没有生效,但是查看复制状态一切正常。仅从show slave status\G中查看到的信息,我们认为目前主从的复制是正常的,但是考虑实际的数据,主从的数据已经不一致了。
看到主库的更新操作没有在从库上应用,首先考虑,这个事务的binlog是否真的被从库接收到。于是检查从库上的relay log,使用mysqlbinlog工具解析relay log,看到:
# at 689#170705 13:50:49 server id 3656 end_log_pos 605 CRC32 0xe7ffbc45 Query thread_id=9 exec_time=0 error_code=0SET TIMESTAMP=1499233849/*!*/;BEGIN/*!*/;# at 768#170705 13:50:49 server id 3656 end_log_pos 718 CRC32 0xa9da82f4 Query thread_id=9 exec_time=0 error_code=0SET TIMESTAMP=1499233849/*!*/;update tt set int_a = 2 where int_b = 1/*!*/;# at 881#170705 13:50:49 server id 3656 end_log_pos 749 CRC32 0xcdb5c9d7 Xid = 107COMMIT/*!*/;
从relay log中可以看到,主库上的更新操作在从库上是接收到了的。接着根据show slave status\G
的信息,也可以确定该事务是被sql线程应用了的。再仔细一看这个 relay log 发现,这个 update 操作是被以STATEMENT的格式保存下来,并复制到从库。所以在从库上只是简单的执行这个语句。并且因为从库上int_b=1
的记录已经被修改为int_b=2
,从而在从库上执行这个语句的时候,找不到符合相应条件的记录需要修改。
这个更新操作是执行了的,只是没有找到符合where条件的记录。所以 show slave status\G
查看复制状态也是正常。但是主从数据不一致了。
所以,在复制架构中一定要强调不要随便在从库上执行insert、update、delete等操作,因为极有可能做了相应的操作之后,主从数据不一致,复制状态正常。应用查询数据出现异常,问题很难排查。
mysql> create database gangshen;Query OK, 1 row affected (0.02 sec)mysql> use gangshenDatabase changedmysql> create table tt( -> id int, -> name varchar(20), -> primary key(id,name));Query OK, 0 rows affected (0.09 sec)mysql> show create table tt;+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+| Table | Create Table |+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+| tt | CREATE TABLE `tt` ( `id` int(11) NOT NULL DEFAULT '0', `name` varchar(20) NOT NULL DEFAULT '', PRIMARY KEY (`id`,`name`)) ENGINE=InnoDB DEFAULT CHARSET=latin1 |+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+1 row in set (0.00 sec)
mysql> use gangshen;Reading table information for completion of table and column namesYou can turn off this feature to get a quicker startup with -ADatabase changedmysql> show create table tt;+-------+------------------------------------------------------------------------------------------------------------------------------------------------+| Table | Create Table |+-------+------------------------------------------------------------------------------------------------------------------------------------------------+| tt | CREATE TABLE `tt` ( `id` int(11) NOT NULL, `name` varchar(20) NOT NULL, PRIMARY KEY (`id`,`name`)) ENGINE=InnoDB DEFAULT CHARSET=latin1 |+-------+------------------------------------------------------------------------------------------------------------------------------------------------+1 row in set (0.00 sec)mysql> show slave status\G*************************** 1. row *************************** Master_Log_File: mysql-bin.000001 Read_Master_Log_Pos: 372 Relay_Log_File: mysql-relay-bin.000002 Relay_Log_Pos: 569 Relay_Master_Log_File: mysql-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Exec_Master_Log_Pos: 372 Relay_Log_Space: 776 .......................................部分信息省略 Seconds_Behind_Master: 0 Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 3656 Master_UUID: b9e1f845-5d32-11e7-b7eb-fa163ea44438 Master_Info_File: mysql.slave_master_info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: 86400 Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 01 row in set (0.00 sec)
mysql> alter table tt modify column id int null;Query OK, 0 rows affected (0.03 sec)Records: 0 Duplicates: 0 Warnings: 0mysql> show create table tt;+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+| Table | Create Table |+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+| tt | CREATE TABLE `tt` ( `id` int(11) NOT NULL DEFAULT '0', `name` varchar(20) NOT NULL DEFAULT '', PRIMARY KEY (`id`,`name`)) ENGINE=InnoDB DEFAULT CHARSET=latin1 |+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+1 row in set (0.00 sec)
mysql> show create table tt;+-------+------------------------------------------------------------------------------------------------------------------------------------------------+| Table | Create Table |+-------+------------------------------------------------------------------------------------------------------------------------------------------------+| tt | CREATE TABLE `tt` ( `id` int(11) NOT NULL, `name` varchar(20) NOT NULL, PRIMARY KEY (`id`,`name`)) ENGINE=InnoDB DEFAULT CHARSET=latin1 |+-------+------------------------------------------------------------------------------------------------------------------------------------------------+1 row in set (0.00 sec)mysql> show slave status\G*************************** 1. row *************************** Master_Log_File: mysql-bin.000001 Read_Master_Log_Pos: 494 Relay_Log_File: mysql-relay-bin.000002 Relay_Log_Pos: 569 Relay_Master_Log_File: mysql-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: No .......................................部分信息省略 Last_Errno: 1171 Last_Error: Error 'All parts of a PRIMARY KEY must be NOT NULL; if you need NULL in a key, use UNIQUE instead' on query. Default database: 'gangshen'. Query: 'alter table tt modify column id int null' Skip_Counter: 0 Exec_Master_Log_Pos: 372 Seconds_Behind_Master: NULL Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1171 Last_SQL_Error: Error 'All parts of a PRIMARY KEY must be NOT NULL; if you need NULL in a key, use UNIQUE instead' on query. Default database: 'gangshen'. Query: 'alter table tt modify column id int null' Replicate_Ignore_Server_Ids: Master_Server_Id: 3656 Master_UUID: b9e1f845-5d32-11e7-b7eb-fa163ea44438 Master_Info_File: mysql.slave_master_info Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: 170705 14:39:41 Auto_Position: 01 row in set (0.00 sec)
从以上测试步骤中可以看到,在复制正常的情况下,主库上执行DDL提示没有错误,在从库上执行会有一个错误,提示说主键的字段必须非空,如果你要在一个索引中使用NULL属性,那应该使用唯一索引替代主键索引使用。
因为主库为5.6.36版本,从库为5.7.18版本,所以很容易考虑说是不是因为主从数据库版本不一致的原因。但是具体是因为5.6和5.7中什么的不同导致的问题,需要接着分析。
可看到我们在主库上执行DDL的语句的时候,执行成功了,但是查看 show create table tt;
语句,可以看到这个DDL语句并没有起作用,所以这个DDL语句在5.6版本中是被忽略了。
我们直接拿这个DDL语句在5.7的数据库上执行,直接就报错了
mysql> use gangshenReading table information for completion of table and column namesYou can turn off this feature to get a quicker startup with -ADatabase changedmysql> alter table tt modify column id int null;ERROR 1171 (42000): All parts of a PRIMARY KEY must be NOT NULL; if you need NULL in a key, use UNIQUE instead
检查主库上的binlog日志以及从库上的relay log,都能看到DDL语句是被记录了的
# at 569#170705 14:39:37 server id 3656 end_log_pos 494 CRC32 0x9de05dcd Query thread_id=11 exec_time=0 error_code=0SET TIMESTAMP=1499236777/*!*/;alter table tt modify column id int null/*!*/;
可以说明这句DDL语句是被正常复制的,但是该语句在5.6主库上执行的时候,操作被忽略了。DDL语句被复制到5.7从库上执行的时候,因为5.7不允许该操作,所以SQL线程在重放该操作的时候报错,导致SQL线程中断。
本文分享自 MySQLBeginner 微信公众号,前往查看
如有侵权,请联系 cloudcommunity@tencent.com 删除。
本文参与 腾讯云自媒体同步曝光计划 ,欢迎热爱写作的你一起参与!