前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >MySQL多线程复制报错案例

MySQL多线程复制报错案例

作者头像
SEian.G
发布2021-07-29 14:25:50
4390
发布2021-07-29 14:25:50
举报
文章被收录于专栏:SEian.G学习记录SEian.G学习记录

近期遇到一个主从复制报错的问题,具体的报错详情如下所示:

代码语言:javascript
复制
mysql >show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: xxxxx
                  Master_User: dba_repl
                  Master_Port: 4306
                Connect_Retry: 60
              Master_Log_File: mysqlbin.001589
          Read_Master_Log_Pos: 353490645
               Relay_Log_File: slave-relay-bin.004764
                Relay_Log_Pos: 912732994
        Relay_Master_Log_File: mysqlbin.001588
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1864
                   Last_Error: Cannot schedule event Rows_query, relay-log name /data/mysql_4306/log/slave-relay-bin.004764, position 912733141 to Worker thread because its size 21520792 exceeds 18777088 of slave_pending_jobs_size_max.
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 912732823
              Relay_Log_Space: 1427234531
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1864
               Last_SQL_Error: Cannot schedule event Rows_query, relay-log name /data/mysql_4306/log/slave-relay-bin.004764, position 912733141 to Worker thread because its size 21520792 exceeds 18777088 of slave_pending_jobs_size_max.
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 255
                  Master_UUID: 548723ca-1f7f-11e9-b3ab-005056b748c5
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State:
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 210514 16:02:41
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 548723ca-1f7f-11e9-b3ab-005056b748c5:532252-1257990582
            Executed_Gtid_Set: 548723ca-1f7f-11e9-b3ab-005056b748c5:1-1257510314,
5965cc1e-42b8-11e8-9759-005056b7d9af:1-2805913654
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)

从报错信息看,和参数slave_pending_jobs_size_max有关,那么这个参数具体是干什么的呢?我们今天就来了解一下;

下面是官方文档对该参数的介绍: For multithreaded replicas, this variable sets the maximum amount of memory (in bytes) available to worker queues holding events not yet applied. Setting this variable has no effect on replicas for which multithreading is not enabled. Setting this variable has no immediate effect. The state of the variable applies on all subsequent START SLAVE commands. The minimum possible value for this variable is 1024; the default is 16MB. The maximum possible value is 18446744073709551615 (16 exabytes). Values that are not exact multiples of 1024 are rounded down to the next-highest multiple of 1024 prior to being stored. The value of this variable is a soft limit and can be set to match the normal workload. If an unusually large event exceeds this size, the transaction is held until all the worker threads have empty queues, and then processed. All subsequent transactions are held until the large transaction has been completed.

大概的意思是:

对于多线程复制,slave_pending_jobs_size_max变量设置用于保存尚未应用的event的工作队列可用的最大内存量(以字节为单位)。设置此变量对未启用多线程处理的复制没有影响。设置此变量不会立即生效。必须要停掉复制之后,重新start slave。 此变量的最小值为1024;默认值为16MB。最大可能值为18446744073709551615(16 EB)。 此变量的值是软限制,可以设置为与正常工作负载匹配。如果异常大的事件超过此大小,事务将被保留,直到所有工作线程都有空队列,然后进行处理。如果内存富余,或者延迟较大时,可以适当调大;注意这个值要比主库的max_allowed_packet大!

划重点:

该参数在多线程复制中起作用, 当worker线程正在处理的event的总大小超过slave_pending_jobs_size_max变量的大小时,将发生此等待操作。此时可有在主库看到线程的状态为:Waiting for Slave Workers to free pending events 当event大小降至该限制以下时,协调器将恢复调度。仅当slave_parallel_workers设置为大于0时,才会出现此状态。

那么对该参数了解之后,具体的解决方法也就有了:

1、查看主库max_allowed_packet的大小

代码语言:javascript
复制
mysql> show variables like '%max_allowed_packet%';
+--------------------------+------------+
| Variable_name            | Value      |
+--------------------------+------------+
| max_allowed_packet       | 67108864   |
| slave_max_allowed_packet | 1073741824 |
+--------------------------+------------+
2 rows in set (0.00 sec)

2、设置从库slave_pending_jobs_size_max的大小,注意,需要大于主库max_allowed_packet的大小

代码语言:javascript
复制
mysql>stop slave;
Query OK, 0 rows affected (0.01 sec)
 
mysql>set global slave_pending_jobs_size_max=1073741824;
Query OK, 0 rows affected (0.00 sec)
 
mysql>start slave;
Query OK, 0 rows affected (0.02 sec)

到此,主从复制报错的问题就解决了;

本文参与 腾讯云自媒体分享计划,分享自微信公众号。
原始发表:2021-07-07,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 DBA的辛酸事儿 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体分享计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档