我有一个由3台percona xtradb 5.5.34-55服务器组成的集群,因为它们都是可写的,所以在任何负载下都会出现死锁错误。增加wsrep_retry_autocommit
变量在一定程度上对此有帮助,但ER_LOCK_DEADLOCK
并没有完全消失。因此,我尝试将wsrep_retry_autocommit
设置为10000 (似乎是最大值),认为这会使一些查询变得非常慢,但使用ER_LOCK_DEADLOCK
时它们都不会失败
mysql-shm -ss -e 'show global variables like "%wsrep_retry_auto%"'
wsrep_retry_autocommit 10000
------------------------
LATEST DETECTED DEADLOCK
------------------------
140414 10:29:23
*** (1) TRANSACTION:
TRANSACTION 72D8, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 376, 1 row lock(s), undo log entries 1
MySQL thread id 34, OS thread handle 0x7f11840d4700, query id 982 localhost shm update
REPLACE INTO metric(host, name, userid, sampleid, type, priority) VALUES
('localhost','cpu-3/cpu-nice',8,0,0,0),('localhost','cpu-3/cpu-system',8,0,0,0),
('localhost','cpu-3/cpu-idle',8,0,0,0),('localhost','cpu-3/cpu-wait',8,0,0,0),
('localhost','cpu-3/cpu-interrupt',8,0,0,0),('localhost','cpu-3/cpu-softirq',8,0,0,0),
('localhost','cpu-3/cpu-steal',8,0,0,0),('localhost','cpu-4/cpu-user',8,0,0,0),
('localhost','cpu-4/cpu-nice',8,0,0,0),('localhost','cpu-4/cpu-system',8,0,0,0),
('localhost','cpu-4/cpu-idle',8,0,0,0),('localhost','cpu-4/cpu-wait',8,0,0,0),
('localhost','cpu-4/cpu-interrupt',8,0,0,0),('localhost','cpu-4/cpu-softirq',8,0,0,0),
('localhost','cpu-4/cpu-steal',8,0,0,0)
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 344 n bits 488 index `unique-metric` of
table `shm`.`metric` trx id 72D8 lock_mode X waiting
*** (2) TRANSACTION:
TRANSACTION 72D7, ACTIVE 0 sec updating or deleting
mysql tables in use 1, locked 1
7 lock struct(s), heap size 3112, 141 row lock(s), undo log entries 40
MySQL thread id 50, OS thread handle 0x7f1184115700, query id 980 localhost shm update
REPLACE INTO metric(host, name, userid, sampleid, type, priority) VALUES
('localhost','cpu-3/cpu-nice',8,0,0,0),('localhost','cpu-3/cpu-system',8,0,0,0),
('localhost','cpu-3/cpu-idle',8,0,0,0),('localhost','cpu-3/cpu-wait',8,0,0,0),
('localhost','cpu-3/cpu-interrupt',8,0,0,0),('localhost','cpu-3/cpu-softirq',8,0,0,0),
('localhost','cpu-3/cpu-steal',8,0,0,0),('localhost','cpu-4/cpu-user',8,0,0,0),
('localhost','cpu-4/cpu-nice',8,0,0,0),('localhost','cpu-4/cpu-system',8,0,0,0),
('localhost','cpu-4/cpu-idle',8,0,0,0),('localhost','cpu-4/cpu-wait',8,0,0,0),
('localhost','cpu-4/cpu-interrupt',8,0,0,0),('localhost','cpu-4/cpu-softirq',8,0,0,0),
('localhost','cpu-4/cpu-steal',8,0,0,0),('localhost','cpu-3/cpu-nice',8,0,0,0),
('localhost','cpu-3/cpu-system',8,0,0,0),('localhost','cpu-3/cpu-idle',8,0,0,0),
('localhost','cpu-3/cpu-wait',8,0,0,0),('localhost','cpu-3/cpu-interrupt',8,0,0,0),
('localhost','cpu-3/cpu-softirq',8,0,0,0),('localhost'
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 0 page no 344 n bits 488 index `unique-metric` of table
`shm`.`metric` trx id 72D7 lock_mode X
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 344 n bits 504 index `unique-metric` of table
`shm`.`metric` trx id 72D7 lock_mode X locks gap before rec insert intention waiting
*** WE ROLL BACK TRANSACTION (1)
难道不应该重试吗?有没有办法验证percona真的重试了10000次查询?
发布于 2014-07-15 17:48:27
你的问题没有确切的答案,但对于任何写密集型负载(如果你试图插入与可恶的Drupal相同的数据),那么死锁就会发生,我唯一的解决方案(仍在等待确认这是100% OK解决方案)-是在galera节点前面使用haproxy,并定义要使用的第一个节点(haproxy后端定义),以及要用作备份的其他2个节点。
这样,所有mysql流量都将从客户端通过haproxy流向单个galera节点,如果该节点出现故障,将使用其他节点。
希望这能帮上忙。安德里加
发布于 2018-11-13 16:28:08
在您的回答中,可伸缩性是一个问题,因为我们在一个集群中,但只使用一个节点是真正糟糕的资源使用。因此,替代方案是,您可以使用任何负载均衡器,如果它是haproxy,您可以在两个端口上创建2个侦听器,例如3306和3305;然后
列表绑定到3306将获得来自应用程序的所有写请求,后端将有节点1,然后是node2和node3作为备份;列表绑定到3305将有来自应用程序的所有读请求,其后端将具有正常指定的所有节点。因此,它的读可伸缩性和写可伸缩性是有限的,其中死锁可以减少到非常大的程度。
https://stackoverflow.com/questions/23057978
复制相似问题