
How Replication Conflicts Work on a PostgreSQL Standby

mingjie · Published 2022-05-12 (originally posted 2020-04-01)

1 Summary

The error looks something like this:

FATAL:  terminating connection due to conflict with recovery  
DETAIL:  User query might have needed to see row versions that must be removed.  
HINT:  In a moment you should be able to reconnect to the database and repeat your command.  

1.1 Cause

The error occurs when a long transaction or a long-running SQL query on the standby conflicts with WAL apply (replay) on the standby. If the application runs long transactions or long queries against the standby while the primary keeps modifying the same rows, the standby easily reaches a point where its WAL cannot be applied.

When WAL cannot be applied, the database has two strategies:

  1. The standby tells the primary which row versions it still needs, and the primary retains them (this is what hot_standby_feedback does). Standby queries then always find the versions they need, so apply is never blocked.
  2. WAL apply on the standby waits until the conflicting standby query finishes, then resumes; a wait timeout can be set via max_standby_streaming_delay (a monitoring sketch follows this list).
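
Either way, it helps to know how often conflicts actually fire. A minimal sketch, run on the standby, using the stock pg_stat_database_conflicts view:

-- On the standby: count cancelled queries per database, by conflict type.
SELECT datname,
       confl_snapshot,   -- stale row versions removed (the error quoted above)
       confl_lock,       -- relation-lock conflicts (see test case 2 below)
       confl_bufferpin,
       confl_tablespace,
       confl_deadlock
FROM pg_stat_database_conflicts;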

1.2 Remedies

vacuum_defer_cleanup_age > 0

Cost 1: bloat on the primary, because dead row versions can only be reclaimed after they have aged past the configured number of transactions.

Cost 2: the dead versions are scanned again and again, wasting CPU in the vacuum workers. (n_dead_tup stays above the vacuum threshold, so autovacuum keeps waking workers that cannot actually reclaim anything.)

This is especially pronounced when the primary runs with a very small autovacuum_naptime together with a very small autovacuum_vacuum_scale_factor.

Cost 3: if a large amount of garbage accumulates in the meantime, those dead versions may all become reclaimable at once when the transaction horizon finally advances past them, producing an explosive burst of vacuum work and WAL, i.e. a write-IO spike.
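
A minimal sketch of enabling it (10000 is an arbitrary illustration, not a recommendation; note the parameter was removed in PostgreSQL 16):

-- On the primary: keep dead rows until they are ~10000 transactions old.
ALTER SYSTEM SET vacuum_defer_cleanup_age = 10000;
SELECT pg_reload_conf();   -- reloadable, no restart required
SHOW vacuum_defer_cleanup_age;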

hot_standby_feedback=on

If the standby runs a LONG QUERY, or a long Repeatable Read transaction, and the primary updates rows the standby still needs (or is currently querying), thereby producing garbage, the primary retains those dead versions (similar in effect to vacuum_defer_cleanup_age).

Cost: the same as for vacuum_defer_cleanup_age > 0.
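
A sketch of enabling it on the standby and observing the effect on the primary, where backend_xmin in pg_stat_replication shows the horizon the standby has asked the primary to keep:

-- On the standby: report the oldest xmin of standby queries to the primary.
ALTER SYSTEM SET hot_standby_feedback = on;
SELECT pg_reload_conf();

-- On the primary: the standby's reported horizon.
SELECT application_name, backend_xmin FROM pg_stat_replication;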

max_standby_streaming_delay

This covers both max_standby_archive_delay (WAL restored from archive) and max_standby_streaming_delay (WAL received over streaming replication).

Cost: if a standby query conflicts with apply (the recovery process), replay on the standby is delayed, so a read from the standby may return data as it stood N seconds ago.
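
A sketch of raising the wait budget and measuring how stale the standby has become ('5min' is an arbitrary illustration):

-- On the standby: let replay wait up to 5 minutes for conflicting queries
-- before cancelling them (-1 would mean wait forever).
ALTER SYSTEM SET max_standby_streaming_delay = '5min';
SELECT pg_reload_conf();

-- Rough staleness estimate: time since the last replayed commit.
SELECT now() - pg_last_xact_replay_timestamp() AS replay_delay;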

2 Test case 1

Reproducible on PG 9.4; resolved as of PG 10.

https://github.com/digoal/blog/blob/master/201704/20170410_03.md

Enable hot_standby_feedback on the standby, open a Repeatable Read (or stricter) long transaction there, then update the table on the primary: CPU and IO on the primary rise.

Related parameters

Parameter | Value | Meaning
autovacuum_naptime | 1 | Autovacuum run interval (default 60s)
autovacuum_vacuum_scale_factor | 0.005 | VACUUM is triggered once the number of updated/deleted tuples exceeds autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * table_size; can be overridden per table
autovacuum_vacuum_threshold | 50 | Used together with autovacuum_vacuum_scale_factor
vacuum_defer_cleanup_age | 0 | Cost 1: primary-side bloat, since dead versions are reclaimed only after the deferral window. Cost 2: dead versions are rescanned repeatedly, wasting vacuum-worker CPU (n_dead_tup stays above the threshold, so autovacuum keeps waking workers).
hot_standby_feedback | off | The standby feeds information about its queries back to the primary so rows still being read are not removed by primary-side changes. With a long query or long Repeatable Read transaction on the standby, the primary retains the dead versions those queries need (similar in effect to vacuum_defer_cleanup_age).
max_standby_archive_delay / max_standby_streaming_delay | - | How long replay may wait for a conflicting standby query before cancelling it (see 1.2)
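
To confirm what an instance is actually running with, the same parameters can be read back from pg_settings (a minimal sketch):

SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('autovacuum_naptime', 'autovacuum_vacuum_scale_factor',
               'autovacuum_vacuum_threshold', 'vacuum_defer_cleanup_age',
               'hot_standby_feedback', 'max_standby_streaming_delay');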

Reproduction

On the standby, hot_standby_feedback=on.

-- primary
drop table test1;
create table test1(id int, info text, crt_time timestamp);
insert into test1 select 1, md5(random()::text), now() from generate_series(1,10000000);

-- standby
begin transaction isolation level repeatable read;
select count(*) from test1;

-- primary
update test1 set info=info;
select * from pg_stat_all_tables where relname = 'test1';

The primary now shows 10 million dead tuples.

CPU usage rises by roughly 40% of one core; IO climbs and then slowly falls back.

Workaround 1 (mitigation)

Lower the vacuum frequency on the primary:

autovacuum_naptime=60
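
A sketch of applying the change without a restart (the parameter is reloadable):

-- On the primary: put autovacuum back on the default 60-second cadence.
ALTER SYSTEM SET autovacuum_naptime = '60s';
SELECT pg_reload_conf();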

The dead tuples still have not been reclaimed, but the CPU has come back down:

t1=# select * from pg_stat_all_tables where relname ='test';
-[ RECORD 1 ]-------+------------------------------
relid               | 32794
schemaname          | public
relname             | test
seq_scan            | 2
seq_tup_read        | 20000000
idx_scan            |
idx_tup_fetch       |
n_tup_ins           | 10000000
n_tup_upd           | 20000000
n_tup_del           | 0
n_tup_hot_upd       | 18
n_live_tup          | 10000000
n_dead_tup          | 10000000
n_mod_since_analyze | 0
last_vacuum         |
last_autovacuum     | 2018-11-14 19:53:57.14633+08
last_analyze        |
last_autoanalyze    | 2018-11-14 18:48:13.783304+08
vacuum_count        | 0
autovacuum_count    | 1805
analyze_count       | 0
autoanalyze_count   | 3

Workaround 2

On the standby, set hot_standby_feedback=off (the long transaction remains open).
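
A minimal sketch of applying workaround 2:

-- On the standby:
ALTER SYSTEM SET hot_standby_feedback = off;
SELECT pg_reload_conf();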

The dead tuples are reclaimed immediately:

t1=# select * from pg_stat_all_tables where relname ='test';
-[ RECORD 1 ]-------+------------------------------
relid               | 32794
schemaname          | public
relname             | test
seq_scan            | 2
seq_tup_read        | 20000000
idx_scan            |
idx_tup_fetch       |
n_tup_ins           | 10000000
n_tup_upd           | 20000000
n_tup_del           | 0
n_tup_hot_upd       | 18
n_live_tup          | 10000000
n_dead_tup          | 0
n_mod_since_analyze | 0
last_vacuum         |
last_autovacuum     | 2018-11-14 20:02:02.450611+08
last_analyze        |
last_autoanalyze    | 2018-11-14 18:48:13.783304+08
vacuum_count        | 0
autovacuum_count    | 1813
analyze_count       | 0
autoanalyze_count   | 3

3 Test case 2

-- primary
create table test1(id int, info text, crt_time timestamp);
insert into test1 select 1, md5(random()::text), now() from generate_series(1,1000);

-- standby
begin;
select count(*) from test1;

-- primary
delete from test1;

-- ps on the standby now shows the startup process waiting:
-- postgres: startup process   recovering 000000010000000800000075 waiting

-- standby
select count(*) from test1;

FATAL:  terminating connection due to conflict with recovery
DETAIL:  User was holding a relation lock for too long.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

Cause

After the delete on the primary, autovacuum kicked in and, as part of the vacuum, automatically truncated the now-empty relation (vacuum truncates empty trailing pages, which takes an ACCESS EXCLUSIVE lock):

The WAL confirms it:

DELETE off 28 KEYS_UPDATED , blkref #0: rel 1663/13212/18071 blk 9
DELETE off 29 KEYS_UPDATED , blkref #0: rel 1663/13212/18071 blk 9
DELETE off 30 KEYS_UPDATED , blkref #0: rel 1663/13212/18071 blk 9
DELETE off 31 KEYS_UPDATED , blkref #0: rel 1663/13212/18071 blk 9
DELETE off 32 KEYS_UPDATED , blkref #0: rel 1663/13212/18071 blk 9
DELETE off 33 KEYS_UPDATED , blkref #0: rel 1663/13212/18071 blk 9
DELETE off 34 KEYS_UPDATED , blkref #0: rel 1663/13212/18071 blk 9
DELETE off 35 KEYS_UPDATED , blkref #0: rel 1663/13212/18071 blk 9
DELETE off 36 KEYS_UPDATED , blkref #0: rel 1663/13212/18071 blk 9
DELETE off 37 KEYS_UPDATED , blkref #0: rel 1663/13212/18071 blk 9
......
TRUNCATE base/13212/18071 to 0 blocks flags 7
......

The primary shipped a TRUNCATE record, which conflicts with the lock held by the standby's open transaction, so the startup process begins waiting:

...
postgres: startup process   recovering 000000010000000800000075 waiting
...
postgres: wal receiver process   streaming 8/75724CF8
psql -p 3004 postgres
postgres: pg10192893 postgres [local] idle in transaction
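
To identify which standby session is holding replay back, the open transactions can be listed from pg_stat_activity (a minimal sketch; backend_xmin exists as of PG 9.4):

-- On the standby: sessions with an open transaction can block replay.
SELECT pid, state, xact_start, backend_xmin, query
FROM pg_stat_activity
WHERE state LIKE 'idle in transaction%' OR backend_xmin IS NOT NULL;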

Once the timeout expires, the standby transaction is cancelled:

FATAL:  terminating connection due to conflict with recovery
DETAIL:  User was holding a relation lock for too long.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

In fact this error is easy to produce by running DDL on the primary, i.e. any statement whose lock conflicts with an open standby transaction; the autovacuum-triggered truncate is just one way to get there.
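
For illustration, any primary-side statement taking an ACCESS EXCLUSIVE lock on test1 reproduces the same sequence; and on PostgreSQL 12+ (newer than the versions tested here) the vacuum-side trigger can be disabled per table. A sketch of both:

-- On the primary: each of these takes an ACCESS EXCLUSIVE lock on test1,
-- and its replay conflicts with the standby's open transaction.
ALTER TABLE test1 ADD COLUMN note text;
TRUNCATE test1;

-- PG 12+ only: stop VACUUM from truncating trailing empty pages,
-- removing the autovacuum-driven variant of this conflict.
ALTER TABLE test1 SET (vacuum_truncate = off);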
