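In PostgreSQL, the state of a buffer is updated very frequently, especially by pin/unpin operations.
These operations work like reference counting, with the count recorded in an integer: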
buf_state += BUF_REFCOUNT_ONE
buf_state -= BUF_REFCOUNT_ONE
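An increment like this is a read-modify-write sequence and is not safe under concurrency, so it has to be protected by a lock or done with an atomic operation.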
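To make the counter arithmetic concrete, here is a minimal C sketch of how a reference count and a usage count can share one 32-bit word. The DEMO_* names and the 18-bit/4-bit split are assumptions made for illustration, loosely modeled on PostgreSQL's buffer header and possibly different from any given version. The updates here are plain, non-atomic arithmetic, which is exactly what becomes unsafe once several backends touch the same word.

```c
#include <stdint.h>

/* Hypothetical layout: low 18 bits = refcount, next 4 bits = usage count. */
#define DEMO_REFCOUNT_ONE     1U
#define DEMO_REFCOUNT_MASK    ((1U << 18) - 1)
#define DEMO_USAGECOUNT_SHIFT 18
#define DEMO_USAGECOUNT_ONE   (1U << DEMO_USAGECOUNT_SHIFT)
#define DEMO_USAGECOUNT_MASK  (0xFU << DEMO_USAGECOUNT_SHIFT)

/* Extract the refcount field from a packed state word. */
static inline uint32_t demo_get_refcount(uint32_t state)
{
    return state & DEMO_REFCOUNT_MASK;
}

/* Extract the usage count field from a packed state word. */
static inline uint32_t demo_get_usagecount(uint32_t state)
{
    return (state & DEMO_USAGECOUNT_MASK) >> DEMO_USAGECOUNT_SHIFT;
}

/* Non-atomic pin followed by unpin: fine single-threaded, racy otherwise. */
static inline uint32_t demo_pin_then_unpin(uint32_t state)
{
    state += DEMO_REFCOUNT_ONE;    /* pin: refcount + 1 */
    state += DEMO_USAGECOUNT_ONE;  /* record the access: usage count + 1 */
    state -= DEMO_REFCOUNT_ONE;    /* unpin: refcount - 1 */
    return state;
}
```

Because each field occupies its own bit range, adding or subtracting the matching "ONE" constant changes one counter without disturbing the other, as long as neither field overflows.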
(Background)
Early versions of PG took a spinlock around the pin increment, which performed poorly under concurrency.
PinBuffer
...
...
LockBufHdr(buf);
buf->refcount++;
...
...
if (buf->usage_count < BM_MAX_USAGE_COUNT)
    buf->usage_count++;
result = (buf->flags & BM_VALID) != 0;
UnlockBufHdr(buf);
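The locking pattern in this older code can be approximated with a tiny test-and-set spinlock. This is only a sketch in C11 using `<stdatomic.h>`: DemoBuf, demo_lock, demo_unlock, demo_pin_locked and the cap of 5 are hypothetical, and PostgreSQL's real LockBufHdr/UnlockBufHdr are implemented differently.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_MAX_USAGE_COUNT 5   /* assumed cap, for illustration only */

typedef struct DemoBuf
{
    atomic_flag lock;        /* header spinlock */
    unsigned    refcount;    /* protected by lock */
    unsigned    usage_count; /* protected by lock */
    bool        valid;       /* protected by lock */
} DemoBuf;

static inline void demo_lock(DemoBuf *buf)
{
    /* Spin until the flag was previously clear, i.e. we acquired it. */
    while (atomic_flag_test_and_set_explicit(&buf->lock, memory_order_acquire))
        ;
}

static inline void demo_unlock(DemoBuf *buf)
{
    atomic_flag_clear_explicit(&buf->lock, memory_order_release);
}

/* Pin under the spinlock; returns whether the buffer is valid. */
static inline bool demo_pin_locked(DemoBuf *buf)
{
    bool result;

    demo_lock(buf);
    buf->refcount++;
    if (buf->usage_count < DEMO_MAX_USAGE_COUNT)
        buf->usage_count++;
    result = buf->valid;
    demo_unlock(buf);
    return result;
}
```

Every pin, even of a buffer nobody is modifying, has to acquire and release the lock, so concurrent readers of a hot buffer serialize on it.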
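Later versions use the CAS function pg_atomic_compare_exchange_u32 to do a compare-and-swap instead, which keeps the update atomic without giving up performance (about an 8x improvement in high-concurrency read-only workloads).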
old_buf_state = pg_atomic_read_u32(&buf->state);
for (;;)
{
    if (old_buf_state & BM_LOCKED)
        old_buf_state = WaitBufHdrUnlocked(buf);
    buf_state = old_buf_state;
    /* increase refcount */
    buf_state += BUF_REFCOUNT_ONE;
    /* increase usagecount unless already max */
    if (BUF_STATE_GET_USAGECOUNT(buf_state) != BM_MAX_USAGE_COUNT)
        buf_state += BUF_USAGECOUNT_ONE;
    if (pg_atomic_compare_exchange_u32(&buf->state, &old_buf_state,
                                       buf_state))
    {
        result = (buf_state & BM_VALID) != 0;
        break;
    }
}
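The same loop can be tried outside PostgreSQL with C11 atomics. The following is a minimal sketch, not PostgreSQL's code: `atomic_compare_exchange_weak` stands in for pg_atomic_compare_exchange_u32, the DEMO_* constants and bit layout are assumptions for illustration, and the BM_LOCKED wait is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: bits 0-17 refcount, bits 18-21 usage count, bit 24 valid. */
#define DEMO_REFCOUNT_ONE     1U
#define DEMO_REFCOUNT_MASK    ((1U << 18) - 1)
#define DEMO_USAGECOUNT_SHIFT 18
#define DEMO_USAGECOUNT_ONE   (1U << DEMO_USAGECOUNT_SHIFT)
#define DEMO_USAGECOUNT_MASK  (0xFU << DEMO_USAGECOUNT_SHIFT)
#define DEMO_MAX_USAGE_COUNT  5U
#define DEMO_VALID            (1U << 24)

static inline uint32_t demo_usagecount(uint32_t state)
{
    return (state & DEMO_USAGECOUNT_MASK) >> DEMO_USAGECOUNT_SHIFT;
}

/* Lock-free pin: returns true if the buffer was marked valid. */
static inline bool demo_pin(_Atomic uint32_t *state)
{
    uint32_t old_state = atomic_load(state);

    for (;;)
    {
        uint32_t new_state = old_state;

        /* increase refcount */
        new_state += DEMO_REFCOUNT_ONE;
        /* increase usage count unless already at the cap */
        if (demo_usagecount(new_state) != DEMO_MAX_USAGE_COUNT)
            new_state += DEMO_USAGECOUNT_ONE;

        /* On failure, old_state is refreshed with the current value. */
        if (atomic_compare_exchange_weak(state, &old_state, new_state))
            return (new_state & DEMO_VALID) != 0;
    }
}
```

Both counters and the validity check are handled with a single compare-and-swap on one word, which is why no separate lock is needed on this path.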
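The idiomatic way to use PG's CAS function:
read OLD with an atomic read
while (1)
    compute NEW from OLD, without modifying OLD
    if (pg_atomic_compare_exchange_u32(shared variable, &OLD, NEW))
        break;
    (on failure the function has stored the current value in OLD, so the loop simply retries)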
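The idiom is not specific to buffer pins. As a sketch using C11 `<stdatomic.h>` in place of PostgreSQL's atomics API, with a hypothetical function name, here it is applied to a different update: raising a shared value to at least a given candidate.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically set *shared to max(*shared, candidate).
 * Returns the value that was stored before the update. */
static inline uint32_t demo_atomic_max(_Atomic uint32_t *shared, uint32_t candidate)
{
    uint32_t old_val = atomic_load(shared);   /* step 1: read OLD */

    for (;;)
    {
        /* step 2: derive NEW from OLD, leaving OLD untouched */
        uint32_t new_val = old_val > candidate ? old_val : candidate;

        /* step 3: publish NEW only if the shared value still equals OLD;
         * otherwise old_val now holds the fresh value and we retry */
        if (atomic_compare_exchange_weak(shared, &old_val, new_val))
            return old_val;
    }
}
```

OLD must stay unmodified between the read and the compare-and-swap because it is the value the swap compares against. Changing it by hand would make the comparison meaningless, and a concurrent update could be silently overwritten.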