About Private strand flush not complete

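A reader sent me an alert log. The original issue was a deadlock, but the log also showed a second problem: while redo was being written from the redo log buffer to the redo log file, a new log could not be allocated and the message Private strand flush not complete was reported. This is a redo log topic, and Metalink has a note describing it, quoted below.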

1. Error message

Tue Sep 24 14:27:48 2013
Thread 1 cannot allocate new log, sequence 22120
Private strand flush not complete
  Current log# 4 seq# 22119 mem# 0: /u01/app/oracle/oradata/orcl/redo04.log

2. Metalink's description (Doc ID 372557.1)

Oracle Database - Enterprise Edition - Version 10.2.0.1 to 11.2.0.3 [Release 10.2 to 11.2]
Information in this document applies to any platform.
Private strand flush not complete

Symptoms

"Private strand flush not complete" messages are being populated to the alert log, example: Mon Jan 23 16:09:36 2012 Thread 1 cannot allocate new log, sequence 18358 Private strand flush not complete Current log# 7 seq# 18357 mem# 0: /u03/oradata/bitst/redo07.log Thread 1 advanced to log sequence 18358 Current log# 8 seq# 18358 mem# 0: /u03/oradata/bitst/redo08.log

Changes

When you switch logs all private strands have to be flushed to the current log before the switch is allowed to proceed.

--> Before a log switch, all private strands must be written to the current redo log file.

Cause

The message means that we haven't completed writing all the redo information to the log when we are trying to switch. It is similar in nature to a "checkpoint not complete" except that is only involves the redo being written to the log. The log switch can not occur until all of the redo has been written.

--> That is, the switch was attempted before all redo had been written to the log file. The message is similar in nature to "checkpoint not complete", except that it only concerns redo that is being written to the log.

A "strand" is new terminology for 10g and it deals with latches for redo.

--> "Strand" is a new term for how redo latches are handled.

Strands are a mechanism to allow multiple allocation latches for processes to write redo more efficiently in the redo buffer and is related to the log_parallelism parameter present in 9i. The concept of a strand is to ensure that the redo generation rate for an instance is optimal and that when there is some kind of redo contention then the number of strands is dynamically adjusted to compensate.

--> The main purpose is to keep the redo generation rate optimal and, when redo contention appears, to adjust the number of strands dynamically to compensate.

The initial allocation for the number of strands depends on the number of CPU's and is started with 2 strands with one strand for active redo generation. For large scale enterprise systems the amount of redo generation is large and hence these strands are *made active* as and when the foregrounds encounter this redo contention (allocated latch related contention) when this concept of dynamic strands comes into play. There is always shared strands and a number of private strands. Oracle 10g has some major changes in the mechanisms for redo (and undo), which seem to be aimed at reducing contention.

--> 10g changed a lot here, and the main goal is still to reduce contention.

Instead of redo being recorded in real time, it can be recorded 'privately' and pumped into the redo log buffer on commit. Similarly the undo can be generated as 'in memory undo' and applied in bulk. This affect the memory used for redo management and the possibility to flush it in pieces. The message you get is related to internal Cache Redo File management. ...You can disregard these messages as normal messages.

--> These can be treated as routine messages and ignored.

Solution

These messages are not a cause for concern unless there is a significant time gap between the "cannot allocate new log" message and the "advanced to log sequence" message.

--> If there is a noticeable time gap between "cannot allocate new log" and "advanced to log sequence", consider increasing db_writer_processes.

Increasing the value for db_writer_processes can in some situations help to avoid the message from being generated. Why, because one of the DBWR main function is to keep the buffer cache clean by writing out dirty buffer blocks. So having multiple db_writer_processes should be able to produce a higher throughput.

Finally, these messages have also been seen when there are issues with the storage side or network for the archive log destination, as this leads to delay or hang in LGWR switch.
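
To judge whether the gap mentioned in the Solution is significant, the two messages can be paired up and timed. The following is a minimal Python sketch, not an Oracle tool: the function names `parse_timestamp` and `switch_delays` are made up for illustration, and it assumes the alert log uses the timestamp layout shown in the samples above (for example "Mon Jan 23 16:09:36 2012") with English day and month names.

```python
import re
from datetime import datetime

# Timestamp lines in the sample alert log look like "Mon Jan 23 16:09:36 2012".
TS_FORMAT = "%a %b %d %H:%M:%S %Y"
CANNOT_ALLOCATE = re.compile(r"Thread (\d+) cannot allocate new log, sequence (\d+)")
ADVANCED = re.compile(r"Thread (\d+) advanced to log sequence (\d+)")


def parse_timestamp(line):
    """Return a datetime if the line is an alert-log timestamp, else None."""
    try:
        return datetime.strptime(line.strip(), TS_FORMAT)
    except ValueError:
        return None


def switch_delays(lines):
    """Return (thread, sequence, delay_seconds) for every log switch that was
    preceded by a 'cannot allocate new log' message for the same sequence."""
    current_ts = None
    pending = {}  # (thread, sequence) -> time of the "cannot allocate" message
    delays = []
    for line in lines:
        ts = parse_timestamp(line)
        if ts is not None:
            current_ts = ts
            continue
        if current_ts is None:
            continue  # no timestamp seen yet, nothing to measure against
        match = CANNOT_ALLOCATE.search(line)
        if match:
            pending[(int(match.group(1)), int(match.group(2)))] = current_ts
            continue
        match = ADVANCED.search(line)
        if match:
            key = (int(match.group(1)), int(match.group(2)))
            if key in pending:
                waited = (current_ts - pending.pop(key)).total_seconds()
                delays.append((key[0], key[1], waited))
    return delays


if __name__ == "__main__":
    # The excerpt quoted in the Symptoms section: both messages share one timestamp.
    excerpt = [
        "Mon Jan 23 16:09:36 2012",
        "Thread 1 cannot allocate new log, sequence 18358",
        "Private strand flush not complete",
        "  Current log# 7 seq# 18357 mem# 0: /u03/oradata/bitst/redo07.log",
        "Thread 1 advanced to log sequence 18358",
        "  Current log# 8 seq# 18358 mem# 0: /u03/oradata/bitst/redo08.log",
    ]
    for thread, sequence, waited in switch_delays(excerpt):
        print(thread, sequence, waited)
```

Because the alert log only stamps whole seconds, a delay of zero simply means both messages fell under the same timestamp, which is the harmless case the note describes.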

3. Further thoughts

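In a highly concurrent, multi-user database, every process protects data integrity and consistency by writing redo into the redo log buffer. Access to the redo log buffer is managed with latches. Two latches matter most for redo: the redo allocation latch and the redo copy latch. The former is used to allocate space in the redo log buffer for new redo; the latter is held while redo is copied from the PGA into the redo log buffer. The flow of redo generation is as follows.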

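User process generates redo (in the PGA) ====> server process gets a redo copy latch (there are several; the number depends on CPU_COUNT*2) ====> server process gets the redo allocation latch (only one) ====> allocates space in the log buffer ====> releases the redo allocation latch ====> writes the redo entry into the log buffer ====> releases the redo copy latch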
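
The ordering of the two latches in this flow can be illustrated with a toy Python model. This is only a sketch of the locking pattern, not Oracle's implementation: the class `LegacyLogBuffer`, the helper `run_demo`, and the use of Python threads in place of server processes are all assumptions made for illustration.

```python
import threading


class LegacyLogBuffer:
    """Toy model of the legacy redo path: one allocation latch, several copy latches."""

    def __init__(self, size, cpu_count=2):
        self.buf = bytearray(size)
        self.next_free = 0
        # A single redo allocation latch serialises space reservation.
        self.allocation_latch = threading.Lock()
        # Several redo copy latches (CPU_COUNT * 2 in the text) allow parallel copies.
        self.copy_latches = threading.BoundedSemaphore(cpu_count * 2)

    def write_redo(self, entry):
        """Copy one redo entry (bytes) into the buffer and return its offset."""
        with self.copy_latches:              # get a redo copy latch
            with self.allocation_latch:      # get the redo allocation latch
                start = self.next_free       # reserve space in the log buffer
                end = start + len(entry)
                if end > len(self.buf):
                    # A real instance would wait for LGWR to free space instead.
                    raise MemoryError("toy log buffer is full")
                self.next_free = end
            # The allocation latch is released; only the copy latch is still held.
            self.buf[start:end] = entry      # copy the entry into the reserved space
            return start                     # the copy latch is released on exit


def run_demo(workers=8, entries_per_worker=100):
    """Let several threads write 4-byte entries concurrently; return the buffer and offsets."""
    log = LegacyLogBuffer(size=workers * entries_per_worker * 4)
    offsets = []
    offsets_lock = threading.Lock()

    def worker(worker_id):
        payload = bytes([worker_id + 1]) * 4
        for _ in range(entries_per_worker):
            offset = log.write_redo(payload)
            with offsets_lock:
                offsets.append(offset)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log, offsets
```

The point of the split is that the allocation latch is held only long enough to advance a pointer, while the slower copy happens under a copy latch that other processes do not have to queue behind.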

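As noted in Doc ID 372557.1 above, the log_parallelism mechanism was introduced from Oracle 9.2 onward. When this parameter is greater than 1, the database allocates multiple shared redo log buffers. In other words, the redo log buffer is subdivided, and each shared buffer is protected by its own redo allocation latch to improve redo concurrency. These shared redo log buffers are called shared strands. From 10gR2 onward there is also the private strand, which is allocated from the shared pool rather than from the log buffer. Private strands are many small private memory areas, typically around 64 KB to 128 KB each, and each one is protected by its own redo allocation latch. Each small transaction binds to a separate, free private redo strand, so a strand serves one active transaction at a time. With this mechanism, once a user process has obtained a private strand, its redo is no longer staged in the PGA, so the redo copy latch step is no longer needed while redo is generated. If a new transaction cannot get the redo allocation latch of any private strand, it falls back to the old redo buffer mechanism and writes into a shared strand. With the new mechanism, redo generation changes as follows: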

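New transaction starts ====> requests the redo allocation latch of a private strand (if that fails, requests the redo allocation latch of a shared strand) ====> generates redo entries in the private strand ====> flush/commit ====> requests a redo copy latch ====> LGWR writes the redo entries to the log file in bulk ====> releases the redo copy latch ====> releases the private strand's redo allocation latch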
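
The bind-or-fall-back behaviour can also be sketched as a toy model in Python. Everything here is hypothetical (the `PrivateStrand` and `RedoModel` classes and their methods are invented for illustration), and the flush target is simplified to a single shared list standing in for the public log buffer, following the quoted note's description of private redo being "pumped into the redo log buffer on commit". It is not Oracle's code.

```python
import threading


class PrivateStrand:
    """A small private redo area guarded by its own allocation latch."""

    def __init__(self):
        self.latch = threading.Lock()
        self.entries = []


class RedoModel:
    """Toy model: private strands with a shared-strand fallback."""

    def __init__(self, n_private=2):
        self.private = [PrivateStrand() for _ in range(n_private)]
        self.shared_latch = threading.Lock()
        self.shared = []  # stands in for the shared strand / public log buffer

    def begin(self):
        """Bind a new transaction to a free private strand.
        Returns None when every strand is busy (shared-strand fallback)."""
        for strand in self.private:
            if strand.latch.acquire(blocking=False):
                return strand
        return None

    def add_redo(self, strand, entry):
        """Record one redo entry for a transaction."""
        if strand is None:
            # Fallback: the entry goes straight to the shared strand under its latch.
            with self.shared_latch:
                self.shared.append(entry)
        else:
            # Private path: the latch is already held for the whole transaction.
            strand.entries.append(entry)

    def commit(self, strand):
        """Flush a private strand in bulk and free it for the next transaction."""
        if strand is None:
            return
        try:
            with self.shared_latch:
                self.shared.extend(strand.entries)
            strand.entries.clear()
        finally:
            strand.latch.release()
```

With a single private strand, for example, the first call to `begin()` returns the strand and a second concurrent call returns `None`, so the second transaction's redo lands in the shared area immediately while the first transaction's redo only appears there at commit.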

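With this mechanism, when redo is written out to the log file, LGWR has to write the contents of both the shared strands and the private strands. When a redo flush occurs, all public redo allocation latches must be acquired, the redo copy latches of all public strands must be checked, and all private strands that contain active transactions must be held.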

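From the above, the Private strand flush not complete message can in some situations be avoided by increasing the db_writer_processes parameter (the number of DBWn processes), because DBWn triggers LGWR to write redo to the log file.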
