
Log file waits caused by undersized online redo log files

Leshami
Published 2018-08-14 10:01:57
Part of the column: 乐沙弥的世界

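Oracle online redo log files record every change made to the database (DML, DDL, and structural changes made by an administrator). After an accidental deletion or a crash, they are used to recover the data and preserve its integrity. Poorly planned online redo logs, however, lead to log-related wait events. The following example comes from a production environment.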

1. Problem description

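--The customer reports that this database is used at night for data synchronization and aggregation. It used to run well, but as the volume of data to synchronize has grown, it has recently become slower and slower.
--We first took the customer's AWR report covering 8 p.m. to 7 a.m. the next day.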

WORKLOAD REPOSITORY report for

DB Name         DB Id    Instance     Inst Num Release     RAC Host
------------ ----------- ------------ -------- ----------- --- ------------
ST990         2152526631 ST990               1 10.2.0.3.0  NO  v2011db02p

              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:     21787 21-Feb-13 20:00:22        50      19.5
  End Snap:     21798 22-Feb-13 07:00:47        44      20.0
   Elapsed:              660.42 (mins)
   DB Time:              928.06 (mins)

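--From the AWR report above: a single instance, version 10.2.0.3, with few sessions during the period
--Elapsed < DB Time
--Elapsed time = (2013-02-22 07:00:00 - 2013-02-21 20:00:00) ≈ 660 minutes
--DB Time = 928.06 minutes. The host has 16 CPU cores, so about 660*16 = 10560 CPU minutes were available; sessions spent 928.06 minutes on CPU and in non-idle waits inside Oracle, roughly 8.8% of that capacity
--So the system as a whole is fairly idle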

--Next, look at the top events
Top 5 Timed Events                                         Avg %Total
~~~~~~~~~~~~~~~~~~                                        wait   Call
Event                                 Waits    Time (s)   (ms)   Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
CPU time                                         20,673          37.1
log file parallel write              27,399       4,797    175    8.6 System I/O
control file parallel write          13,428       4,688    349    8.4 System I/O
log file sync                        19,564       3,795    194    6.8     Commit
db file scattered read           26,651,537       3,439      0    6.2   User I/O

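--In the top events, the log file related waits stand out
--log file parallel write: 27,399 waits averaging 175 ms each, total wait time 4,797/60 = 79.95 minutes, more than an hour, which is considerable
--Next come the control file parallel write and log file sync waits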

--The detailed wait event information follows
Wait Events                          DB/Inst: ST1200/ST1200  Snaps: 21787-21798
-> s  - second
-> cs - centisecond -     100th of a second
-> ms - millisecond -    1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)

                                             %Time  Total Wait    wait     Waits
Event                                 Waits  -outs    Time (s)    (ms)      /txn
---------------------------- -------------- ------ ----------- ------- ---------
log file parallel write              27,399     .0       4,797     175       1.1
control file parallel write          13,428     .0       4,688     349       0.5
log file sync                        19,564   10.6       3,795     194       0.8
db file scattered read           26,651,537     .0       3,439       0   1,049.4
db file sequential read           6,682,373     .0       1,567       0     263.1
log file switch (checkpoint           1,091   92.9       1,019     934       0.0
Datapump dump file I/O              633,458     .0         286       0      24.9
log file switch completion              332   31.6         183     552       0.0
log buffer space                        255   47.8         155     608       0.0
free buffer waits                     2,409   99.5         120      50       0.1
buffer busy waits                       145   62.8          96     664       0.0
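
The capacity arithmetic in the comments above can be reproduced with a short script. This is a minimal sketch in Python, not part of the original analysis: the function name `load_summary` is made up for illustration, the 16-core figure comes from the author's comments rather than from the report header, and comparing DB Time with elapsed time multiplied by the core count is only a rough load indicator, because DB Time also includes non-idle wait time.

```python
# Figures copied from the AWR excerpt above.
ELAPSED_MIN = 660.42
DB_TIME_MIN = 928.06
CPU_COUNT = 16        # stated in the author's comments, not in the report header
CPU_TIME_S = 20_673   # "CPU time" row of Top 5 Timed Events


def load_summary(elapsed_min, db_time_min, cpu_count, cpu_time_s):
    """Return rough load indicators derived from the AWR header figures."""
    capacity_min = elapsed_min * cpu_count  # CPU minutes available in the window
    return {
        "avg_active_sessions": db_time_min / elapsed_min,
        "cpu_capacity_min": capacity_min,
        "db_time_vs_capacity_pct": 100 * db_time_min / capacity_min,
        "cpu_busy_pct": 100 * (cpu_time_s / 60) / capacity_min,
    }


if __name__ == "__main__":
    summary = load_summary(ELAPSED_MIN, DB_TIME_MIN, CPU_COUNT, CPU_TIME_S)
    for name, value in summary.items():
        print(f"{name}: {value:.2f}")
```

On these figures the average number of active sessions is about 1.4 and DB Time is under 9% of the 16-core capacity, which agrees with the conclusion that the host is lightly loaded.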

2. Problem analysis
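
As a cross-check on the wait event table in section 1, the sketch below expresses each event as a share of DB Time and adds up the redo-related ones. It is an illustration under stated assumptions: the helper names are hypothetical, the truncated event name `log file switch (checkpoint` is assumed to be `log file switch (checkpoint incomplete)`, and the sum is only a rough indicator, since `log file parallel write` is waited on by LGWR in the background and its time overlaps with the `log file sync` time seen by foreground sessions.

```python
DB_TIME_S = 928.06 * 60  # DB Time from the AWR header, converted to seconds

# (event name, total wait time in seconds) copied from the Wait Events table.
WAITS = [
    ("log file parallel write", 4797),
    ("control file parallel write", 4688),
    ("log file sync", 3795),
    ("db file scattered read", 3439),
    ("db file sequential read", 1567),
    ("log file switch (checkpoint incomplete)", 1019),  # name assumed, truncated in the report
    ("log file switch completion", 183),
    ("log buffer space", 155),
]

# Events whose names start with these prefixes are treated as redo-related.
REDO_PREFIXES = ("log file", "log buffer")


def pct_of_db_time(seconds, db_time_s=DB_TIME_S):
    """Express a wait time as a percentage of DB Time."""
    return 100 * seconds / db_time_s


def redo_related_seconds(waits):
    """Sum the wait time of all redo-related events."""
    return sum(sec for name, sec in waits if name.startswith(REDO_PREFIXES))


if __name__ == "__main__":
    for name, sec in WAITS:
        print(f"{name:42s} {pct_of_db_time(sec):5.1f}%")
    total = redo_related_seconds(WAITS)
    print(f"redo-related total: {total} s, {pct_of_db_time(total):.1f}% of DB Time")
```

The per-event shares reproduce the %Total Call Time column of the report (8.6, 8.4 and 6.8 for the first three waits), and the redo-related events together come to roughly 18% of DB Time.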


3. Log file wait events

log file parallel write

The log file parallel write wait event has three parameters: files, blocks, and requests. In Oracle Database 10g, this wait event falls under the System I/O wait class. Keep the following key thoughts in mind when dealing with the log file parallel write wait event.

    The log file parallel write event belongs only to the LGWR process.

    A slow LGWR can impact the commit time of foreground processes.

    Significant log file parallel write wait time is most likely an I/O issue.

log file sync

The log file sync wait event has one parameter: buffer#. In Oracle Database 10g, this wait event falls under the Commit wait class. Keep the following key thoughts in mind when dealing with the log file sync wait event.

    The log file sync wait event is related to transaction terminations (commits or rollbacks).

    When a process spends a lot of time on the log file sync event, it is usually indicative of too many commits or short transactions.
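
Because log file sync is tied to how often sessions commit, one common way to reduce it is to commit once per batch of rows rather than once per row. The sketch below illustrates the idea only. It uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for Oracle, and the table `sync_target` and the function `load_rows` are hypothetical names. In Oracle, each commit makes the session wait on log file sync until LGWR has written the redo, so fewer commits mean fewer such waits.

```python
import sqlite3


def load_rows(conn, rows, batch_size=1000):
    """Insert rows, committing once per batch instead of once per row.

    Returns the number of commits issued.
    """
    commits = 0
    cur = conn.cursor()
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        cur.executemany(
            "INSERT INTO sync_target (id, payload) VALUES (?, ?)", batch
        )
        conn.commit()  # one commit per batch
        commits += 1
    return commits


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sync_target (id INTEGER PRIMARY KEY, payload TEXT)")
    rows = [(i, f"row-{i}") for i in range(10_000)]
    print("commits issued:", load_rows(conn, rows, batch_size=1000))
```

The batch size is a trade-off: larger batches mean fewer commits but longer transactions and more work to redo if a batch fails.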

log file switch (checkpoint incomplete)

The log file switch (checkpoint incomplete) wait event has no wait parameters. In Oracle Database 10g, this wait event falls under the Configuration wait class. Keep the following key thought in mind when dealing with the log file switch (checkpoint incomplete) wait event.

    Excessive log switches caused by small log files and a high transaction rate
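
To make "small log files and a high transaction rate" concrete, a log file size can be estimated from the redo generation rate and a target interval between switches. The sketch below is a rough sizing aid, not a method taken from the article. The redo rate used in the example is hypothetical (the real value would come from the "Redo size" line of the AWR Load Profile, which is not shown in the excerpt), and the 20-minute target is a commonly quoted rule of thumb rather than a figure from the source.

```python
import math


def redo_log_size_mb(redo_bytes_per_sec, target_switch_minutes=20, round_to_mb=50):
    """Estimate a redo log file size in MB.

    The size is chosen so that one log file lasts about
    target_switch_minutes at the given redo rate, rounded up to a
    multiple of round_to_mb.
    """
    needed_mb = redo_bytes_per_sec * target_switch_minutes * 60 / (1024 * 1024)
    return math.ceil(needed_mb / round_to_mb) * round_to_mb


if __name__ == "__main__":
    # Hypothetical redo rate of 200,000 bytes per second.
    print(redo_log_size_mb(200_000), "MB")
```

A peak-period redo rate is the safer input, since an overnight batch window generates redo much faster than the daily average.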

For more detail, see Oracle Wait Interface: A Practical Guide to Performance Diagnostics & Tuning.

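4. Recommendations and solutions

a. Based on the analysis above and the descriptions of the log-related wait events, the first step is to increase the redo log file size (200-250 MB). See the article 调整联机重做日志大小(change redo log size).
b. There are too many log file groups; reduce them to 4-5 groups.
c. Where possible, move the redo logs to faster disks (they are currently on RAID 5), for example RAID 0, or RAID 1+0 if redundancy is needed.
d. Commit transactions in batches.
e. Consider increasing the number of DBWn processes.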
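
Recommendations a and b amount to adding larger log groups and dropping the old ones. The sketch below only generates the SQL text for that procedure; it does not connect to a database. The group numbers, file paths and the function name `resize_redo_statements` are hypothetical, and the output should be reviewed against the actual `V$LOG` status before anything is run, since a group can be dropped only when it is inactive (and archived, if the database runs in ARCHIVELOG mode).

```python
def resize_redo_statements(old_groups, new_groups, size_mb):
    """Build SQL statements that add larger redo log groups, then drop old ones.

    old_groups: iterable of existing group numbers to drop.
    new_groups: dict mapping a new group number to its member file path.
    size_mb:    size of each new log file in megabytes.
    """
    stmts = []
    for group, path in sorted(new_groups.items()):
        stmts.append(
            f"ALTER DATABASE ADD LOGFILE GROUP {group} ('{path}') SIZE {size_mb}M;"
        )
    # Reminder emitted as a SQL comment: old groups must be INACTIVE first.
    stmts.append(
        "-- Repeat ALTER SYSTEM SWITCH LOGFILE / ALTER SYSTEM CHECKPOINT "
        "until each old group is INACTIVE in V$LOG"
    )
    for group in sorted(old_groups):
        # Dropping a group does not delete its files at the OS level
        # unless Oracle Managed Files is in use.
        stmts.append(f"ALTER DATABASE DROP LOGFILE GROUP {group};")
    return stmts


if __name__ == "__main__":
    # Hypothetical layout: replace groups 1-3 with four 250 MB groups.
    new = {n: f"/u01/oradata/redo{n}.log" for n in range(11, 15)}
    for stmt in resize_redo_statements([1, 2, 3], new, 250):
        print(stmt)
```

Generating the statements as text keeps the step reviewable: they can be checked and then run manually in SQL*Plus one at a time.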

Originally published: 2013-03-25
