Oracle诊断案例-Spfile案例一则
link:
http://www.eygle.com/case/spfile.htm
情况说明: 系统:SUN Solaris8 数据库版本:9203 问题描述:工程人员报告,数据库在重新启动时无法正常启动.检查发现UNDO表空间丢失. 问题诊断及解决过程如下:
1. 登陆系统检查alert.log文件 检查alert.log文件是通常是我们诊断数据库问题的第一步
SunOS 5.8 login: root Password: Last login: Thu Apr 1 11:39:16 from 10.123.7.162 Sun Microsystems Inc. SunOS 5.8 Generic Patch October 2001 You have new mail. # su - oracle bash-2.03$ cd $ORACLE_BASE/admin/*/bdump bash-2.03$ vi *.log "alert_gzhs.log" 7438 lines, 283262 characters Sat Feb 7 20:30:06 2004 Starting ORACLE instance (normal) LICENSE_MAX_SESSION = 0 LICENSE_SESSIONS_WARNING = 0 SCN scheme 3 Using log_archive_dest parameter default value LICENSE_MAX_USERS = 0 SYS auditing is disabled Starting up ORACLE RDBMS Version: 9.2.0.3.0. System parameters with non-default values: processes = 150 timed_statistics = TRUE shared_pool_size = 1157627904 large_pool_size = 16777216 java_pool_size = 637534208 control_files = /u01/oradata/gzhs/control01.ctl, /u02/oradata/gzhs/control02.ctl, /u03/oradata/gzhs/control03.ctl db_block_size = 8192 db_cache_size = 2516582400 compatible = 9.2.0.0.0 log_archive_start = TRUE log_archive_dest_1 = LOCATION=/u06/oradata/gzhs/arch log_archive_format = %t_%s.dbf db_file_multiblock_read_count= 16 fast_start_mttr_target = 300 undo_management = AUTO undo_tablespace = UNDOTBS1 undo_retention = 10800 remote_login_passwordfile= EXCLUSIVE db_domain = instance_name = gzhs dispatchers = (PROTOCOL=TCP) (SERVICE=gzhsXDB) job_queue_processes = 10 hash_join_enabled = TRUE background_dump_dest = /oracle/admin/gzhs/bdump user_dump_dest = /oracle/admin/gzhs/udump core_dump_dest = /oracle/admin/gzhs/cdump sort_area_size = 524288 db_name = gzhs open_cursors = 300 star_transformation_enabled= FALSE query_rewrite_enabled = FALSE pga_aggregate_target = 838860800 aq_tm_processes = 1 PMON started with pid=2 DBW0 started with pid=3 LGWR started with pid=4 CKPT started with pid=5 SMON started with pid=6 "alert_gzhs.log" 7438 lines, 283262 characters USER: terminating instance due to error 30012 Instance terminated by USER, pid = 26433 ORA-1092 signalled during: ALTER DATABASE OPEN... Thu Apr 1 11:11:08 2004 Starting ORACLE instance (normal) LICENSE_MAX_SESSION = 0 LICENSE_SESSIONS_WARNING = 0 SCN scheme 3 Using log_archive_dest parameter default value LICENSE_MAX_USERS = 0 SYS auditing is disabled Starting up ORACLE RDBMS Version: 9.2.0.3.0. System parameters with non-default values: processes = 150 timed_statistics = TRUE shared_pool_size = 1157627904 large_pool_size = 16777216 java_pool_size = 637534208 control_files = /u01/oradata/gzhs/control01.ctl, /u02/oradata/gzhs/control02.ctl, /u03/oradata/gzhs/control03.ctl db_block_size = 8192 db_cache_size = 2516582400 compatible = 9.2.0.0.0 log_archive_start = TRUE log_archive_dest_1 = LOCATION=/u06/oradata/gzhs/arch log_archive_format = %t_%s.dbf db_file_multiblock_read_count= 16 fast_start_mttr_target = 300 undo_management = AUTO undo_tablespace = UNDOTBS1 undo_retention = 10800 remote_login_passwordfile= EXCLUSIVE db_domain = instance_name = gzhs dispatchers = (PROTOCOL=TCP) (SERVICE=gzhsXDB) job_queue_processes = 10 hash_join_enabled = TRUE background_dump_dest = /oracle/admin/gzhs/bdump user_dump_dest = /oracle/admin/gzhs/udump core_dump_dest = /oracle/admin/gzhs/cdump sort_area_size = 524288 db_name = gzhs open_cursors = 300 star_transformation_enabled= FALSE query_rewrite_enabled = FALSE pga_aggregate_target = 838860800 aq_tm_processes = 1 PMON started with pid=2 DBW0 started with pid=3 LGWR started with pid=4 CKPT started with pid=5 SMON started with pid=6 RECO started with pid=7 CJQ0 started with pid=8 Thu Apr 1 11:11:13 2004 starting up 1 shared server(s) ... QMN0 started with pid=9 Thu Apr 1 11:11:13 2004 starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'... ARCH: STARTING ARCH PROCESSES ARC0 started with pid=12 ARC0: Archival started ARC1 started with pid=13 Thu Apr 1 11:11:13 2004 ARCH: STARTING ARCH PROCESSES COMPLETE Thu Apr 1 11:11:13 2004 ARC0: Thread not mounted Thu Apr 1 11:11:13 2004 ARC1: Archival started ARC1: Thread not mounted Thu Apr 1 11:11:14 2004 ALTER DATABASE MOUNT Thu Apr 1 11:11:18 2004 Successful mount of redo thread 1, with mount id 1088380178. Thu Apr 1 11:11:18 2004 Database mounted in Exclusive Mode. Completed: ALTER DATABASE MOUNT Thu Apr 1 11:11:27 2004 alter database open Thu Apr 1 11:11:27 2004 Beginning crash recovery of 1 threads Thu Apr 1 11:11:27 2004 Started first pass scan Thu Apr 1 11:11:28 2004 Completed first pass scan 1 redo blocks read, 0 data blocks need recovery Thu Apr 1 11:11:28 2004 Started recovery at Thread 1: logseq 177, block 2, scn 0.33104793 Recovery of Online Redo Log: Thread 1 Group 3 Seq 177 Reading mem 0 Mem# 0 errs 0: /u01/oradata/gzhs/redo03.log Thu Apr 1 11:11:28 2004 Completed redo application Thu Apr 1 11:11:28 2004 Ended recovery at Thread 1: logseq 177, block 3, scn 0.33124794 0 data blocks read, 0 data blocks written, 1 redo blocks read Crash recovery completed successfully Thu Apr 1 11:11:28 2004 LGWR: Primary database is in CLUSTER CONSISTENT mode Thread 1 advanced to log sequence 178 Thread 1 opened at log sequence 178 Current log# 1 seq# 178 mem# 0: /u01/oradata/gzhs/redo01.log Successful open of redo thread 1. Thu Apr 1 11:11:28 2004 ARC0: Evaluating archive log 3 thread 1 sequence 177 Thu Apr 1 11:11:28 2004 ARC0: Beginning to archive log 3 thread 1 sequence 177 Creating archive destination LOG_ARCHIVE_DEST_1: '/u06/oradata/gzhs/arch/1_177.dbf' Thu Apr 1 11:11:28 2004 SMON: enabling cache recovery ARC0: Completed archiving log 3 thread 1 sequence 177 Thu Apr 1 11:11:28 2004 Errors in file /oracle/admin/gzhs/udump/gzhs_ora_27781.trc: ORA-30012: /263/267/317/373/261/355/277/325/274/344 'UNDOTBS1' /262/273/264/346/324/332/273/362/300/340/320/315/262/273/325/375/310/ 267 Thu Apr 1 11:11:28 2004 Error 30012 happened during db open, shutting down database USER: terminating instance due to error 30012 Instance terminated by USER, pid = 27781 ORA-1092 signalled during: alter database open... :q .............
在警报日志末尾显示了数据库在Open状态因为错误而异常终止.
2. 尝试重新启动数据库
SQL*Plus: Release 9.2.0.3.0 - Production on 星期四 4月 1 11:43:52 2004 Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. 已连接到空闲例程。 SQL> startupORACLE 例程已经启动。 Total System Global Area 4364148184 bytesFixed Size 736728 bytesVariable Size 1845493760 bytesDatabase Buffers 2516582400 bytesRedo Buffers 1335296 bytes数据库装载完毕。ORA-01092: ORACLE 例程终止。强行断开连接
工程人员报告的问题重现.
3. 检查数据文件
bash-2.03$ cd /u01/ oradata/gzhs bash-2.03$ ls -l total 55702458 -rw-r----- 1 oracle dba 1073750016 Apr 1 11:44 WAP12_BILLINGDETAIL.dbf -rw-r----- 1 oracle dba 1073750016 Apr 1 11:44 WAP12_MAIN.dbf -rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN10.dbf -rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN11.dbf -rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN2.dbf -rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN3.dbf -rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN4.dbf -rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN5.dbf -rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN6.dbf -rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN7.dbf -rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN8.dbf -rw-r----- 1 oracle dba 2097160192 Apr 1 11:44 WAP12_MAIN9.dbf -rw-r----- 1 oracle dba 1073750016 Apr 1 11:44 WAP12_MVIEW.dbf -rw-r----- 1 oracle dba 1073750016 Mar 24 17:15 WAP12_TEMP1.dbf .........................
发现存在文件UNDOTBS2.dbf
4. mount数据库,检查系统参数
......................... |
---|
发现系统没有使用spfile,而初始化参数设置的undo表空间为UNDOTBS1
5. 检查参数文件 这个设置是极其可疑的. 怀疑参数文件和实际数据库设置不符.
6. 再次检查alert文件 查找对于UNDO表空间的操作 第一部分,创建数据库时的信息:
注意,这也是OCP教材上提到的两种创建UNDO表空间的方式之一
第二部分,发现创建UNDOTBS2的记录信息: 第三部分,新的UNDO表空间被应用 Wed Mar 24 20:24:25 2004 ALTER SYSTEM SET undo_tablespace='UNDOTBS2' SCOPE=MEMORY; 我们发现问题就在这里,创建了新的UNDO表空间以后,因为使用的是pfile文件,修改的只对当前实例生效,操作人员忘记了修改pfile文件. 如果使用spfile,缺省的修改范围是both,会同时修改spfile文件,就可以避免以上问题的出现. 第四部分,删除了UNDOTBS1的信息
这样再次重新启动数据库的时候,问题出现了,pfile中定义的UNDOTBS1找不到了,而且操作实在很久以前,没人能回忆起来,甚至无法得知是什么人的操作。
7. 更改pfile,启动数据库 修改undo表空间
########################################### # System Managed Undo and Rollback Segments ########################################### undo_management=AUTO undo_retention=10800 undo_tablespace=UNDOTBS2 .... bash-2.03$ sqlplus "/ as sysdba" SQL*Plus: Release 9.2.0.3.0 - Production on 星期四 4月 1 11:55:11 2004 Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. 连接到: Oracle9i Enterprise Edition Release 9.2.0.3.0 - 64bit Production With the Partitioning, OLAP and Oracle Data Mining options JServer Release 9.2.0.3.0 - Production SQL> select * from v$version; BANNER ---------------------------------------------------------------- Oracle9i Enterprise Edition Release 9.2.0.3.0 - 64bit Production PL/SQL Release 9.2.0.3.0 - Production CORE 9.2.0.3.0 Production TNS for Solaris: Version 9.2.0.3.0 - Production NLSRTL Version 9.2.0.3.0 - Production SQL> exit 从Oracle9i Enterprise Edition Release 9.2.0.3.0 - 64bit Production With the Partitioning, OLAP and Oracle Data Mining options JServer Release 9.2.0.3.0 - Production中断开 bash-2.03$
在这里我们可以看到,使用spfile可以免去手工修改pfile文件的麻烦,减少了犯错的可能。 既然Oracle9i给我们提供了这个新特性,就值得我们学习使用它.
本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。
我来说两句