前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >ORA-15040 ora-15017ASM磁盘无法挂载故障处理

ORA-15040 ora-15017ASM磁盘无法挂载故障处理

原创
作者头像
PHP开发工程师
修改2021-05-10 18:08:55
1.3K0
修改2021-05-10 18:08:55
举报
文章被收录于专栏:thinkphp+vue

测试环境突然一个节点的数据库无法启动,报错很清楚,找不到控制文件,而控制文件在ASM上,第一个节点正常,下面是处理流程。

数据库启动报错

SQL> startup

ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance

ORACLE instance started.

Total System Global Area 2684354560 bytes

Fixed Size 8624312 bytes

Variable Size 989857608 bytes

Database Buffers 1677721600 bytes

Redo Buffers 8151040 bytes

ORA-00205: error in identifying control file, check alert log for more info

告警日志:

alter database mount

2021-04-22T15:06:27.293867+08:00

NOTE: ASMB mounting group 3 (USERDATA)

NOTE: ASM background process initiating disk discovery for grp 3 (reqid:0)

WARNING: group 3 (USERDATA) has missing disks

ORA-15040: diskgroup is incomplete

WARNING: group 3 is being dismounted.

WARNING: ASMB force dismounting group 3 (USERDATA) due to missing disks

SUCCESS: diskgroup USERDATA was dismounted

2021-04-22T15:06:27.330633+08:00

ORA-00210: cannot open the specified control file

ORA-00202: control file: '+USERDATA/PROD/CONTROLFILE/current.257.1055783219'

ORA-17503: ksfdopn:2 Failed to open file +USERDATA/PROD/CONTROLFILE/current.257.1055783219

ORA-15001: diskgroup "USERDATA" does not exist or is not mounted

ORA-15040: diskgroup is incomplete

ORA-205 signalled during: alter database mount...

磁盘组状态

SQL> select name,state,total_mb,type from v$asm_diskgroup

NAME STATE TOTAL_MB TYPE

-------------------- -------------------- ---------- ------------------

ACFSDATA MOUNTED 12224 EXTERN

DATA MOUNTED 51196 EXTERN

USERDATA MOUNTED 20476 EXTERN

查看磁盘组的状态

SQL>select b.name,path,a.state,a.mount_status,a.header_status from v$asm_disk a,v$asm_diskgroup b where a.group_number=b.group_number

NAME PATH STATE MOUNT_STATUS HEADER_STATUS

-------------------- -------------------- -------------------- --------------------- ------------------------------------

USERDATA /dev/sdd1 NORMAL CACHED MEMBER

ACFSDATA /dev/sdb1 NORMAL CACHED MEMBER

DATA /dev/sdc1 NORMAL CACHED MEMBER

查看磁盘信息

Disk /dev/sdd: 21.5 GB, 21474836480 bytes, 41943040 sectors

Units = sectors of 1 * 512 = 512 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk label type: dos

Disk identifier: 0x7bf56a92

Device Boot Start End Blocks Id System

/dev/sdd1 2048 41943039 20970496 83 Linux

尝试重新mount磁盘组

SQL> alter diskgroup usedata mount;

alter diskgroup usedata mount

*

ERROR at line 1:

ORA-15032: not all alterations performed

ORA-15017: diskgroup "USEDATA" cannot be mounted

ORA-15040: diskgroup is incomplete

下面尝试mount正常的磁盘组

SQL> alter diskgroup ACFSDATA mount;

alter diskgroup ACFSDATA mount

*

ERROR at line 1:

ORA-15032: not all alterations performed

ORA-15017: diskgroup "ACFSDATA" cannot be mounted

ORA-15013: diskgroup "ACFSDATA" is already mounted

显然二者的提示不同,问题磁盘组报错15040,下马是是该错误解释

[oracle@rac2 ~]$ oerr ora 15040

15040, 00000, "diskgroup is incomplete"

// *Cause: Some of the disks comprising a diskgroup were not present.

// *Action: Check the hardware to ensure that all disks are functional. Also

// check that the setting of the ASM_DISKSTRING initialization

// parameter has not changed. Alternatively, for normal or high

// redundancy diskgroups, use MOUNT FORCE to offline missing disks

// as part of mounting the diskgroup.

找不到ASM磁盘了,但是第一个节点该磁盘组是正常的,因为只有一个ASM磁盘,我们排除是ASM磁盘头的问题

应该是第二个节点与操作系统之间有问题,也就是实例2无法识别这个ASM磁盘

节点2 userdata磁盘组dismount后的状态

SQL> l

1* select b.name,path,a.state,a.mount_status,a.header_status from v$asm_disk a,v$asm_diskgroup b where a.group_number=b.group_number

SQL> /

NAME PATH STATE MOUNT_STATUS HEADER_STATUS

-------------------- -------------------- ------------------------ --------------------- ------------------------------------

USERDATA /dev/sde1 NORMAL CLOSED CANDIDATE

USERDATA /dev/sdg1 NORMAL CLOSED CANDIDATE

USERDATA /dev/sdg NORMAL CLOSED CANDIDATE

USERDATA /dev/sdb NORMAL CLOSED CANDIDATE

USERDATA /dev/sdc NORMAL CLOSED CANDIDATE

USERDATA /dev/sdf NORMAL CLOSED CANDIDATE

USERDATA /dev/sde NORMAL CLOSED CANDIDATE

USERDATA /dev/sda2 NORMAL CLOSED CANDIDATE

USERDATA /dev/sdd NORMAL CLOSED CANDIDATE

USERDATA /dev/sda1 NORMAL CLOSED CANDIDATE

ACFSDATA /dev/sdb1 NORMAL CACHED MEMBER

DATA /dev/sdc1 NORMAL CACHED MEMBER

12 rows selected.

节点2多了很多candidate盘,并且磁盘组userdata没有/dev/sdd1的盘为member状态。

节点1 userdata磁盘组dismount后的状态(该节点正常)

select 'alter diskgroup '||b.name||' drop disk '||path from v$asm_disk a,v$asm_diskgroup b where a.group_number=b.group_number;

SQL> select b.name,path,a.state,a.mount_status,a.header_status from v$asm_disk a,v$asm_diskgroup b where a.group_number=b.group_number

NAME PATH STATE MOUNT_STATUS HEADER_STATUS

-------------------- ------------------------------ ------------------------ --------------------- ------------------------------------

USERDATA /dev/sdd1 NORMAL CLOSED MEMBER

USERDATA /dev/sdf1 NORMAL CLOSED FORMER

ACFSDATA /dev/sdb1 NORMAL CACHED MEMBER

DATA /dev/sdc1 NORMAL CACHED MEMBER

显然两个节点对应的设备不同,从candidate含义可以知道多了备选的ASM磁盘,于是查询rules规则文件

[oracle@rac2 rules.d]$ ls -lrt *asm*

-rw-r--r-- 1 root root 712 Nov 1 13:16 99-my-asmdevices.rules

-rw-r--r-- 1 root root 1684 Apr 20 17:09 99-oracle-asmdevices.rules

发现节点2有两个规则,文件,突然记起来之前测试了一个自动生成rules文件的脚本,脚本内容如下,也就解释了为何出现这个问题。

[oracle@rac2 rules.d]$ cat 99-oracle-asmdevices.rules

KERNEL=="sd*", ENV{DEVTYPE}=="disk", SUBSYSTEM=="block", PROGRAM=="/lib/udev/scsi_id -g -u -d $devnode",

RESULT=="1ATA_VBOX_HARDDISK_VBf0c224c2-df949031", SYMLINK+="asm_sdb", OWNER="grid", GROUP="asmadmin", MODE=

"0660"

KERNEL=="sd*", ENV{DEVTYPE}=="disk", SUBSYSTEM=="block", PROGRAM=="/lib/udev/scsi_id -g -u -d $devnode",

RESULT=="1ATA_VBOX_HARDDISK_VB6696fa7f-9e3b849f", SYMLINK+="asm_sdc", OWNER="grid", GROUP="asmadmin", MODE=

"0660"

KERNEL=="sd*", ENV{DEVTYPE}=="disk", SUBSYSTEM=="block", PROGRAM=="/lib/udev/scsi_id -g -u -d $devnode",

RESULT=="1ATA_VBOX_HARDDISK_VBd23c9420-9c9f27e4", SYMLINK+="asm_sdd", OWNER="grid", GROUP="asmadmin", MODE=

"0660"

KERNEL=="sd*", ENV{DEVTYPE}=="disk", SUBSYSTEM=="block", PROGRAM=="/lib/udev/scsi_id -g -u -d $devnode",

RESULT=="1ATA_VBOX_HARDDISK_VBfd0b0c16-d27f68c4", SYMLINK+="asm_sde", OWNER="grid", GROUP="asmadmin", MODE=

"0660"

KERNEL=="sd*", ENV{DEVTYPE}=="disk", SUBSYSTEM=="block", PROGRAM=="/lib/udev/scsi_id -g -u -d $devnode",

RESULT=="1ATA_VBOX_HARDDISK_VBa7ed9674-7578ffc6", SYMLINK+="asm_sdf", OWNER="grid", GROUP="asmadmin", MODE=

"0660"

KERNEL=="sd*", ENV{DEVTYPE}=="disk", SUBSYSTEM=="block", PROGRAM=="/lib/udev/scsi_id -g -u -d $devnode",

RESULT=="1ATA_VBOX_HARDDISK_VBed0c7e0c-483cfad3", SYMLINK+="asm_sdg", OWNER="grid", GROUP="asmadmin", MODE=

"0660"

KERNEL=="sd*", ENV{DEVTYPE}=="disk", SUBSYSTEM=="block", PROGRAM=="/lib/udev/scsi_id -g -u -d $devnode",

RESULT=="", SYMLINK+="asm_sdh", OWNER="grid", GROUP="asmadmin", MODE=

"0660"

KERNEL=="sd*", ENV{DEVTYPE}=="disk", SUBSYSTEM=="block", PROGRAM=="/lib/udev/scsi_id -g -u -d $devnode",

RESULT=="", SYMLINK+="asm_sdi", OWNER="grid", GROUP="asmadmin", MODE=

"0660"

问题好解决了,删除最后的那个规则文件

再次查询节点2

SQL> l

1* select b.name,path,a.state,a.mount_status,a.header_status from v$asm_disk a,v$asm_diskgroup b where a.group_number=b.group_number

SQL> /

NAME PATH STATE MOUNT_STATUS HEADER_STATUS

------------------------------ -------------------- ------------------------ --------------------- ------------------------------------

ACFSDATA /dev/sdb1 NORMAL CACHED MEMBER

USERDATA /dev/sdd1 NORMAL CACHED MEMBER

DATA /dev/sdc1 NORMAL CACHED MEMBER

下面重启下集群,重新挂载ASM磁盘组

看节点2数据库告警日志

ALTER DATABASE MOUNT /* db agent *//* {2:10468:2} */

2021-04-22T15:41:43.380715+08:00

This instance was first to mount

2021-04-22T15:41:44.382690+08:00

NOTE: ASMB mounting group 3 (USERDATA)

NOTE: ASM background process initiating disk discovery for grp 3 (reqid:0)

NOTE: Assigning number (3,0) to disk (/dev/sdd1)

SUCCESS: mounted group 3 (USERDATA)

NOTE: grp 3 disk 0: USERDATA_0000 path:/dev/sdd1

2021-04-22T15:41:44.719410+08:00

NOTE: dependency between database prod and diskgroup resource ora.USERDATA.dg is established

这里提示成功挂载ASM磁盘

集群状态:

[oracle@rac1 ~]$ crsctl stat res -t

--------------------------------------------------------------------------------

Name Target State Server State details

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.ACFSDATA.dg

ONLINE ONLINE rac1 STABLE

ONLINE ONLINE rac2 STABLE

ora.ASMNET1LSNR_ASM.lsnr

ONLINE ONLINE rac1 STABLE

ONLINE ONLINE rac2 STABLE

ora.DATA.dg

ONLINE ONLINE rac1 STABLE

ONLINE ONLINE rac2 STABLE

ora.LISTENER.lsnr

ONLINE ONLINE rac1 STABLE

ONLINE ONLINE rac2 STABLE

ora.USERDATA.dg

ONLINE ONLINE rac1 STABLE

ONLINE ONLINE rac2 STABLE

ora.net1.network

ONLINE ONLINE rac1 STABLE

ONLINE ONLINE rac2 STABLE

ora.ons

ONLINE ONLINE rac1 STABLE

ONLINE ONLINE rac2 STABLE

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.LISTENER_SCAN1.lsnr

1 ONLINE ONLINE rac1 STABLE

ora.MGMTLSNR

1 OFFLINE OFFLINE STABLE

ora.asm

1 ONLINE ONLINE rac1 Started,STABLE

2 ONLINE ONLINE rac2 Started,STABLE

3 OFFLINE OFFLINE STABLE

ora.cvu

1 ONLINE ONLINE rac1 STABLE

ora.mgmtdb

1 OFFLINE OFFLINE STABLE

ora.prod.db

1 ONLINE ONLINE rac1 Open,HOME=/oracle/db

/base/product/12.2,S

TABLE

2 ONLINE ONLINE rac2 Open,HOME=/oracle/db

/base/product/12.2,S

TABLE

ora.qosmserver

1 ONLINE ONLINE rac1 STABLE

ora.rac1.vip

1 ONLINE ONLINE rac1 STABLE

ora.rac2.vip

1 ONLINE ONLINE rac2 STABLE

ora.scan1.vip

1 ONLINE ONLINE rac1 STABLE

--------------------------------------------------------------------------------

提示:不要随便在运行库测试脚本,造成生成的规则文件覆盖之前的文件,造成一个节点的数据库无法启动。

原文链接:http://github.crmeb.net/u/defu

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
相关产品与服务
数据库
云数据库为企业提供了完善的关系型数据库、非关系型数据库、分析型数据库和数据库生态工具。您可以通过产品选择和组合搭建,轻松实现高可靠、高可用性、高性能等数据库需求。云数据库服务也可大幅减少您的运维工作量,更专注于业务发展,让企业一站式享受数据上云及分布式架构的技术红利!
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档