Recently, while installing Oracle 10g RAC on SUSE 11 SP3 for a customer, I ran /u01/app/crs/root.sh after the clusterware installation finished and got the error "Failed to upgrade Oracle Cluster Registry configuration". The environment uses multipath storage, and according to Oracle's description this is an Oracle bug (4679769). If you have hit the same problem, read on.

I. Symptoms

suse11a:/u01/app/crs # /u01/app/crs/root.sh
WARNING: directory '/u01/app' is not owned by root
Checking to see if Oracle CRS stack is already configured
/etc/oracle does not exist. Creating it now.
Setting the permissions on OCR backup directory
Setting up NS directories
Failed to upgrade Oracle Cluster Registry configuration      # this is the error

# Running clsfmt directly reports "Received unexpected error". Note: /u01/app/crs is ORA_CRS_HOME.
suse11a:/ # /u01/app/crs/bin/clsfmt ocr /dev/raw/raw1
clsfmt: Received unexpected error 4 from skgfifi
skgfifi: Additional information: -2
Additional information: 1073741824

# The detailed error log follows
suse11a:/u01/app/crs/log/suse11a/client # pwd
/u01/app/crs/log/suse11a/client
suse11a:/u01/app/crs/log/suse11a/client # more ocrconfig_24066.log
Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.
2014-08-11 11:52:14.993: [ OCRCONF][2176517888]ocrconfig starts...
2014-08-11 11:52:14.994: [ OCRCONF][2176517888]Upgrading OCR data
2014-08-11 11:52:15.100: [ OCRRAW][2176517888]propriogid:1: INVALID FORMAT
2014-08-11 11:52:15.101: [ OCRRAW][2176517888]ibctx:1:ERROR: INVALID FORMAT
2014-08-11 11:52:15.101: [ OCRRAW][2176517888]proprinit:problem reading the bootblock or superbloc 22
2014-08-11 11:52:15.102: [ default][2176517888]a_init:7!: Backend init unsuccessful : [22]
2014-08-11 11:52:15.102: [ OCRCONF][2176517888]Exporting OCR data to [OCRUPGRADEFILE]
2014-08-11 11:52:15.102: [ OCRAPI][2176517888]a_init:7!: Backend init unsuccessful : [33]
2014-08-11 11:52:15.102: [ OCRCONF][2176517888]There was no previous version of OCR. error:[PROC-33: Oracle Cluster Registry is not configured]
2014-08-11 11:52:15.108: [ OCRRAW][2176517888]propriogid:1: INVALID FORMAT
2014-08-11 11:52:15.108: [ OCRRAW][2176517888]ibctx:1:ERROR: INVALID FORMAT
2014-08-11 11:52:15.108: [ OCRRAW][2176517888]proprinit:problem reading the bootblock or superbloc 22
2014-08-11 11:52:15.108: [ default][2176517888]a_init:7!: Backend init unsuccessful : [22]
2014-08-11 11:52:15.113: [ OCRRAW][2176517888]propriogid:1: INVALID FORMAT
2014-08-11 11:52:15.113: [ OCRRAW][2176517888]ibctx:1:ERROR: INVALID FORMAT
2014-08-11 11:52:15.113: [ OCRRAW][2176517888]proprinit:problem reading the bootblock or superbloc 22
2014-08-11 11:52:15.118: [ OCRRAW][2176517888]propriogid:1: INVALID FORMAT
2014-08-11 11:52:15.126: [ OCRRAW][2176517888]propriowv: Vote information on disk 0 [/dev/raw/raw1] is adjusted from [0/0] to [2/2]
2014-08-11 11:52:15.137: [ OCRRAW][2176517888]propriniconfig:No 92 configuration
2014-08-11 11:52:15.137: [ OCRAPI][2176517888]a_init:6a: Backend init successful
2014-08-11 11:52:15.165: [ OCRCONF][2176517888]Initialized DATABASE keys in OCR
2014-08-11 11:52:15.176: [ OCRCONF][2176517888]csetskgfrblock0: clsfmt returned with error [4].
2014-08-11 11:52:15.176: [ OCRCONF][2176517888]Failure in setting block0 [-1]
2014-08-11 11:52:15.176: [ OCRCONF][2176517888]OCR block 0 is not set !
2014-08-11 11:52:15.176: [ OCRCONF][2176517888]Exiting [status=failed]...
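In the log above, the line that points at this bug is "clsfmt returned with error". As a rough sketch, the client log directory could be scanned for that string to find which ocrconfig runs failed this way. The function name scan_ocrconfig_logs is hypothetical, and the ocrconfig_*.log file pattern is assumed from the single file name shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Oracle's tooling): print every
# ocrconfig_*.log under the given directory that contains the clsfmt
# failure line seen in the log above.
# Returns 0 if at least one such log is found, 1 otherwise.
scan_ocrconfig_logs() {
    log_dir=$1
    found=1
    for f in "$log_dir"/ocrconfig_*.log; do
        # Skip the unexpanded pattern when no file matches.
        [ -f "$f" ] || continue
        if grep -q 'clsfmt returned with error' "$f"; then
            echo "$f"
            found=0
        fi
    done
    return $found
}
```

On the system described here it would be called as scan_ocrconfig_logs /u01/app/crs/log/suse11a/client.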
II. Resolution

# This failure is a bug triggered when multipath is in use, so I followed DocID 466673.1 directly.
# The steps below were performed after downloading patch 4679769.
suse11a:/robin # unzip p4679769_10201_Linux-x86-64.zip      # unpack the patch
Archive: p4679769_10201_Linux-x86-64.zip
creating: 4679769/
inflating: 4679769/README.txt
inflating: 4679769/clsfmt.bin
suse11a:/robin # cp /u01/app/crs/bin/clsfmt.bin /u01/app/crs/bin/clsfmt.bin.bak
suse11a:/robin # cp ./4679769/clsfmt.bin /u01/app/crs/bin/clsfmt.bin      # overwrite the original file (only needed on the node where the installation is run)
suse11a:/robin # chmod 755 /u01/app/crs/bin/clsfmt.bin      # set permissions
suse11a:/robin # /u01/app/crs/bin/clsfmt.bin ocr /dev/raw/raw1      # verification with clsfmt.bin succeeds

# Author : Leshami
# Blog : http://blog.csdn.net/leshami

# Next, wipe the OCR and voting disk devices with dd (each of the two raw devices is currently 1 GB).
# Note: the dd step is mandatory; without it root.sh still fails.
suse11a:~ # dd if=/dev/zero of=/dev/raw/raw1 bs=1024k count=800
800+0 records in
800+0 records out
838860800 bytes (839 MB) copied, 2.64104 s, 318 MB/s
suse11a:~ # dd if=/dev/zero of=/dev/raw/raw2 bs=1024k count=800
800+0 records in
800+0 records out
838860800 bytes (839 MB) copied, 3.21852 s, 261 MB/s

# Verification with clsfmt.bin succeeds again
clsfmt: successfully initialized file /dev/raw/raw1
suse11a:/robin # /u01/app/crs/bin/clsfmt.bin ocr /dev/raw/raw2
clsfmt: successfully initialized file /dev/raw/raw2

# Run root.sh again; this time it succeeds
suse11a:/robin # /u01/app/crs/root.sh

III. DocID 466673.1

APPLIES TO:
Oracle Database - Enterprise Edition - Version 10.2.0.1 and later
Linux x86
IBM: Linux on POWER Systems
Linux x86-64
Linux Itanium
***Checked for relevance on 11-Mar-2013***

SYMPTOMS
On a new clusterware installation on Linux, the root.sh script fails with the following error when run on the first node:

PROT-1: Failed to initialize ocrconfig
Failed to upgrade Oracle Cluster Registry configuration

The problem can be tracked down to the clsfmt command:

./clsfmt ocr /dev/raw/raw1
clsfmt: Received unexpected error 4 from skgfifi
skgfifi: Additional information: -2
Additional information: 1000718336

CHANGES
It has been found that the following changes can cause this problem to occur:
1. Using a Multiple Path (MP) disk configuration may hit this issue.
2. Using an EMC device (powerpath**) may hit this issue.
It was not confirmed that these are the only things that can cause this problem, as it has also been found on other hardware and configurations. The key factor in this issue is that if the size of the disk presented from the storage to the cluster nodes is not divisible by 4K, the problem should occur.

CAUSE
This issue is addressed in Bug:4679769, which states that this is a known issue with the clusterware installation on platforms Linux x86, x86-64 and "IBM Power Based Linux".

SOLUTION
Before running root.sh on the first node in the cluster, do the following:
1. Download Patch:4679769 from Metalink (contains a patched version of clsfmt.bin).
2. Do the following steps as stated in the patch README to fix the problem:
Note: clsfmt.bin need only be replaced on the 1st node of the cluster

# Patch Installation Instructions:
# --------------------------------
# To apply the patch, unzip the PSE container file:
#
# p4679769_10201_LINUX.zip
#
# Set your current directory to the directory where the patch
# is located:
#
# % cd 4679769
#
# Copy the clsfmt.bin binary to the $ORACLE_HOME/bin directory where
# clsfmt is being run:
#
# % cp $ORACLE_HOME/bin/clsfmt.bin $ORACLE_HOME/bin/clsfmt.bin.bak
# % cp clsfmt.bin $ORACLE_HOME/bin/clsfmt.bin
#
# Ensure permissions on the clsfmt.bin binary are correct:
#
# % chmod 755 $ORACLE_HOME/bin/clsfmt.bin

3. Run the root.sh script and proceed with the installation.
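
The README steps above (back up, copy, chmod) can be wrapped in a small function so that running them twice does not overwrite the original backup with the already patched binary. This is only a sketch under assumptions: replace_clsfmt is a hypothetical name, and the two arguments are whatever CRS home and unzipped patch directory apply on your system.

```shell
#!/bin/sh
# Hypothetical helper (not from the patch README): swap in the patched
# clsfmt.bin while keeping exactly one backup of the original.
#   $1 = CRS home, e.g. /u01/app/crs
#   $2 = directory holding the unzipped patch, e.g. ./4679769
replace_clsfmt() {
    crs_home=$1
    patch_dir=$2
    target="$crs_home/bin/clsfmt.bin"
    patched="$patch_dir/clsfmt.bin"

    [ -f "$target" ]  || { echo "missing $target" >&2;  return 1; }
    [ -f "$patched" ] || { echo "missing $patched" >&2; return 1; }

    # Keep the first backup only: a second run must not replace the
    # original binary in .bak with the patched one.
    if [ ! -e "$target.bak" ]; then
        cp -p "$target" "$target.bak" || return 1
    fi

    cp "$patched" "$target" || return 1
    chmod 755 "$target"
}
```

With the paths used in this post, the call would be replace_clsfmt /u01/app/crs /robin/4679769, run only on the first node, before root.sh.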