Today I ran an experiment on Oracle 12c RAC: deliberately corrupting the OCR and the voting disk, then recovering them. The procedure is essentially the same as on 11g RAC. The steps are shared below.
Test environment: 2-node Oracle Database 12c RAC on Linux 6 (OEL 6.4)
Check the voting disk and OCR information
[root@12crac1 ~]# cd /u01/app/12.1.0/grid/bin/
[root@12crac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   d883c23a7bfc4fdcbf418c9f631bd0af (/dev/asm-crs) [RACCRS]
Located 1 voting disk(s).
[root@12crac1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          4
         Total space (kbytes)     :     409568
         Used space (kbytes)      :       1608
         Available space (kbytes) :     407960
         ID                       : 1658916461
         Device/File Name         :    +RACCRS
                                    Device/File integrity check succeeded
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
                                    Device/File not configured
         Cluster registry integrity check succeeded
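The voting disk check can be scripted so that it is easy to repeat before and after the recovery. The following is only a minimal sketch: the function name count_online_votedisks is made up for illustration, and it assumes the row layout shown in the output above (row number, STATE, File Universal Id, file name, disk group).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Oracle's tooling): count the voting files
# reported as ONLINE. It reads `crsctl query css votedisk` output on stdin,
# so it can be tried out without a running cluster.
count_online_votedisks() {
    # Data rows look like: " 1. ONLINE <file universal id> (<path>) [<disk group>]"
    awk '$1 ~ /^[0-9]+\.$/ && $2 == "ONLINE" { n++ } END { print n + 0 }'
}

# Usage on a cluster node (path taken from this environment):
#   /u01/app/12.1.0/grid/bin/crsctl query css votedisk | count_online_votedisks
```

A result of 0 corresponds to the "Located 0 voting disk(s)." situation that appears later in this experiment.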
Check the current OCR backups
[root@12crac1 bin]# ./ocrconfig -showbackup
12crac1     2013/07/02 23:21:22     /u01/app/12.1.0/grid/cdata/scan12c/backup00.ocr
12crac1     2013/07/02 19:21:21     /u01/app/12.1.0/grid/cdata/scan12c/backup01.ocr
12crac1     2013/07/01 04:52:41     /u01/app/12.1.0/grid/cdata/scan12c/backup02.ocr
12crac1     2013/07/01 04:52:41     /u01/app/12.1.0/grid/cdata/scan12c/day.ocr
12crac1     2013/07/01 04:52:41     /u01/app/12.1.0/grid/cdata/scan12c/week.ocr
12crac1     2013/07/01 00:48:56     /u01/app/12.1.0/grid/cdata/scan12c/backup_20130701_004856.ocr
12crac1     2013/07/01 00:39:40     /u01/app/12.1.0/grid/cdata/scan12c/backup_20130701_003940.ocr
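When a restore is needed, the most recent backup is usually the one of interest. As a rough sketch, the listing above can be reduced to the newest backup path. The function name latest_ocr_backup is hypothetical, and the code assumes each row has exactly the shape "node date time path" as in the output above; other releases may print additional columns.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the path of the newest backup found in
# `ocrconfig -showbackup` output read from stdin.
latest_ocr_backup() {
    # Keep rows shaped like "<node> <yyyy/mm/dd> <hh:mm:ss> <path>".
    # Date and time in this format sort correctly as plain strings,
    # so a reverse byte-wise sort puts the newest backup first.
    awk 'NF >= 4 && $2 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ { print $2, $3, $4 }' |
        LC_ALL=C sort -r |
        head -n 1 |
        awk '{ print $3 }'
}

# Usage on a cluster node:
#   /u01/app/12.1.0/grid/bin/ocrconfig -showbackup | latest_ocr_backup
```

With the listing above this would select backup00.ocr, the file later passed to ocrconfig -restore.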
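A manual backup can be taken as follows. Note that with the -local option ocrconfig backs up the OLR (Oracle Local Registry), as the .olr file names in the output show; a manual backup of the OCR itself uses ocrconfig -manualbackup without -local.
[root@12crac1 bin]# ./ocrconfig -local -manualbackup
12crac1     2013/07/21 17:55:10     /u01/app/12.1.0/grid/cdata/12crac1/backup_20130721_175510.olr
12crac1     2013/07/01 00:39:39     /u01/app/12.1.0/grid/cdata/12crac1/backup_20130701_003939.olr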
Check the status of the RAC resources
[grid@12crac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
ora.RACCRS.dg
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
ora.RACDATA.dg
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
ora.RACFRA.dg
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
ora.asm
               ONLINE  ONLINE       12crac1                  Started,STABLE
               ONLINE  ONLINE       12crac2                  Started,STABLE
ora.net1.network
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
ora.ons
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.12crac1.vip
      1        ONLINE  ONLINE       12crac1                  STABLE
ora.12crac2.vip
      1        ONLINE  ONLINE       12crac2                  STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       12crac2                  STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       12crac1                  STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       12crac1                  STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       12crac1                  169.254.88.173 192.168.80.150,STABLE
ora.cvu
      1        ONLINE  ONLINE       12crac1                  STABLE
ora.luocs12c.db
      1        ONLINE  ONLINE       12crac2                  Open,STABLE
      2        ONLINE  ONLINE       12crac1                  Open,STABLE
ora.mgmtdb
      1        ONLINE  ONLINE       12crac1                  Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       12crac1                  STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       12crac2                  STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       12crac1                  STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       12crac1                  STABLE
--------------------------------------------------------------------------------
Back up the disk group metadata with the ASMCMD md_backup command, and along the way see what the disk group contains.
[root@12crac2 ~]# su - grid
[grid@12crac2 ~]$ asmcmd -p
ASMCMD [+] > md_backup /home/grid/ocrvote.bak -G RACCRS
Disk group metadata to be backed up: RACCRS
Current alias directory path: scan12c
Current alias directory path: ASM
Current alias directory path: _MGMTDB/CONTROLFILE
Current alias directory path: _MGMTDB/TEMPFILE
Current alias directory path: ASM/PASSWORD
Current alias directory path: _MGMTDB/ONLINELOG
Current alias directory path: _MGMTDB
Current alias directory path: scan12c/OCRFILE
Current alias directory path: _MGMTDB/DATAFILE
Current alias directory path: scan12c/ASMPARAMETERFILE
Current alias directory path: _MGMTDB/PARAMETERFILE
-- As this shows, in Oracle 12c RAC the disk group that holds the OCR contains quite a few more files: the _MGMTDB files and the ASM password file.
For comparison, here is the content of the disk group that holds the OCR in 11g RAC:
ASMCMD [+] > md_backup /home/grid/ocrvote.bak -G hk_crs
Disk group metadata to be backed up: HK_CRS
Current alias directory path: racscan/OCRFILE
Current alias directory path: racscan
Current alias directory path: racscan/ASMPARAMETERFILE
The OCR contents can also be exported
[root@12crac1 bin]# ./ocrconfig -export /home/grid/ocr.bak
None of the following attempts can delete the OCR file that is currently in use
ASMCMD [+] > rm -rf /raccrs/scan12c/ocrfile
ORA-29261: bad argument
ORA-15178: directory 'ocrfile' is not empty; cannot drop this directory
ORA-15028: ASM file '+RACCRS.255.819592481' not dropped; currently being accessed
ORA-06512: at line 4 (DBD ERROR: OCIStmtExecute)
ASMCMD [+] > cd /raccrs/scan12c/ocrfile
ASMCMD [+raccrs/scan12c/ocrfile] > ls
REGISTRY.255.819592481
ASMCMD [+raccrs/scan12c/ocrfile] > rm -rf REGISTRY.255.819592481
ORA-15032: not all alterations performed
ORA-15028: ASM file '+raccrs/scan12c/ocrfile/REGISTRY.255.819592481' not dropped; currently being accessed (DBD ERROR: OCIStmtExecute)
So instead we corrupt the device that holds the OCR by overwriting its first 1 MB with zeros
[root@12crac1 bin]# dd if=/dev/zero of=/dev/sdg bs=1024k count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00335689 s, 312 MB/s
Then stop the cluster on both nodes:
[root@12crac1 bin]# ./crsctl stop has
[root@12crac2 bin]# ./crsctl stop has -f
crsctl stop crs [-f] works as well.
Try to start Clusterware; it does not come up properly
[root@12crac1 bin]# ./crsctl start has
CRS-4123: Oracle High Availability Services has been started.
[grid@12crac1 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
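The difference between a healthy and a broken stack in crsctl check crs output comes down to which message codes appear. The sketch below turns that into an exit status. The function name crs_stack_healthy is hypothetical, and the four "online" codes are simply the ones seen in this article's healthy output (CRS-4638, CRS-4537, CRS-4529, CRS-4533); treat that list as an assumption for other releases.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if `crsctl check crs` output (read from
# stdin) contains all four "is online" message codes seen in this article.
crs_stack_healthy() {
    local out code
    out=$(cat)
    for code in CRS-4638 CRS-4537 CRS-4529 CRS-4533; do
        case "$out" in
            *"$code"*) ;;        # this service reported online
            *) return 1 ;;       # code missing: that service is not online
        esac
    done
    return 0
}

# Usage on a cluster node:
#   if crsctl check crs | crs_stack_healthy; then echo "stack is up"; fi
```

For the output above the function fails, because only CRS-4638 (High Availability Services) is present.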
Check the cluster alert log:
2013-07-20 23:46:01.413: [ohasd(18692)]CRS-0714:Oracle Clusterware Release 12.1.0.1.0 - Production Copyright 1996, 2010 Oracle. All rights reserved.
2013-07-20 23:46:01.451: [ohasd(18692)]CRS-2112:The OLR service started on node 12crac1.
2013-07-20 23:46:01.494: [ohasd(18692)]CRS-1301:Oracle High Availability Service started on node 12crac1.
2013-07-20 23:46:01.498: [ohasd(18692)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2013-07-20 23:46:10.768: [gpnpd(19041)]CRS-2328:GPNPD started on node 12crac1.
2013-07-20 23:46:42.712: [cssd(19212)]CRS-1713:CSSD daemon is started in hub mode
2013-07-20 23:46:43.221: [cssd(19212)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/12.1.0/grid/log/12crac1/cssd/ocssd.log
2013-07-20 23:46:44.142: [ohasd(18692)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2013-07-20 23:46:44.143: [ohasd(18692)]CRS-2769:Unable to failover resource 'ora.diskmon'.
2013-07-20 23:46:58.280: [cssd(19212)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/12.1.0/grid/log/12crac1/cssd/ocssd.log
Check the /u01/app/12.1.0/grid/log/12crac1/cssd/ocssd.log log
2013-07-20 23:48:13.450: [ GPNP][1105622784]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2207] get-profile call to url "ipc://GPNPD_12crac1" disco "" [f=0 claimed- host: cname: cguid: cli:gpnp p:19212 role: seq: ep: auth: diag:[]]
2013-07-20 23:48:13.476: [ GPNP][1105622784]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2360] Result: (0) CLSGPNP_OK. Successful get-profile CALL to remote "ipc://GPNPD_12crac1" disco ""
2013-07-20 23:48:13.477: [ CSSD][1105622784]clssnmReadDiscoveryProfile: voting file discovery string(/dev/asm*)
2013-07-20 23:48:13.478: [ CSSD][1105622784]clssnmvDDiscThread: using discovery string /dev/asm* for initial discovery
2013-07-20 23:48:13.478: [ SKGFD][1105622784]Discovery with str:/dev/asm*:
2013-07-20 23:48:13.478: [ SKGFD][1105622784]UFS discovery with :/dev/asm*:
2013-07-20 23:48:13.491: [ SKGFD][1105622784]Fetching UFS disk :/dev/asm-data:
2013-07-20 23:48:13.491: [ SKGFD][1105622784]Fetching UFS disk :/dev/asm-fra:
2013-07-20 23:48:13.492: [ SKGFD][1105622784]Fetching UFS disk :/dev/asm-crs:
2013-07-20 23:48:13.492: [ SKGFD][1105622784]Fetching UFS disk :/dev/asm-extcrs:
2013-07-20 23:48:13.492: [ SKGFD][1105622784]Fetching UFS disk :/dev/asm:
2013-07-20 23:48:13.492: [ SKGFD][1105622784]OSS discovery with :/dev/asm*:
2013-07-20 23:48:13.495: [ SKGFD][1105622784]Handle 0x7f8c10170500 from lib :UFS:: for disk :/dev/asm-data:
2013-07-20 23:48:13.498: [ SKGFD][1105622784]Handle 0x7f8c1016e8a0 from lib :UFS:: for disk :/dev/asm-fra:
2013-07-20 23:48:13.500: [ SKGFD][1105622784]Handle 0x7f8c1016f0d0 from lib :UFS:: for disk :/dev/asm-crs:
2013-07-20 23:48:13.501: [ SKGFD][1105622784]Handle 0x7f8c1011c4e0 from lib :UFS:: for disk :/dev/asm-extcrs:
2013-07-20 23:48:13.501: [ SKGFD][1105622784]Lib :UFS:: closing handle 0x7f8c10170500 for disk :/dev/asm-data:
2013-07-20 23:48:13.501: [ SKGFD][1105622784]Lib :UFS:: closing handle 0x7f8c1016e8a0 for disk :/dev/asm-fra:
2013-07-20 23:48:13.501: [ SKGFD][1105622784]Lib :UFS:: closing handle 0x7f8c1016f0d0 for disk :/dev/asm-crs:
2013-07-20 23:48:13.502: [ SKGFD][1105622784]Lib :UFS:: closing handle 0x7f8c1011c4e0 for disk :/dev/asm-extcrs:
2013-07-20 23:48:13.503: [ CSSD][1105622784]clssnmvDiskVerify: Successful discovery of 0 disks
2013-07-20 23:48:13.503: [ CSSD][1105622784]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2013-07-20 23:48:13.503: [ CSSD][1105622784]clssnmvFindInitialConfigs: No voting files found
2013-07-20 23:48:13.503: [ CSSD][1105622784](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds
-- The log shows that CSSD scans every device matching the discovery string /dev/asm*, opens each one, and still discovers 0 voting files.
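The two facts that matter in that trace, the discovery string and the number of voting files found, can be pulled out of ocssd.log with sed. This is only a sketch: the function names are hypothetical, and the patterns assume the exact message wording shown in the excerpt above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reading an ocssd.log file passed as $1.

# Print the most recent voting file discovery string recorded in the log.
discovery_string() {
    sed -n 's/.*voting file discovery string(\(.*\)).*/\1/p' "$1" | tail -n 1
}

# Print the disk count from the most recent "Successful discovery of N disks" line.
discovered_disk_count() {
    sed -n 's/.*Successful discovery of \([0-9][0-9]*\) disks.*/\1/p' "$1" | tail -n 1
}

# Usage with the log path from this environment:
#   log=/u01/app/12.1.0/grid/log/12crac1/cssd/ocssd.log
#   echo "string=$(discovery_string "$log") disks=$(discovered_disk_count "$log")"
```

A disk count of 0 together with a correct discovery string suggests the devices are visible but no longer carry a valid voting file header, which matches the dd overwrite performed earlier.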
Next, shut the cluster down and attempt the recovery.
[root@12crac1 bin]# ./crsctl stop has -f
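The recovery procedure for a corrupted OCR and voting disk is roughly as follows:
1) Stop Clusterware on all nodes (use the -f form if the normal stop fails)
# crsctl stop crs
# crsctl stop crs -f
2) As root, start Clusterware in exclusive mode on one node
# crsctl start crs -excl -nocrs
Note: if crsd turns out to be running, stop it with the following command.
# crsctl stop resource ora.crsd -init
3) Create a new disk group for the OCR and voting disk, with the same name as the original one (to use a different location, edit the /etc/oracle/ocr.loc file)
Note: if the disk group cannot be created, one troubleshooting approach is to drop the old disk group first
SQL> drop diskgroup disk_group_name force including contents;
4) Restore the OCR and check it
# ocrconfig -restore file_name
# ocrcheck
5) Restore the voting disk and check it
# crsctl replace votedisk +asm_disk_group
# crsctl query css votedisk
6) Stop the Clusterware stack that is running in exclusive mode
# crsctl stop crs -f
7) Start Clusterware normally on all nodes
# crsctl start crs
8) Use CVU to verify OCR integrity on all RAC nodes
$ cluvfy comp ocr -n all -verbose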
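Steps 4 to 6 above run on a single node and in a fixed order, so they can be collected into a small wrapper. The following is a cautious sketch, not a tested recovery tool: the names run and restore_ocr_and_votedisk and the GRID_BIN and DRY_RUN variables are made up for illustration, the commands are exactly the ones listed in the steps, and by default nothing is executed, the commands are only printed. It assumes the stack is already up in exclusive mode and the disk group has been recreated.

```shell
#!/usr/bin/env bash
# Sketch of steps 4-6. Hypothetical names throughout.
GRID_BIN=${GRID_BIN:-/u01/app/12.1.0/grid/bin}   # Grid home bin dir from this article
DRY_RUN=${DRY_RUN:-1}                            # 1 = only print the commands

# Print a command, and execute it unless DRY_RUN is 1.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# $1 = OCR backup file, $2 = ASM disk group name without the leading '+'
restore_ocr_and_votedisk() {
    local backup_file=$1 disk_group=$2
    run "$GRID_BIN/ocrconfig" -restore "$backup_file"       || return 1  # step 4
    run "$GRID_BIN/ocrcheck"                                || return 1
    run "$GRID_BIN/crsctl" replace votedisk "+$disk_group"  || return 1  # step 5
    run "$GRID_BIN/crsctl" query css votedisk               || return 1
    run "$GRID_BIN/crsctl" stop crs -f                      || return 1  # step 6
}

# Dry-run usage (prints five commands, changes nothing):
#   restore_ocr_and_votedisk /u01/app/12.1.0/grid/cdata/scan12c/backup00.ocr RACCRS
```

Stopping at the first failing command is deliberate: as the demonstration below shows, crsctl replace votedisk can fail and need manual intervention before the remaining steps make sense.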
Now for the actual demonstration. Start Clusterware in exclusive mode
[root@12crac1 bin]# ./crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2673: Attempting to stop 'ora.drivers.acfs' on '12crac2'
CRS-2677: Stop of 'ora.drivers.acfs' on '12crac2' succeeded
CRS-2672: Attempting to start 'ora.evmd' on '12crac2'
CRS-2672: Attempting to start 'ora.mdnsd' on '12crac2'
CRS-2676: Start of 'ora.evmd' on '12crac2' succeeded
CRS-2676: Start of 'ora.mdnsd' on '12crac2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on '12crac2'
CRS-2676: Start of 'ora.gpnpd' on '12crac2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on '12crac2'
CRS-2672: Attempting to start 'ora.gipcd' on '12crac2'
CRS-2676: Start of 'ora.cssdmonitor' on '12crac2' succeeded
CRS-2676: Start of 'ora.gipcd' on '12crac2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on '12crac2'
CRS-2672: Attempting to start 'ora.diskmon' on '12crac2'
CRS-2676: Start of 'ora.diskmon' on '12crac2' succeeded
CRS-2676: Start of 'ora.cssd' on '12crac2' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on '12crac2'
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on '12crac2'
CRS-2672: Attempting to start 'ora.ctssd' on '12crac2'
CRS-2676: Start of 'ora.drivers.acfs' on '12crac2' succeeded
CRS-2676: Start of 'ora.ctssd' on '12crac2' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on '12crac2' succeeded
CRS-2672: Attempting to start 'ora.asm' on '12crac2'
CRS-2676: Start of 'ora.asm' on '12crac2' succeeded
Log in to SQL*Plus as the grid user and create the ASM disk group
[grid@12crac2 ~]$ sqlplus / as sysasm
SQL*Plus: Release 12.1.0.1.0 Production on Sun Jul 21 00:11:46 2013
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> create diskgroup raccrs external redundancy disk '/dev/asm-extcrs' attribute 'compatible.asm' = '12.1.0.0.0';
Diskgroup created.
Restore the OCR with ocrconfig, either from the export
[root@12crac1 bin]# ./ocrconfig -import /home/grid/ocr.bak
or from a backup
[root@12crac1 bin]# ./ocrconfig -restore /u01/app/12.1.0/grid/cdata/scan12c/backup00.ocr
Check the voting disk; at this point none can be found
[root@12crac1 bin]# ./crsctl query css votedisk
Located 0 voting disk(s).
Restore the voting disk; the following problem may occur
[root@12crac1 bin]# ./crsctl replace votedisk +RACCRS
CRS-4602: Failed 27 to add voting file 3782393479bf4f07bf313dc5a8f4c58a.
Failed to replace voting disk group with +RACCRS.
CRS-4000: Command Replace failed, or completed with errors.
This is resolved by setting the ASM asm_diskstring parameter again and restarting ASM.
[grid@12crac1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 12.1.0.1.0 Production on Sun Jul 21 00:40:01 2013
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter system set asm_diskstring='/dev/asm*';
System altered.
SQL> create spfile from memory;
File created.
SQL> startup force mount;
ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance
ASM instance started
Total System Global Area 1135747072 bytes
Fixed Size                  2297344 bytes
Variable Size            1108283904 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted
ASM diskgroups volume enabled
Restore the voting disk again
[root@12crac1 bin]# ./crsctl replace votedisk +RACCRS
Successful addition of voting disk 1499cddff03a4f86bf01599718febcb1.
Successfully replaced voting disk group with +RACCRS.
CRS-4266: Voting file(s) successfully replaced
[root@12crac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   1499cddff03a4f86bf01599718febcb1 (/dev/asm-extcrs) [RACCRS]
Located 1 voting disk(s).
Leave exclusive mode:
[root@12crac1 bin]# ./crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on '12crac1'
CRS-2673: Attempting to stop 'ora.ctssd' on '12crac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on '12crac1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on '12crac1'
CRS-2673: Attempting to stop 'ora.gpnpd' on '12crac1'
CRS-2677: Stop of 'ora.drivers.acfs' on '12crac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on '12crac1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on '12crac1' succeeded
CRS-2677: Stop of 'ora.ctssd' on '12crac1' succeeded
CRS-2673: Attempting to stop 'ora.evmd' on '12crac1'
CRS-2673: Attempting to stop 'ora.asm' on '12crac1'
CRS-2677: Stop of 'ora.evmd' on '12crac1' succeeded
CRS-2677: Stop of 'ora.asm' on '12crac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on '12crac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on '12crac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on '12crac1'
CRS-2677: Stop of 'ora.cssd' on '12crac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on '12crac1'
CRS-2677: Stop of 'ora.gipcd' on '12crac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on '12crac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
Start all nodes normally:
[root@12crac1 bin]# ./crsctl start has
CRS-4123: Oracle High Availability Services has been started.
[root@12crac2 bin]# ./crsctl start has
CRS-4123: Oracle High Availability Services has been started.
Check the Clusterware status
[grid@12crac1 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[grid@12crac2 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
Check the status of all resources
[grid@12crac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
ora.RACCRS.dg
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
ora.RACDATA.dg
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
ora.RACFRA.dg
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
ora.asm
               ONLINE  ONLINE       12crac1                  Started,STABLE
               ONLINE  ONLINE       12crac2                  Started,STABLE
ora.net1.network
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
ora.ons
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.12crac1.vip
      1        ONLINE  ONLINE       12crac1                  STABLE
ora.12crac2.vip
      1        ONLINE  ONLINE       12crac2                  STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       12crac2                  STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       12crac1                  STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       12crac1                  STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       12crac2                  169.254.171.71 192.168.80.154,STABLE
ora.cvu
      1        ONLINE  ONLINE       12crac1                  STABLE
ora.luocs12c.db
      1        ONLINE  ONLINE       12crac2                  Open,STABLE
      2        ONLINE  ONLINE       12crac1                  Open,STABLE
ora.mgmtdb
      1        ONLINE  OFFLINE                               Instance Shutdown,STABLE
ora.oc4j
      1        ONLINE  ONLINE       12crac1                  STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       12crac2                  STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       12crac1                  STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       12crac1                  STABLE
--------------------------------------------------------------------------------
Here we find that mgmtdb did not start (target ONLINE, state OFFLINE). Starting it manually fails as well.
[grid@12crac1 ~]$ srvctl start mgmtdb
PRCR-1079 : Failed to start resource ora.mgmtdb
CRS-5017: The resource action "ora.mgmtdb start" encountered the following error:
ORA-01078: failure in processing system parameters
LRM-00109: could not open parameter file '/u01/app/12.1.0/grid/dbs/init-MGMTDB.ora'
. For details refer to "(:CLSN00107:)" in "/u01/app/12.1.0/grid/log/12crac2/agent/crsd/oraagent_grid/oraagent_grid.log".
CRS-2674: Start of 'ora.mgmtdb' on '12crac2' failed
CRS-5017: The resource action "ora.mgmtdb start" encountered the following error:
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+RACCRS/_mgmtdb/spfile-MGMTDB.ora'
ORA-17503: ksfdopn:2 Failed to open file +RACCRS/_mgmtdb/spfile-MGMTDB.ora
ORA-15056: additional error message
ORA-17503: ksfdopn:2 Failed to open file +RACCRS/_mgmtdb/spfile-mgmtdb.ora
ORA-15173: entry '_mgmtdb' does not exist in directory '/'
ORA-06512: at line 4
. For details refer to "(:CLSN00107:)" in "/u01/app/12.1.0/grid/log/12crac1/agent/crsd/oraagent_grid/oraagent_grid.log".
CRS-2674: Start of 'ora.mgmtdb' on '12crac1' failed
CRS-2632: There are no more servers to try to place resource 'ora.mgmtdb' on that would satisfy its placement policy
-- The cause is that the _MGMTDB files lived in the same ASM disk group as the OCR, so they were lost along with it when the disk was overwritten with dd and the disk group was recreated. The error messages confirm that the parameter file cannot be found.
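A resource whose target is ONLINE while its state is OFFLINE, as with ora.mgmtdb here, is easy to miss in a long tabular listing. The sketch below looks for such resources in the non-tabular output of crsctl stat res. It rests on an assumption not shown in this article: that crsctl stat res without -t prints one NAME=, TYPE=, TARGET=, STATE= block per resource. The function name offline_but_wanted is hypothetical, and the match is deliberately coarse.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list resources whose TARGET mentions ONLINE while
# their STATE mentions OFFLINE. Reads key=value blocks on stdin, assumed
# to be in the format printed by `crsctl stat res` (without -t).
offline_but_wanted() {
    awk -F= '
        $1 == "NAME"   { name = $2 }
        $1 == "TARGET" { target = $2 }
        # STATE is the last key of a block, so name and target are already set.
        $1 == "STATE"  { if (target ~ /ONLINE/ && $2 ~ /OFFLINE/) print name }
    '
}

# Usage on a cluster node:
#   crsctl stat res | offline_but_wanted
```

Because the comparison is per resource rather than per instance, a resource that is intentionally offline on one node may also be reported; the output is a starting point for inspection, not a verdict.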
Check the mgmtdb configuration
[grid@12crac1 ~]$ srvctl config mgmtdb -all -verbose
Database unique name: _mgmtdb
Database name:
Oracle home: /u01/app/12.1.0/grid
Oracle user: grid
Spfile: +RACCRS/_mgmtdb/spfile-MGMTDB.ora
Password file:
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Database instance: -MGMTDB
Type: Management
Database is enabled
I do not yet know how to repair mgmtdb, so for now I removed the resource
[grid@12crac1 ~]$ srvctl remove mgmtdb
Remove the database _mgmtdb? (y/[n]) y
[grid@12crac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
ora.RACCRS.dg
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
ora.RACDATA.dg
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
ora.RACFRA.dg
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
ora.asm
               ONLINE  ONLINE       12crac1                  Started,STABLE
               ONLINE  ONLINE       12crac2                  Started,STABLE
ora.net1.network
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
ora.ons
               ONLINE  ONLINE       12crac1                  STABLE
               ONLINE  ONLINE       12crac2                  STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.12crac1.vip
      1        ONLINE  ONLINE       12crac1                  STABLE
ora.12crac2.vip
      1        ONLINE  ONLINE       12crac2                  STABLE
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       12crac2                  STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       12crac2                  STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       12crac2                  STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE       12crac2                  169.254.171.71 192.168.80.154,STABLE
ora.cvu
      1        ONLINE  ONLINE       12crac1                  STABLE
ora.luocs12c.db
      1        ONLINE  ONLINE       12crac2                  Open,STABLE
      2        ONLINE  ONLINE       12crac1                  Open,STABLE
ora.oc4j
      1        ONLINE  ONLINE       12crac1                  STABLE
ora.scan1.vip
      1        ONLINE  ONLINE       12crac2                  STABLE
ora.scan2.vip
      1        ONLINE  ONLINE       12crac2                  STABLE
ora.scan3.vip
      1        ONLINE  ONLINE       12crac2                  STABLE
--------------------------------------------------------------------------------
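How to back up, restore, and repair mgmtdb still needs further study; this experiment ends here.
Original article: "Oracle Database 12c RAC: Recovering from a Corrupted OCR and Voting Disk". Thanks to the original author for sharing.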