
Resolving HACMP volume groups stuck in non-concurrent mode in a RAC environment

WBOY (original post)
2016-06-07 17:26:32

In a RAC environment, having the HACMP volume groups (VGs) in non-concurrent mode is a serious problem, and that is exactly what happened on a production system.

Environment

AIX 6.1, EMC CLARiiON storage, Oracle 10.2.0.5

Problem:

   First, look at the state of each volume group

data03vg

lsvg data03vg
VOLUME GROUP:       data03vg                 VG IDENTIFIER:   00f79d1100004c00000001386f00edfb
VG STATE:           active                   PP SIZE:         128 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:       5315 (680320 megabytes)
MAX LVs:            512                      FREE PPs:        711 (91008 megabytes)
LVs:                78                       USED PPs:        4604 (589312 megabytes)
OPEN LVs:           0                        QUORUM:          3 (Enabled)
TOTAL PVs:          5                        VG DESCRIPTORS:  5
STALE PVs:          0                        STALE PPs:       0
ACTIVE PVs:         5                        AUTO ON:         no
Concurrent:         Enhanced-Capable         Auto-Concurrent: Disabled
VG Mode:            Non-Concurrent
MAX PPs per VG:     130048
MAX PPs per PV:     2032                     MAX PVs:         64
LTG size (Dynamic): 1024 kilobyte(s)         AUTO SYNC:       no
HOT SPARE:          no                       BB POLICY:       relocatable
PV RESTRICTION:     none                     INFINITE RETRY:  no

data01vg

lsvg data01vg
VOLUME GROUP:       data01vg                 VG IDENTIFIER:   00f79d1100004c00000001386effcc48
VG STATE:           active                   PP SIZE:         128 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:       6378 (816384 megabytes)
MAX LVs:            512                      FREE PPs:        1146 (146688 megabytes)
LVs:                88                       USED PPs:        5232 (669696 megabytes)
OPEN LVs:           0                        QUORUM:          4 (Enabled)
TOTAL PVs:          6                        VG DESCRIPTORS:  6
STALE PVs:          0                        STALE PPs:       0
ACTIVE PVs:         6                        AUTO ON:         no
Concurrent:         Enhanced-Capable         Auto-Concurrent: Disabled
VG Mode:            Non-Concurrent
MAX PPs per VG:     130048
MAX PPs per PV:     2032                     MAX PVs:         64
LTG size (Dynamic): 1024 kilobyte(s)         AUTO SYNC:       no
HOT SPARE:          no                       BB POLICY:       relocatable
PV RESTRICTION:     none                     INFINITE RETRY:  no

data02vg

lsvg data02vg
VOLUME GROUP:       data02vg                 VG IDENTIFIER:   00f79d1100004c00000001386f007c90
VG STATE:           active                   PP SIZE:         128 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:       2126 (272128 megabytes)
MAX LVs:            512                      FREE PPs:        18 (2304 megabytes)
LVs:                39                       USED PPs:        2108 (269824 megabytes)
OPEN LVs:           0                        QUORUM:          2 (Enabled)
TOTAL PVs:          2                        VG DESCRIPTORS:  3
STALE PVs:          0                        STALE PPs:       0
ACTIVE PVs:         2                        AUTO ON:         no
Concurrent:         Enhanced-Capable         Auto-Concurrent: Disabled
VG Mode:            Non-Concurrent
MAX PPs per VG:     130048
MAX PPs per PV:     2032                     MAX PVs:         64
LTG size (Dynamic): 1024 kilobyte(s)         AUTO SYNC:       no
HOT SPARE:          no                       BB POLICY:       relocatable
PV RESTRICTION:     none                     INFINITE RETRY:  no
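The field that matters in each listing is "VG Mode", which reads Non-Concurrent for all three volume groups. As a rough sketch (not part of the original procedure), a small helper could pull that one field out of `lsvg` output so the check can be repeated across VGs. The function name `vg_mode` is hypothetical, and the parsing assumes the label appears exactly as "VG Mode:" as in the listings here.

```shell
#!/bin/sh
# vg_mode (hypothetical helper): read `lsvg <vg>` output on stdin and
# print the first word that follows the "VG Mode:" label.
vg_mode() {
    awk '/VG Mode:/ {
        sub(/.*VG Mode:[ \t]*/, "")   # drop everything up to and including the label
        print $1                      # fields are re-split after $0 changes
        exit
    }'
}

# Intended use on an AIX cluster node (assumption: lsvg is available):
#   for vg in data01vg data02vg data03vg; do
#       printf '%s: %s\n' "$vg" "$(lsvg "$vg" | vg_mode)"
#   done
```

The helper only reads text from standard input, so it can be tried against saved `lsvg` output before being run on a live node.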

State of the physical volumes (PVs) in each VG:

data03vg

lsvg -p data03vg
data03vg:
PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdiskpower11      active            1063        43          01..00..00..00..42
hdiskpower17      removed           1063        167         21..00..00..00..146
hdiskpower18      removed           1063        167         21..00..00..00..146
hdiskpower19      removed           1063        167         21..00..00..00..146
hdiskpower20      removed           1063        167         21..00..00..00..146

data01vg

lsvg -p data01vg
data01vg:
PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdiskpower7       active            1063        0           00..00..00..00..00
hdiskpower8       active            1063        20          00..00..00..00..20
hdiskpower9       active            1063        24          02..00..00..00..22
hdiskpower10      active            1063        0           00..00..00..00..00
hdiskpower16      missing           1063        551         21..00..105..212..213
hdiskpower21      missing           1063        551

data02vg

lsvg -p data02vg
data02vg:
PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdiskpower0       active            1063        0           00..00..00..00..00
hdiskpower24      active            1063        18          00..00..00..00..18
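Scanning these listings by eye for PVs that are not active is error-prone when there are many disks. A minimal sketch of a filter for `lsvg -p` output is given here. The name `inactive_pvs` is hypothetical, and the filter assumes the two-line preamble (VG name, then column headers) seen in the listings.

```shell
#!/bin/sh
# inactive_pvs (hypothetical helper): read `lsvg -p <vg>` output on stdin
# and print "<pv_name> <state>" for every PV whose state is not "active".
inactive_pvs() {
    # NR > 2 skips the "<vg>:" line and the column header line;
    # NF >= 2 skips blank or malformed lines.
    awk 'NR > 2 && NF >= 2 && $2 != "active" { print $1, $2 }'
}

# Intended use on an AIX node (assumption: lsvg is available):
#   lsvg -p data03vg | inactive_pvs
```

An empty result means every PV in the volume group reports as active.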

Many of the disks are either missing or removed. The database log reports:

Thu Mar 21 17:53:58 BEIST 2013
Errors in file /oracle/app/oracle/admin/ctsdb/bdump/ctsdb2_m000_19595456.trc:
ORA-27072: File I/O error
IBM AIX RISC System/6000 Error: 5: I/O error

Next, check the HACMP disk type definition in the ODM and the state of the cluster subsystems:

odmget HACMPdisktype

HACMPdisktype:
        PdDvLn = "disk/pseudo/power"
        ghostdisks = "SCSI3"
        checkres = "SCSI_TUR"
        breakres = "/usr/lpp/EMC/Symmetrix/bin/emcpowerreset"
        parallel = "false"
        makedev = "MKDEV"
        reserved1 = ""
        reserved2 = ""
        reserved3 = ""

lssrc -a | grep cl
 clcomd           caa              7929856      active
 clcomdES         clcomdES         9633858      active
 clstrmgrES       cluster          9240596      active
 gsclvmd                                        inoperative
 clinfoES         cluster          17104944     active
 clconfd          caa                           inoperative
 nimsh            nimclient                     inoperative

gsclvmd is inoperative on both nodes, so the only option appears to be restarting HACMP to bring gsclvmd back up.
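Because the diagnosis hinges on one subsystem's status, the same check is worth repeating on each node before and after the restart. What follows is only a sketch: `subsys_state` is a hypothetical helper, and it assumes the status is the last column of `lssrc -a` output, as in the listing here.

```shell
#!/bin/sh
# subsys_state (hypothetical helper): read `lssrc -a` output on stdin and
# print the status (last column) of the subsystem whose name is given as $1.
subsys_state() {
    awk -v name="$1" '$1 == name { print $NF; exit }'
}

# Intended use on each cluster node (assumption: lssrc is available):
#   lssrc -a | subsys_state gsclvmd
```

Matching on the exact subsystem name avoids the stray hits that `grep cl` produces, such as the nimsh line matched through its nimclient group.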

Resolution

1. Back up the database first, then shut it down

Node 1:

su - oracle
srvctl stop listener -n ctscrm1
ps -ef | grep "LOCAL=NO" | grep -v grep | awk '{print $2}' | xargs kill -9
SQL> alter system switch logfile;
SQL> alter system checkpoint;
srvctl stop instance -d ctsdb -i ctsdb1

Node 2:

su - oracle
srvctl stop listener -n ctscrm2
ps -ef | grep "LOCAL=NO" | grep -v grep | awk '{print $2}' | xargs kill -9
SQL> alter system switch logfile;
SQL> alter system checkpoint;
srvctl stop instance -d ctsdb -i ctsdb2
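The kill pipeline in the steps above sends signal 9 to every process whose command line contains LOCAL=NO, so it may be worth previewing the PID list first. This sketch shows one way to do that. The name `remote_session_pids` is hypothetical, and it assumes the PID is the second column of `ps -ef` output.

```shell
#!/bin/sh
# remote_session_pids (hypothetical helper): read `ps -ef` output on stdin
# and print the PID (2nd column) of every process whose command line
# contains LOCAL=NO.
remote_session_pids() {
    # The [N] bracket keeps awk's own command line from matching the
    # pattern, so no separate "grep -v grep" step is needed.
    awk '/LOCAL=[N]O/ { print $2 }'
}

# Intended use (assumption: run as the oracle user on the node):
#   ps -ef | remote_session_pids              # preview only
#   ps -ef | remote_session_pids | xargs kill -9
```

Running the preview line first gives a chance to confirm that only remote client server processes are listed.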

Stop CRS on both node 1 and node 2:

crsctl stop crs

2. Restart HACMP

smit clstop