Recovering from a Corrupted Votedisk in 10g Clusterware
Source: 程序員人生 Published: 2015-01-18 09:54:32
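The votedisk is critical to RAC, whether on 10g Clusterware or 11g GI. It is often called the quorum disk: when a node in the RAC cluster fails and drops off the network, the votedisk is used to decide whether that node should be evicted so that the rest of the cluster keeps running normally. If the votedisk is corrupted, the cluster services cannot start, no cluster resources can be loaded, and the whole cluster ends up down. Backing up the votedisk should therefore be routine. In 11g the votedisk and OCR are placed in an ASM disk group by default, so they need less special attention. In 10g Clusterware they cannot be placed in an ASM disk group and can only be used as raw devices, so the votedisk deserves particular care and should be backed up regularly, for example: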
Backing up and restoring the votedisk with dd:
Backup: dd if=/dev/raw/raw3 of=/tmp/votedisk.bak
Restore: dd if=/tmp/votedisk.bak of=/dev/raw/raw3
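A backup is only useful if the copy is intact, so the dd backup can be wrapped in a small check. The following is a minimal sketch, not an Oracle utility: backup_votedisk is a hypothetical helper name, and the device and file paths are passed in as arguments. It reads the device once, writes the stream to the backup file, and compares a checksum of what was read with a checksum of what was written. Reading only once matters because a running cluster keeps writing to the voting disk, so a second read of the device could legitimately differ.

```shell
#!/bin/sh
# Sketch only: backup_votedisk is a hypothetical helper, not an Oracle utility.
# Usage: backup_votedisk <voting-disk-device> <backup-file>
backup_votedisk() {
    src=$1     # for example /dev/raw/raw3
    dest=$2    # for example /tmp/votedisk.bak

    # Read the device once, write the stream to the backup file with tee,
    # and checksum the same stream that was read.
    read_sum=$(dd if="$src" bs=4096 2>/dev/null | tee "$dest" | cksum)

    # Checksum what actually landed in the backup file.
    file_sum=$(cksum < "$dest")

    # Succeed only if the file is non-empty and both checksums agree.
    [ -s "$dest" ] && [ "$read_sum" = "$file_sum" ]
}
```

It could be called as `backup_votedisk /dev/raw/raw3 /tmp/votedisk.bak && echo "backup verified"`; a non-zero exit status means the backup file should not be trusted.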
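If, unfortunately, no backup was taken and no mirror was configured, then once the votedisk is corrupted the only option is to rebuild CRS. The process is demonstrated here: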
-- Stop CRS, then deliberately corrupt the votedisk device, here /dev/raw/raw3
[root@rac1 ~]# dd if=/dev/zero of=/dev/raw/raw3 bs=4096 count=12800
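After CRS is restarted, it reports that it cannot start. The ocssd.log file contains entries showing that the disk is corrupted.
PS: the 10g Clusterware logs are located under $ORA_CRS_HOME/log/<hostname>/...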
[ CSSD]2015-01-16 09:37:38.327 >USER: Oracle Database 10g CSS Release 10.2.0.1.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
[ CSSD]2015-01-16 09:37:38.327 >USER: CSS daemon log for node rac1, number 1, in cluster cluster
[ clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=rac1DBG_CSSD))
[ CSSD]2015-01-16 09:37:38.332 [3059615952] >TRACE: clssscmain: local-only set to false
[ CSSD]2015-01-16 09:37:38.344 [3059615952] >TRACE: clssnmReadNodeInfo: added node 1 (rac1) to cluster
[ CSSD]2015-01-16 09:37:38.352 [3059615952] >TRACE: clssnmReadNodeInfo: added node 2 (rac2) to cluster
[ CSSD]2015-01-16 09:37:38.356 [3032808336] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 1
[ CSSD]2015-01-16 09:37:38.356 [3059615952] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
[ CSSD]2015-01-16 09:37:38.362 [3059615952] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/raw/raw3)
[ CSSD]2015-01-16 09:37:40.381 [3032808336] >TRACE: clssnmvDiskOpen: corrupt kill block on disk (0x09!=0x636c73536b696c4c)
[ CSSD]2015-01-16 09:37:40.381 [3032808336] >TRACE: clssnmDiskStateChange: state from 2 to 3 disk (0//dev/raw/raw3)
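The telling entry in the log above is the clssnmvDiskOpen message reporting a corrupt kill block. A quick way to look for it is a grep over the CSSD log. This is a minimal sketch: find_votedisk_errors is a hypothetical helper name, and the log path is passed in as an argument rather than assumed, since it depends on the host name and installation.

```shell
#!/bin/sh
# Sketch only: find_votedisk_errors is a hypothetical helper.
# Usage: find_votedisk_errors <path-to-ocssd.log>
find_votedisk_errors() {
    log=$1
    # Print CSSD lines that report a corrupt voting disk.
    # grep exits non-zero when no such line exists.
    grep 'clssnmvDiskOpen: corrupt' "$log"
}
```

An empty result with a non-zero exit status means this particular message was not found; it does not by itself prove that the votedisk is healthy.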
Rebuilding CRS is simple and only requires running two scripts:
1. $ORA_CRS_HOME/install/rootdelete.sh
2. $ORA_CRS_HOME/install/rootdeinstall.sh
Node 1:
[root@rac1 install]# ./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources.
Error while stopping resources. Possible cause: CRSD is down.
Stopping CSSD.
Unable to communicate with the CSS daemon.
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
[root@rac1 install]# ./rootdeinstall.sh
Removing contents from OCR device
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 0.590608 seconds, 17.8 MB/s
Node 2:
[root@rac2 install]# ./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
OCR initialization failed with invalid format: PROC-22: The OCR backend has an invalid format
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
[root@rac2 install]# ./rootdeinstall.sh
Removing contents from OCR device
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 0.627909 seconds, 16.7 MB/s
[root@rac2 install]# dd if=/dev/zero of=/dev/raw/raw3 bs=4096 count=128000
dd: writing `/dev/raw/raw3': No space left on device
25601+0 records in
25600+0 records out
104857600 bytes (105 MB) copied, 5.40456 seconds, 19.4 MB/s
Then run $ORA_CRS_HOME/root.sh on the two nodes, one after the other. The software does not need to be reinstalled through OUI.
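If the scripts fail to remove everything, the following files and directories can be deleted manually so that CRS can be reinstalled cleanly: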
rm /etc/oracle/*
rm -f /etc/init.d/init.cssd
rm -f /etc/init.d/init.crs
rm -f /etc/init.d/init.crsd
rm -f /etc/init.d/init.evmd
rm -f /etc/rc2.d/K96init.crs
rm -f /etc/rc2.d/S96init.crs
rm -f /etc/rc3.d/K96init.crs
rm -f /etc/rc3.d/S96init.crs
rm -f /etc/rc5.d/K96init.crs
rm -f /etc/rc5.d/S96init.crs
rm -Rf /etc/oracle/scls_scr
rm -f /etc/inittab.crs
cp /etc/inittab.orig /etc/inittab
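The last step above overwrites /etc/inittab, which is risky if /etc/inittab.orig is missing or stale. A slightly more defensive version could look like the sketch below. restore_inittab is a hypothetical helper name, and the .pre-cleanup suffix for the safety copy is an assumption made for illustration; both paths default to the ones used above.

```shell
#!/bin/sh
# Sketch only: restore_inittab is a hypothetical helper.
# Usage: restore_inittab [saved-inittab] [live-inittab]
restore_inittab() {
    orig=${1:-/etc/inittab.orig}
    target=${2:-/etc/inittab}

    # Refuse to continue if the saved copy does not exist.
    if [ ! -f "$orig" ]; then
        echo "missing $orig; leaving $target untouched" >&2
        return 1
    fi

    # Keep a copy of the current file (hypothetical .pre-cleanup suffix).
    if [ -f "$target" ]; then
        cp -p "$target" "$target.pre-cleanup" || return 1
    fi

    # Restore the saved inittab.
    cp "$orig" "$target"
}
```

Comparing the two files with diff before restoring is also worthwhile, since any inittab changes made after Clusterware was installed would otherwise be lost.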
Summary:
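Normally the OCR and votedisk are protected with multiple mirrored copies, and when raw devices are used they are also backed up separately with dd, so corruption or loss is unlikely. If corruption does happen with no backup available, the only remaining fix is to rebuild CRS, which is the DBA's last resort.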
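To make the regular backup cover every configured voting disk, the device list can be taken from `crsctl query css votedisk` instead of being hard-coded. The sketch below only handles the parsing. list_votedisks is a hypothetical helper name, and it assumes that each voting disk line of the crsctl output ends with an absolute device path, as in ` 0.     0    /dev/raw/raw3`; that output format should be confirmed on the actual system.

```shell
#!/bin/sh
# Sketch only: list_votedisks is a hypothetical helper.
# Reads `crsctl query css votedisk` output on stdin and prints one
# voting disk path per line. Assumes each disk line ends with an
# absolute path; all other lines are ignored.
list_votedisks() {
    awk '$NF ~ /^\// { print $NF }'
}

# Example use (not run here; the /backup directory is an assumption):
#   crsctl query css votedisk | list_votedisks | while read -r disk; do
#       dd if="$disk" of="/backup/$(basename "$disk").bak" bs=4096
#   done
```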