昨天去某客戶那里做巡檢,順便看一下上次搭建的RAC-DG環(huán)境是否正常,上次的DG是8月20日運(yùn)行的,而DG備庫(kù)從8月31日之后實(shí)例就沒有開啟過,后來詢問后才得知,原來那天斷過一次電,后來重啟了機(jī)器。直到今天我過去了,才把實(shí)例啟動(dòng)起來。也就是說,從8月31日到今天快1個(gè)月的時(shí)間,備庫(kù)一直處于未用狀態(tài)。
接著查看備庫(kù)歸檔,顯然已經(jīng)缺失了很多了,tnread1 最后一個(gè)日志為1661,tnread2 最后一個(gè)日志為1324,而此時(shí)主庫(kù)中還保留的最早的日志是9月8日的,thread 1 最早為2055,thread 2 最早為1555。主備之間歸檔足足差了有好幾百個(gè)(正常,都快一個(gè)月沒開實(shí)例了)
ASMCMD> ls
2014_09_08/
2014_09_09/
2014_09_10/
2014_09_11/
2014_09_12/
2014_09_13/
2014_09_14/
2014_09_15/
2014_09_16/
2014_09_17/
2014_09_18/
2014_09_19/
2014_09_20/
2014_09_21/
2014_09_22/
2014_09_23/
2014_09_24/
2014_09_25/
ASMCMD> cd 2014_09_08
ASMCMD> ls
thread_1_seq_2055.500.857723297
thread_1_seq_2056.494.857725223
thread_1_seq_2057.493.857728031
thread_1_seq_2058.490.857729849
(略)……
thread_2_seq_1555.502.857723297
thread_2_seq_1556.499.857723301
thread_2_seq_1557.497.857723305
thread_2_seq_1558.496.857725225
(略)……
ASMCMD>
盡管在腳本中配置了備份完歸檔后用delete input來刪除歸檔,以減小歸檔占用的磁盤空間,可以在RMAN腳本的備份日志中看到,從8月31日起,陸續(xù)有報(bào)RMAN-08137,提示由于備庫(kù)還未獲得歸檔,導(dǎo)致無法刪除:
歸檔日志文件名=+DATA/sis/archivelog/2014_08_31/thread_1_seq_1661.1208.857041081 RECID=3930 STAMP=857041081
RMAN-08137: 警告: 歸檔日志未刪除, 因?yàn)閭溆没蛏嫌尾东@進(jìn)程需要它
歸檔日志文件名=+DATA/sis/archivelog/2014_08_31/thread_1_seq_1662.1212.857042297 線程=1 序列=1662
RMAN-08137: 警告: 歸檔日志未刪除, 因?yàn)閭溆没蛏嫌尾东@進(jìn)程需要它
歸檔日志文件名=+DATA/sis/archivelog/2014_09_01/thread_2_seq_1325.1204.857122335 線程=2 序列=1325
RMAN-08137: 警告: 歸檔日志未刪除, 因?yàn)閭溆没蛏嫌尾东@進(jìn)程需要它
但是,由于FRA磁盤空間是有限的,使用到一定的百分比(有參數(shù)可調(diào)整),Oracle會(huì)自動(dòng)清空其中的內(nèi)容,以釋放空間,因此在FRA中的歸檔日志大約保留了18天,從9月8日到9月25日,而8月31日的歸檔肯定是沒有的了,最近的備份集只有到9月15日的。
于是決定重新搭建一下DG,關(guān)閉備庫(kù)實(shí)例,刪除全部數(shù)據(jù)庫(kù)文件(數(shù)據(jù)文件、控制文件、日志文件),只保留密碼文件、參數(shù)文件、tnsnames.ora、listener.ora即可,重建很方便,用11g的duplicate重新同步一下就可以了,命令如下:
rman target / auxiliary sys/oracle@sisdg
RMAN> run{
allocate channel c1 device type disk;
allocate auxiliary channel c2 device type disk;
set newname for tempfile 1 to 'D:appadministratororadatasis emp.269.852648395';
duplicate target database for standby from active database dorecover;
release channel c1;
release channel c2;
}
執(zhí)行完以上操作后,備庫(kù)與主庫(kù)的歸檔就同步了
主庫(kù):
SQL> archive log list
數(shù)據(jù)庫(kù)日志模式 存檔模式
自動(dòng)存檔 啟用
存檔終點(diǎn) USE_DB_RECOVERY_FILE_DEST
最早的聯(lián)機(jī)日志序列 2617
下一個(gè)存檔日志序列 2619
當(dāng)前日志序列 2619
SQL> select thread#,max(sequence#) from v$archived_log group by thread#;
THREAD# MAX(SEQUENCE#)
---------- --------------
1 2619
2 2556
備庫(kù):
SQL> archive log list
數(shù)據(jù)庫(kù)日志模式 存檔模式
自動(dòng)存檔 啟用
存檔終點(diǎn) D:archivelog
最早的聯(lián)機(jī)日志序列 0
下一個(gè)存檔日志序列 0
當(dāng)前日志序列 0
SQL> select thread#,max(sequence#) from v$archived_log group by thread#;
THREAD# MAX(SEQUENCE#)
---------- --------------
1 2619
2 2556
主備庫(kù)查看了一下v$archive_dest_status,兩邊都是的status列都是valid的,因此開啟備庫(kù)的redo apply,看到日志也開始已經(jīng)應(yīng)用了
SQL> select thread#,sequence#,applied from v$archived_log;
THREAD# SEQUENCE# APPLIED
---------- ---------- ------------------
1 2617 YES
1 2618 YES
2 2556 YES
1 2619 NO
1 2620 NO
由于采用的是LGWR ASYNC模式傳遞日志,再重新創(chuàng)建一次standby redo logfile,主庫(kù)每個(gè)thread有3組日志,所以備庫(kù)創(chuàng)建了7組日志
此外,備庫(kù)的alertlog里還報(bào)了個(gè)錯(cuò)誤,因?yàn)槭菑闹鲙?kù)duplicate過來的,RMAN的配置信息還保留著主庫(kù)的一些參數(shù):
Starting control autobackup
Got error: 19624
******************** WARNING ***************************
The errors during Server autobackup are not fatal, as it
is attempted after sucessful completion of the command.
However, it is recomended to take an RMAN control file
backup as soon as possible because the Autobackup failed
with the following error:
ORA-19624: operation failed, retry possible
ORA-19504: failed to create file "C:ORABACKUPBACKUPSETSSIS1-C-3160648191-20140925-02.CTL"
ORA-27040: file create error, unable to create file
OSD-04002: 無法打開文件
O/S-Error: (OS 3) 系統(tǒng)找不到指定的路徑。
******************** END OF WARNING *******************
RMAN> show all;
使用目標(biāo)數(shù)據(jù)庫(kù)控制文件替代恢復(fù)目錄
db_unique_name 為 SISDG 的數(shù)據(jù)庫(kù)的 RMAN 配置參數(shù)為:
CONFIGURE RETENTION POLICY TO REDUNDANCY 7;
CONFIGURE BACKUP OPTIMIZATION OFF; # default
CONFIGURE DEFAULT DEVICE TYPE TO DISK;
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO 'c:orabackupackupsetssis1-%F.ctl';
CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET;
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE COMPRESSION ALGORITHM 'BASIC' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE ; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO 'D:APPADMINISTRATORPRODUCT11.2.0DBHOME_1DATABASESNCFSIS.ORA'; # default
備庫(kù)并沒有“ 'c:orabackupackupsets ”這個(gè)路徑,所以報(bào)錯(cuò),clear掉就可以了
RMAN> CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK CLEAR;
問題到這里似乎是圓滿地解決了,但事實(shí)并非如此,當(dāng)我在節(jié)點(diǎn)1連續(xù)切換了幾次日志后發(fā)現(xiàn),雖然歸檔都能順利傳到備庫(kù),但后續(xù)的日志始終應(yīng)用不了,查看節(jié)點(diǎn)1和備庫(kù)的alertlog,也沒有發(fā)現(xiàn)有什么異常,直到我意識(shí)到應(yīng)該看看節(jié)點(diǎn)2的情況時(shí),才發(fā)現(xiàn)了點(diǎn)端倪。
照理說,正常情況下,在v$archive_dest_status視圖中查詢到的遠(yuǎn)程歸檔路徑的status必須是要valid的,并且error列的狀態(tài)為空,而節(jié)點(diǎn)2的查詢結(jié)果是這樣的:
SQL> select dest_id,status,error from v$archive_dest where dest_id<3;
DEST_ID STATUS ERROR
---------- --------- ------------------------------
1 VALID
2 ERROR ORA-21561: 生成 OID 失敗
此時(shí)再去查看節(jié)點(diǎn)2的alertlog,發(fā)現(xiàn)除了報(bào)ORA-21561,伴隨著的還有ORA-15055、ORA-25253等
Thu Sep 25 15:05:06 2014
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
WARNING: ASM communication error: op 0 state 0x0 (15055)
ERROR: direct connection failure with ASM
Thu Sep 25 15:05:37 2014
Error 21561 received logging on to the standby
Suppressing further error logging of LOG_ARCHIVE_DEST_2.
Thu Sep 25 15:05:46 2014
ORA-25253 encountered when generating server alert SMG-3503
Thu Sep 25 15:05:58 2014
Thread 2 advanced to log sequence 2561 (LGWR switch)
Current log# 6 seq# 2561 mem# 0: +DATA/tc/onlinelog/group_6.1523.854793175
Current log# 6 seq# 2561 mem# 1: +DATA/tc/onlinelog/group_6.1524.854793175
Thu Sep 25 15:05:58 2014
Archived Log entry 6148 added for thread 2 sequence 2560 ID 0xbc63a5fc dest 1:
Thu Sep 25 15:05:58 2014
Error 21561 received logging on to the standby
Suppressing further error logging of LOG_ARCHIVE_DEST_2.
FAL[server, ARC3]: FAL archive failed, see trace file.
ARCH: FAL archive failed. Archiver continuing
Thu Sep 25 15:07:49 2014
ORA-25253 encountered when generating server alert SMG-3503
Thu Sep 25 15:08:07 2014
ARCH: Possible network disconnect with primary database
Thu Sep 25 15:11:38 2014
Error 21561 received logging on to the standby
在節(jié)點(diǎn)2執(zhí)行l(wèi)snrctl status查看監(jiān)聽狀態(tài)以及用tnsping NET SERVICE NAME,都會(huì)hang在那邊,不顯示結(jié)果
于是到MOS上面查閱了一下關(guān)于ORA-21561和ORA-15055的文檔,有好幾篇相關(guān)文檔都有提到這一現(xiàn)象,主要描述如下:
1. 通常在10.2.0.x升級(jí)后發(fā)生,登錄SQLPLUS時(shí)直接提示:ORA - 21561 : OID GENERATION FAILED,需要在.../hosts中加入localhost的完整主機(jī)名,我可以正常登錄SQLPLUS,似乎這個(gè)不太符合我碰到的情形。
2. 由bug引起,Bug 14324057 and Bug 12529945有潛在消耗desktop heap memory的可能,需要修改注冊(cè)表,來增加缺省desktop heap size來解決。
文檔中還提到,bug針對(duì)的是11.2.0.2 patch 19――11.2.0.3 patch 7,但這個(gè)庫(kù)已經(jīng)打到patch 27了,如果是bug也應(yīng)該已經(jīng)修復(fù)了。于是嘗試修改了注冊(cè)表,修改HKEY_LOCAL_MACHINESYSTEMCurrentControlSetControlSession
ManagerSubSystems下面的
SharedSection=1024,20480,1024,修改前先備份一下注冊(cè)表
生活不易,碼農(nóng)辛苦
如果您覺得本網(wǎng)站對(duì)您的學(xué)習(xí)有所幫助,可以手機(jī)掃描二維碼進(jìn)行捐贈(zèng)
------分隔線----------------------------
------分隔線----------------------------