Today I was attempting to rebuild a standby database as we
had done some mess up with the production by placing some of the tables in
"nologging" mode and relocating few datafiles due to file system
space issues. It is easy to recover the standby by replacing the files from the
primary and using the standby_management=manual mode to relocate the files in
the standby as well. But the problem is we don't know where we started and
where the situation is... Primary database is running fine after the maintenance
but we lack the high availability as the standby is a day back. Hence I have
decided to rebuild the standby which will take not more than 2~3 hours with a
valid backup. Remember this is a database of version 10.2.0.4.0 running on
Linux EL5.
Steps that has been done.
1. Backup taken from source database (Primary)
2. Backup pieces were moved to target server and the some of
the backup pieces on the primary server is deleted as thinking as this DB is a
11g DB where we have feature of source database is not required to create
clone. Then after realising the DB as 10g, again replaced the backups on the
server.
3. Control file backup is taken for standby and transfered
to the target server.
Now the actual story starts.
On target server.
RMAN> run
{
configure channel
device type disk parallel 8;
duplicate target
database for standby nofilenamecheck dorecover;
}
2> 3>
RMAN-00571:
===========================================================
RMAN-00569:
=============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571:
===========================================================
RMAN-00558: error
encountered while parsing input commands
RMAN-01009:
syntax error: found "parallel": expecting one of: "clear,
connect, debug, format, kbytes, maxopenfiles, maxpiecesize, parms, readrate,
rate, send, to, trace"
RMAN-01007: at
line 3 column 36 file: standard input
RMAN>
Starting
Duplicate Db at 06-AUG-12
using target
database control file instead of recovery catalog
allocated
channel: ORA_AUX_DISK_1
channel
ORA_AUX_DISK_1: sid=541 devtype=DISK
allocated
channel: ORA_AUX_DISK_2
channel
ORA_AUX_DISK_2: sid=540 devtype=DISK
allocated
channel: ORA_AUX_DISK_3
channel
ORA_AUX_DISK_3: sid=539 devtype=DISK
allocated
channel: ORA_AUX_DISK_4
channel
ORA_AUX_DISK_4: sid=538 devtype=DISK
allocated
channel: ORA_AUX_DISK_5
channel
ORA_AUX_DISK_5: sid=537 devtype=DISK
contents of
Memory Script:
{
set until scn 5170555877;
restore clone standby controlfile;
sql clone 'alter database mount standby
database';
}
executing Memory
Script
..
..
..
executing
command: SET NEWNAME
executing
command: SET NEWNAME
executing
command: SET NEWNAME
Starting restore
at 06-AUG-12
allocated
channel: ORA_AUX_DISK_1
channel
ORA_AUX_DISK_1: sid=537 devtype=DISK
allocated
channel: ORA_AUX_DISK_2
channel
ORA_AUX_DISK_2: sid=538 devtype=DISK
allocated
channel: ORA_AUX_DISK_3
channel
ORA_AUX_DISK_3: sid=539 devtype=DISK
allocated
channel: ORA_AUX_DISK_4
channel
ORA_AUX_DISK_4: sid=540 devtype=DISK
allocated
channel: ORA_AUX_DISK_5
channel
ORA_AUX_DISK_5: sid=541 devtype=DISK
creating datafile
fno=2 name=/fs05/oradata/edisipd/edisipd_UNDOTBS_01.dbf
creating datafile
fno=5 name=/fs05/oradata/edisipd/edisipd_UNDOTBS_02.dbf
creating datafile
fno=8 name=/fs02/oradata/edisipd/edisipd_EDIAPP_DATA_03.dbf
creating datafile
fno=15 name=/fs02/oradata/edisipd/edisipd_EDIAPP_DATA_10.dbf
creating datafile
fno=17 name=/fs02/oradata/edisipd/edisipd_EDIAPP_DATA_12.dbf
creating datafile
fno=18 name=/fs02/oradata/edisipd/edisipd_EDIAPP_DATA_13.dbf
creating datafile
fno=19 name=/fs02/oradata/edisipd/edisipd_EDIAPP_DATA_14.dbf
creating datafile
fno=23 name=/fs02/oradata/edisipd/edisipd_EDIAPP_DATA_18.dbf
..
..
..
using channel
ORA_AUX_DISK_2
using channel
ORA_AUX_DISK_3
using channel
ORA_AUX_DISK_4
using channel
ORA_AUX_DISK_5
starting media
recovery
archive log
thread 1 sequence 48527 is already on disk as file
/fs01/oradata/edisipd/archive/edisipd_/1_724685163_48527.arc
Oracle Error:
ORA-01547:
warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1
needs more recovery to be consistent
ORA-01110: data
file 1: '/fs02/oradata/edisipd/edisipd_SYSTEM_01.dbf'
RMAN-00571:
===========================================================
RMAN-00569:
=============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571:
===========================================================
RMAN-03002:
failure of Duplicate Db command at 08/07/2012 02:49:36
RMAN-03015: error
occurred in stored script Memory Script
RMAN-06053:
unable to perform media recovery because of missing log
RMAN-06025: no
backup of log thread 1 seq 48412 lowscn 5159567844 found to restore
RMAN-06025: no
backup of log thread 1 seq 48411 lowscn 5159485436 found to restore
RMAN-06025: no
backup of log thread 1 seq 48410 lowscn 5159410674 found to restore
RMAN-06025: no
backup of log thread 1 seq 48409 lowscn 5159350092 found to restore
RMAN-06025: no
backup of log thread 1 seq 48408 lowscn 5159263205 found to restore
Check the part where it states "creating datafile
fno=......
This is the first time I come across this statement in
recovery (standby/clone) of a database. I'm not sure why it is creating file
rather recovering it from the backup. I waited for sometime so that the recover
completes.
Recover completes with error stating missing logs. Ok, this
is expected as I don’t have some of the logs in the standby site.
Now, I started the recovery to see where the MRP stuck does,
I mean which log MRP is looking for.
SQL> archive
log list
Database log
mode Archive Mode
Automatic archival Enabled
Archive
destination
/fs01/oradata/edisipd/archive/edisipd_
Oldest online log
sequence 48550
Next log sequence
to archive 0
Current log
sequence 48552
SQL> select
process, sequence#, status from v$managed_Standby;
PROCESS SEQUENCE# STATUS
---------
---------- ------------
ARCH 0 CONNECTED
ARCH 0 CONNECTED
ARCH 0 CONNECTED
ARCH 0 CONNECTED
ARCH 0 CONNECTED
RFS 0 IDLE
RFS 0 IDLE
RFS 0 IDLE
MRP0 3 WAIT_FOR_GAP
9 rows selected.
This is to my shock!! MRP is waiting for sequence number 3.
Man! This is a full backup you are recovering and why in the name of God, MRP
looks for sequence# 3? I have to go years back for the log sequence# 3 :-)
I thought for a second and reran the create controlfile
statement on the primary database to make sure I have the correct controlfile, transferred
it to standby site and started recovery after mounting the database.
SQL> recover
standby database;
ORA-00279: change
310139 generated at 07/18/2010 13:52:22 needed for thread 1
ORA-00289:
suggestion :
/fs01/oradata/edisipd/archive/edisipd_/1_724685163_3.arc
ORA-00280: change
310139 for thread 1 is in sequence #3
Specify log:
{<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00308: cannot
open archived log
'/fs01/oradata/edisipd/archive/edisipd_/1_724685163_3.arc'
ORA-27037: unable
to obtain file status
Linux-x86_64
Error: 2: No such file or directory
Additional
information: 3
This again is looking for sequence# 3. So something is not
right. But what is not right?
Then I checked manually myself with the backup pieces on
both the servers, whether both the servers have equal number of files and their
sizes. They are perfect.
Now I logged into primary database RMAN to see how many
pieces have been produced while taking backup. 14 pieces have been produced and
I have 14 pieces on the physical file system. So still I couldn't find what is
wrong.
Now, once again I counted the number of pieces produced and
got this..
RMAN> list
backup tag ‘FORSTANDBY’;
..
..
..
57
Full 5163187056 06-AUG-12
/fs04/oradata/edisipd/edisipd_EDIAPP_INDEX_12.dbf
69
Full 5163187056 06-AUG-12 /fs05/oradata/edisipd/edisipd_EDIAPP_INDEX_24.dbf
77
Full 5163187056 06-AUG-12
/fs06/oradata/edisipd/edisipd_EDIADT_DATA_02.dbf
BS Key Type LV
Size Device Type Elapsed Time
Completion Time
------- ---- --
---------- ----------- ------------ ---------------
31223 Full
3.16G DISK 00:28:35 06-AUG-12
BP Key: 31464 Status: EXPIRED Compressed: YES Tag: FORSTANDBY
Piece Name:
/backups/stbkp/ForStandby_sknhuobu_1_1
List of Datafiles in backup set 31223
File LV
Type Ckp SCN Ckp Time Name
---- -- ---- ---------- --------- ----
23
Full 5163317451 06-AUG-12
/fs02/oradata/edisipd/edisipd_EDIAPP_DATA_18.dbf
25
Full 5163317451 06-AUG-12
/fs02/oradata/edisipd/edisipd_EDIAPP_DATA_20.dbf
54
Full 5163317451 06-AUG-12
/fs04/oradata/edisipd/edisipd_EDIAPP_INDEX_09.dbf
64
Full 5163317451 06-AUG-12
/fs05/oradata/edisipd/edisipd_EDIAPP_INDEX_19.dbf
BS Key Type LV
Size Device Type Elapsed Time
Completion Time
------- ---- --
---------- ----------- ------------ ---------------
31224 Full
4.03G DISK 00:45:33 06-AUG-12
BP Key: 31465 Status: EXPIRED Compressed: YES Tag: FORSTANDBY
Piece Name:
/backups/stbkp/ForStandby_sinhunrb_1_1
List of Datafiles in backup set 31224
File LV Type Ckp SCN Ckp Time
Name
---- -- ---- ---------- --------- ----
..
..
..
Some of the backup pieces were in EXPIRED
status. I now remember my colleague called me today morning to inform that she
moved some of the backup pieces to other file system and issued the
"crosscheck backup" RMAN command when the files where deleted from
primary server after transfer to standby site. After this the backup pieces
were restored to the file system so the files were not cataloged to the control
file.
The reason why “create datafile” is issued by RMAN is now
clear. Controlfile has the details about the datafiles but the backup pieces
were missing to restore/recover. Hence it creates the datafiles on the given
name and it should start recover from the beginning and hence looking for
sequence# 3. My confusion is now at a halt. J
Now I crosschecked the pieces once again using RMAN
crosscheck command so that these pieces were cataloged to the control file.
Then I started the cloning for standby which ran perfect as expected.
For the detailed steps on creating a standby database using RMAN
please check the below link.
Thank you!!
This is a user error.
ReplyDeletecrosscheck did what it expected to.
Venky,
ReplyDeleteI know this is an user error. But this can show lights to someone who begins on backup and recovery, just this can be as a case study. Thank you for going through my blog post and adding this as an user error. Valid point as to say!!