Oracle tips and techs: APPLIED column on v$archived_log when Standby database is setup

We use a simple concept to check for any standby gap or logs apply problems in all the high availability (Dataguard) servers in our environment. Use the below query in a script and make it run from cron periodically so we don’t have to go and check manually all the time.

spool /home/oracle/dba_scripts/logs/standby_db_sync_err.txt

select PRIM-SEC from

(select MAX(SEQUENCE#) PRIM FROM V$ARCHIVED_LOG WHERE DEST_ID=1 AND ARCHIVED='YES'),

(select MAX(SEQUENCE#) SEC FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND APPLIED='YES');

spool off

Today I received log applied gaps error alert mail stating we have a log applied gap of 24 logs. My script runs every hour and sends the alert mail only when the log applied gap crosses 10 logs since I have delay parameter set in the database.

Note: I used the term “log applied gaps” as I’m meaning that the term is used to differentiate the logs applied on primary with logs applied on standby database.

I first thought that there might be some activity going on in the database and the gap will catch up soon as this happens most of the time, but now the gaps increased to 35, the next hour. So something is wrong , and I should check it.

I used the following query on the standby database which gave the below.

SQL> select sequence#, process, status from v$managed_standby;

SEQUENCE# PROCESS STATUS

---------- --------- ------------

0 ARCH CONNECTED

65999 MRP0 WAIT_FOR_LOG

66007 RFS IDLE

66008 RFS IDLE

66009 RFS IDLE

This is obvious, RFS gets the logs as and when generated at primary and MRP waits until the “delay” time reaches before it applies. The gap is not much but only 10 logs. Then I thought why the script gives wrong information?

I ran the script manually to check and still receive the same alert mail of log applied gap 35. So I ran the query separately in primary database to check/troubleshoot what is going on. I got the below response from the primary database.

SQL> select MAX(SEQUENCE#) PRIM FROM V$ARCHIVED_LOG WHERE DEST_ID=1 AND ARCHIVED='YES';

PRIM

----------

66010

SQL> select MAX(SEQUENCE#) SEC FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND APPLIED='YES';

SEC

----------

65975

Ok, I just wait for some time and check the same again but to see the same result. (Many a times, in standby environment I get the results delayed. J )

Now, it’s time to dig Oracle support documents to find what is really going on. I thought it might be some sort of bug as I haven’t faced such an issue before. As my prediction it turns to be a bug.

The MOS article “APPLIED - Column not updated if Heartbeat-ARCH hangs [ID 1369630.1]” states the issue we are dealing with is due to the bug “Bug 6113783 - Arch processes can hang indefinitely on network [ID 6113783.8]”.

The bug is fixed in 11.2.0.1 base release and I’m currently running my database in 10.2.0.4.

There are 2 options provided.

Upgrade DB to 11.2.0.1 or higher
Use the work around of killing the hung ARCH process in the primary database.

Option 2 looked feasible at this time as upgrading of the DB requires a lot of approvals, application compatibility, etc., etc.,.

So how to find the hung ARCH process?

MOS says Look for Messages like

"ARCn: Becoming the heartbeat ARCH"

in the ALERT.LOG of the Primary Database to determine the current Heartbeat ARCn-Process.

So I searched my alert log file to find the below.

Fri Aug 17 05:04:08 2012

ARC2: Becoming the heartbeat ARCH

ARC3 started with pid=103, OS id=28106

Fri Aug 17 05:04:08 2012

Check for the OS process id for ARC2

[oracle@slv00112 bdump]$ ps -ef|grep ora_arc

oracle 32224 1 0 2012 ? 02:43:53 ora_arc0_edisipd

oracle 32226 1 0 2012 ? 02:21:49 ora_arc1_edisipd

oracle 32232 1 0 2012 ? 02:41:07 ora_arc2_edisipd

oracle 32234 1 0 2012 ? 02:26:41 ora_arc3_edisipd

oracle 23429 16642 0 04:51 pts/1 00:00:00 grep ora_arc

Kill the Hung ARCH process

[oracle@slv00112 bdump]$ kill -9 32232

Now check the alert log file again to find the below.

ARCH: Detected ARCH process failure

ARCH: STARTING ARCH PROCESSES

ARC2: Archival started

ARCH: STARTING ARCH PROCESSES COMPLETE

ARC2 started with pid=67, OS id=23425

ARC2: Becoming the heartbeat ARCH

Wed Jan 16 04:53:47 2013

Ok, now the new process has been generated. Hence check with the primary database whether the APPLIED column is updated.

SQL> select MAX(SEQUENCE#) SEC FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND APPLIED='YES';

SEC

----------

66003

Now it has been updated successfully. J

4 comments:

AnonymousAugust 30, 2013 at 7:50 AM
It was a great article, helped me in resolving the most frequently occured issue.

~Sandy
AnonymousAugust 30, 2013 at 9:35 AM
This is excellent post. Appreciated for sharing such real time issue and solution

I don't know what you think about my creation, but every words of you help me grow better and stronger!!

Wednesday, January 16, 2013

APPLIED column on v$archived_log when Standby database is setup

4 comments: