Monitoring an Oracle Physical Standby
When you are using an Oracle dataguard physical standby as your disaster recovery solution you want to make sure the standby is as up-to-date as possible with the redo being generated on the primary database. While there is an argument for not actually applying that redo straight away (recovering from human error being one) , you do want to make sure your standby has actually received it. There are quite a few views you can look at to see how your standby is doing.
This one checks which redo log sequence number you have applied up to:
STANDBY_SQL> SELECT MAX(SEQUENCE#), APPLIED FROM V$ARCHIVED_LOG GROUP BY APPLIED; MAX(SEQUENCE#) APP -------------- ---- 1129 YES
This query will tell you what the background processes that make your standby tick are actually up to:
STANDBY_SQL> select process, client_process, sequence#, status from V$managed_standby; PROCESS CLIENT_P SEQUENCE# STATUS --------- -------- ---------- ------------ ARCH ARCH 0 CONNECTED ARCH ARCH 0 CONNECTED RFS UNKNOWN 0 IDLE RFS UNKNOWN 0 IDLE MRP0 N/A 774 WAIT_FOR_LOG RFS UNKNOWN 0 IDLE RFS UNKNOWN 0 IDLE RFS UNKNOWN 0 IDLE RFS LGWR 774 IDLE RFS LGWR 236 IDLE RFS UNKNOWN 0 IDLE
What you don’t want to see in the above is WAIT_FOR_GAP, because then you know you have hit a problem.
You can tell which recovery mode you are in via the following:
STANDBY_SQL> SELECT RECOVERY_MODE FROM V$ARCHIVE_DEST_STATUS WHERE DEST_ID=2 ; RECOVERY_MODE ----------------------- MANAGED REAL TIME APPLY
The view, V$dataguard_status is quite useful in highlighting any issues/problems the standby may be having, but I find the V$dataguard_stats to be the best view for telling you where you have applied up to:
SQL> select * from v$dataguard_stats; NAME VALUE -------------------------------- ---------------- apply finish time +00 00:00:00.0 apply lag +00 00:00:13 estimated startup time 24 standby has been open N transport lag +00 00:00:05
What this tell us, is that we are seeing the redo 5 seconds after it has hit the primary, we are using Maximum performance on the primary, I’d expect this to be 0 with Maximum Availability or Maximum Protection. You can also see from this view how good your standby is at keeping up with the redo generated with the apply lag. The apply lag can be very useful for monitoring, if you have a gap the transport lag can still be quite low but your apply lag can just grow and grow, if you are missing an archived redo logfile and it’s deleted on your primary you can end up in a non-recoverable situation again rendering your standby not much use.
The final thing I’d advise monitoring is the mrp background process, if this dies then again you can get a gap building up that would not be bridged by a fal (fetch archive log) request as the mrp process is responsible for detecting and resolving gaps. The mrp process looks something like ora_mrp0_SID.

(1 votes, average: 4 out of 5)
August 16th, 2007 at 8:39 pm
Useful link to know whats going on with DataGuard..
Thanks
July 15th, 2008 at 1:58 pm
[…] your Standby with V$RECOVERY_PROGRESS I have blogged previously about various ways you can monitor the progress of your physical standby, and I have now come across another way of […]