- 2-node RAC configuration
- Linux System
- Single instance Dataguard Setup but with ASM Grid.
Note: If you have 2-node RAC setup on both primary and standby side or higher node setup, then it doesn’t matter whether you shut down all the nodes or not, it will terminate rest of instances if you execute switchover command. Later, you can start these instances. But as per Oracle recommendation, one node instance from Primary and one node from standby server should be down.
Step 1: Please perform below prerequisites before initiating switchover operations:
- Check whether both primary and standby databases are in sync.
- Check whether you can access both primary and standby databases remotely via sqlplus connectivity.
- Check tnsping connectivity is working fine from both primary and standby database servers.
- Check Telnet connectivity is working fine.
- Check for any critical ORA- error in alert log file of both Primary and Standby.
- Check all Cluster Services are up and running fine.
- Check primary and standby destination status are valid and check for any error for destination if any.
- Switch 4-5 logs from primary to standby server to verify whether logs are getting applied on standby database successfully or not.
Step 2: Confirm with an Application Developer whether he has stopped all
Application Services connecting to the respective database. Also check any
remote connection to the database from non-oracle users.
Step 3: Inform an Application Developer and Client 15 minutes prior the
activity so that we can start switchover activity in next 15 minutes.
Step 4: As per Oracle Recommendation, all primary and standby instances
should be down except one from primary and one from standby.
For Example:
You have 5 instances on Primary
and 5 instances on standby server. You need to shut down 4 instances from
primary and 4 instances from standby, except 1st on primary and 1st on standby where you have to perform the activity.
Once the switchover activity
is done, then you can start all instances (Primary and Standby) which had been shut down previously.
In my configuration, I will
shut down node2 instance only. In standby server, I have single instance setup
and hence I am not shutting down any instance on standby.
Note:
In short, one primary and one standby database must be up and remaining must
be down prior the switchover activity.
Step 5: Shutdown primary instance on node 2(rac2).
If you will not shut down the database, later Oracle will shut it down
automatically once switchover command gets executed.
Run below command on any node by oracle:
$srvctl stop instance -d dc
-i dc2
Step 6: Enable flashback and
create guaranteed restore point in both standby and primary databases. Ensure
you have set flash recovery area in ASM disk group otherwise you will get error
while opening the database (ORA- unable to obtain the file status).
On Standby:
SQL> select flashback_on
from v$database;
SQL> show parameter
recovery
SQL> alter system set
db_recovery_file_dest=’+DATA’ scope=spfile;
SQL> alter system set
db_recovery_file_dest_size=2g scope=spfile;
SQL> shut immediate;
SQL> startup mount;
SQL> alter database
flashback on;
SQL> create restore point
DR_DRILL guarantee flashback database;
SQL> select flashback_on
from v$database;
On Primary:
SQL> show parameter
recovery
SQL> alter system set
db_recovery_file_dest=’+DATA’ scope=spfile;
SQL> alter system set
db_recovery_file_dest_size=2g scope=spfile;
SQL> select flashback_on
from v$database;
SQL> shut immediate;
SQL> startup mount;
SQL> alter database
flashback on;
SQL> create restore point
guarantee flashback database;
SQL> select flashback_on
from v$database;
Switchover Initiation:
Step 7: Initiate switchover activity.
On Node1(rac1), by oracle
user:
Primary:
SQL> select name,database_role,controlfile_type,open_mode
from gv$database;
SQL> select thread#,max(sequence#) from v$archived_log group by
thread#;
Standby:
SQL> select thread#,max(sequence#) from v$archived_log where
applied='YES' group by thread#;
SQL> select process,status,sequence# from v$managed_standby;
SQL> select THREAD#,STATUS,ENABLED,SEQUENCE# from v$thread order by 1;
THREAD# STATUS ENABLED SEQUENCE#
---------- ------ -------- ----------
1 OPEN
PUBLIC 270
2 OPEN
PRIVATE 144
SQL> select thread#,max(sequence#) from v$archived_log group by
thread#;
THREAD# MAX(SEQUENCE#)
---------- --------------
1 259
2 141
On Primary:
SQL>
alter database commit to switchover to physical standby with session shutdown
nowait;
Database altered.
The above command will terminate all sessions and secondary nodes, commit
the transactions which have committed by users, and perform the switchover
operation.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 1603411968 bytes
Fixed Size 2253664
bytes
Variable Size 1056967840
bytes
Database Buffers 536870912
bytes
Redo Buffers 7319552
bytes
Database mounted.
SQL>
SQL> recover managed standby database disconnect from session;
Media recovery complete.
Primary
database Alert Logfile:
----------------------------------
Tue Sep 19 21:16:36 2017
alter database commit to switchover to physical standby with session
shutdown nowait
ALTER
DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY [Process Id: 21721] (dc1)
Switchover End-Of-Redo Log thread 1 sequence 260 has been fixed
Switchover End-Of-Redo Log thread 2 sequence 143 has been fixed
Switchover: Primary highest seen SCN set to 0x0.0x2c8764
ARCH: Noswitch archival of thread 1, sequence 260
ARCH: End-Of-Redo Branch archival of thread 1 sequence 260
Tue Sep 19 21:16:38 2017
Process (ospid 21609) is suspended due to switchover to physical standby
operation.
ARCH: LGWR is actively archiving destination LOG_ARCHIVE_DEST_2
ARCH: Standby redo logfile selected for thread 1 sequence 260 for destination
LOG_ARCHIVE_DEST_2
Archived Log entry 202 added for thread 1 sequence 260 ID 0x79c6d550 dest
1:
ARCH: Noswitch archival of thread 2, sequence 143
ARCH: End-Of-Redo Branch archival of thread 2 sequence 143
ARCH: LGWR is actively archiving destination LOG_ARCHIVE_DEST_2
ARCH: Standby redo logfile selected for thread 2 sequence 143 for
destination LOG_ARCHIVE_DEST_2
Archived Log entry 204 added for thread 2 sequence 143 ID 0x79c6d550 dest
1:
ARCH: Archiving is disabled due to current logfile archival
Primary will check for some target standby to have received alls redo
Final check for a synchronized target standby. Check will be made once.
LOG_ARCHIVE_DEST_2 is a potential Physical Standby switchover target
Active, synchronized target has been identified
Target has also received all redo
Backup
controlfile written to trace file
/u01/app/oracle/diag/rdbms/dc/dc1/trace/dc1_ora_21721.trc
Clearing standby activation ID 2043073872 (0x79c6d550)
The primary database controlfile was created using the
'MAXLOGFILES 192' clause.
There is space for up to 188 standby redo logfiles
Use the following SQL commands on the standby database to create
standby redo logfiles that match the primary database:
ALTER
DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800;
ALTER
DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800;
ALTER
DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800;
ALTER
DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 209715200;
ALTER
DATABASE ADD STANDBY LOGFILE 'srl5.f' SIZE 209715200;
ALTER
DATABASE ADD STANDBY LOGFILE 'srl6.f' SIZE 209715200;
Archivelog for thread 1 sequence 260 required for standby recovery
Archivelog for thread 2 sequence 143 required for standby recovery
Switchover:
Primary controlfile converted to standby controlfile succesfully.
Tue Sep 19 21:16:43 2017
Reconfiguration started (old inc 4, new inc 6)
List of instances:
1 (myinst: 1)
Global Resource Directory frozen
* dead instance detected - domain
0 invalid = TRUE
Communication channels
reestablished
Master broadcasted resource hash
value bitmaps
Non-local Process blocks cleaned
out
Tue Sep 19 21:16:43 2017
LMS 0: 0 GCS shadows cancelled, 0
closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue
requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Tue Sep 19 21:16:43 2017
Instance recovery: looking for dead threads
Process (ospid 21601) is suspended due to switchover to physical standby
operation.
Tue Sep 19 21:16:43 2017
Process (ospid 21599) is suspended due to switchover to physical standby
operation.
Switchover: Complete - Database shutdown required
USER (ospid: 21721): terminating the instance
Instance terminated by USER, pid = 21721
Completed:
alter database commit to switchover to physical standby with session shutdown
nowait
Shutting
down instance (abort)
License high water mark = 4
Tue Sep 19 21:16:44 2017
On Standby:
Initiate switchover on standby database.
SQL> select THREAD#,STATUS,ENABLED,SEQUENCE# from v$thread order by 1;
THREAD# STATUS ENABLED SEQUENCE#
---------- ------ -------- ----------
1 OPEN
PUBLIC 270
2 CLOSED PRIVATE 144
SQL> select
thread#,max(sequence#) from v$archived_log where applied='YES' group by
thread#;
THREAD# MAX(SEQUENCE#)
---------- --------------
1 259
2 141
SQL> recover managed standby database cancel;
Media recovery complete.
SQL>
alter database commit to switchover to primary with session shutdown nowait;
Database altered.
SQL> startup;
ORA-01081: cannot start already-running ORACLE - shut it down first
SQL> shut immediate;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SQL> startup;
ORACLE instance started.
Total System Global Area 1603411968 bytes
Fixed Size 2253664 bytes
Variable Size 989858976 bytes
Database Buffers 603979776 bytes
Redo Buffers 7319552 bytes
Database mounted.
Database opened.
Standby
Database Alert Logfile:
------------------------------------
Tue Sep 19 21:17:48 2017
MRP0: Background Media Recovery cancelled with status 16037
Errors in file
/u01/app/oracle/diag/rdbms/dr/dc/trace/dc_mrp0_6511.trc:
ORA-16037: user requested cancel of managed recovery operation
Recovery interrupted!
MRP0: Background Media Recovery process shutdown (dc)
Managed Standby Recovery Canceled (dc)
Completed: ALTER DATABASE RECOVER
managed standby database cancel
Tue Sep 19 21:17:52 2017
alter
database commit to switchover to primary with session shutdown nowait
ALTER DATABASE SWITCHOVER TO PRIMARY (dc)
No wait specified for role transition.
Backup controlfile written to trace file
/u01/app/oracle/diag/rdbms/dr/dc/trace/dc_ora_6318.trc
SwitchOver after complete recovery through change 2918244
Online log +DATA/dr/onlinelog/group_1.278.954856009: Thread 1 Group 1 was
previously cleared
Online log +DATA/dr/onlinelog/group_2.277.954856011: Thread 1 Group 2 was
previously cleared
Online log +DATA/dr/onlinelog/group_3.276.954856019: Thread 2 Group 3 was
previously cleared
Online log +DATA/dr/onlinelog/group_4.275.954856039: Thread 2 Group 4 was
previously cleared
Standby
became primary SCN: 2918242
Switchover: Complete - Database mounted as primary
Completed:
alter database commit to switchover to primary with session shutdown nowait
On 2nd
Node:
Once the switchover gets completed, execute below commands on 2nd
node of RAC:
SQL> select THREAD#,STATUS,ENABLED,SEQUENCE# from v$thread order by 1;
THREAD# STATUS ENABLED SEQUENCE#
---------- ------ -------- ----------
1 OPEN
PUBLIC 270
2 OPEN
PRIVATE 144
SQL> select thread#,max(sequence#) from v$archived_log group by
thread#;
THREAD# MAX(SEQUENCE#)
---------- --------------
1 259
2 142
SQL> SQL> /
select thread#,max(sequence#) from v$archived_log group by thread#
*
ERROR at line 1:
ORA-03135:
connection lost contact
Process ID: 10985
Session ID: 1 Serial number: 5
SQL> Disconnected from Oracle Database 11g Enterprise Edition Release
11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage
Management, Oracle Label Security,
OLAP, Data Mining, Oracle Database Vault and Real Application Testing
options
[oracle@rac2 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Tue Sep 19 21:41:56 2017
Copyright (c) 1982, 2013, Oracle.
All rights reserved.
Connected to an idle instance.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 1603411968 bytes
Fixed Size 2253664 bytes
Variable Size 1056967840
bytes
Database Buffers 536870912 bytes
Redo Buffers 7319552 bytes
Database mounted.
2nd Node
Alert Logfile:
-------------------------------
LNS: Standby redo logfile selected for thread 2 sequence 143 for
destination LOG_ARCHIVE_DEST_2
Tue Sep 19 21:16:37 2017
Process (ospid 10860) is suspended due to switchover to physical standby
operation.
Tue Sep 19 21:16:40 2017
Switchover in progress in another database instance - Database is
shutdown automatically
LGWR
(ospid: 10841): terminating the instance due to error 16456
Instance terminated by LGWR, pid = 10841
Tue Sep 19 21:42:00 2017
Step 8: Ask an Application Developer about the time till application will
be running on standby side. Once you get the confirmation from him, you can the follow below
steps to perform switchback operation.
Note: Now your old primary became
new standby and vice-versa.
Switchback Initiation:
Step 9: Please perform the prerequisites before initiating switchback operations:
- Check whether both current primary and current standby databases are in sync.
- Check whether you can access both current primary and current standby databases remotely via sqlplus connectivity.
- Check tnsping connectivity is working fine from both current primary and current standby database servers.
- Check Telnet connectivity is working fine.
- Check for any critical ORA- error in alert log file of both current Primary and current Standby.
- Check all Cluster Services are up and running fine.
- Check current primary and current standby destination status are valid and check for any error for destination if any.
- Switch 4-5 logs from current primary to current standby server to verify whether logs are getting applied on standby database successfully or not.
Step 10: Confirm with an Application Developer whether he has stopped
all Application Services connecting to the respective database. Also check any
remote connection to the database from non-oracle users.
Step 11: Inform an Application Developer and Client 15 minutes prior the
activity that we will start switchback activity in next 15 minutes.
Step 12: As per Oracle Recommendation, all primary and standby instances
should be down except one from primary and one from standby.
Step 13: Initiate switchback activity.
New Current Primary:
SQL> select
name,database_role,controlfile_type,open_mode from gv$database;
SQL> select thread#,max(sequence#) from v$archived_log group by
thread#;
New Current Standby:
SQL> select thread#,max(sequence#) from v$archived_log where
applied='YES' group by thread#;
SQL> select process,status,sequence# from v$managed_standby;
SQL> select THREAD#,STATUS,ENABLED,SEQUENCE# from v$thread order by 1;
On New Primary:
SQL>
alter database commit to switchover to physical standby with session shutdown
nowait;
Database altered.
The above command will terminate all sessions and secondary nodes, commit
the transactions which have committed by users, and perform the switchover
operation.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 1603411968 bytes
Fixed Size 2253664
bytes
Variable Size 1056967840
bytes
Database Buffers 536870912
bytes
Redo Buffers 7319552
bytes
Database mounted.
SQL>
SQL>
SQL> recover managed standby database disconnect from session;
Media recovery complete.
On New Standby:
SQL> select THREAD#,STATUS,ENABLED,SEQUENCE# from v$thread order by 1;
SQL> select
thread#,max(sequence#) from v$archived_log where applied='YES' group by
thread#;
SQL> recover managed standby database cancel;
Media recovery complete.
SQL>
alter database commit to switchover to primary with session shutdown nowait;
Database altered.
SQL> startup;
ORA-01081: cannot start already-running ORACLE - shut it down first
SQL> shut immediate;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SQL> startup;
ORACLE instance started.
Total System Global Area 1603411968 bytes
Fixed Size 2253664 bytes
Variable Size 989858976 bytes
Database Buffers 603979776 bytes
Redo Buffers 7319552 bytes
Database mounted.
Database opened.
Step 14: Check everything is running fine.
- Check whether both primary and standby databases are in sync.
- Check whether you can access both primary and standby databases remotely via sqlplus connectivity.
- Check tnsping connectivity from both primary and standby database servers.
- Check Telnet connectivity whether Listener PORT is working fine.
- See any critical ORA- error in alert log file.
- Check all Cluster Services are up and running fine.
- Check primary and standby destination status are valid.
- Switch 2-3 logs from both primary to standby server to verify whether logs are getting applied on standby database successfully.
·
Step 15: Inform an Application Team that switchback activity is
completed successfully and you can start all services at an Application level.
Thanks for reading this post ! Please comment if you like the post.
Do you have step by step to configure DG on RAC System ?
ReplyDeleteIf yes, pls post ..
Thanks
Yes, i have it. I will post it soon.
DeleteGreat Job Man!
DeleteSuper,..but 2nd node down steps missed, I think it should be done by srvctl or direct sqlplus
DeleteThank you sir for your comment. Please see the step no 5. It is there. Yes, you have to stop 2nd node on both primary and standby side either by srvctl stop instance or sqlplus. I have 2node RAC configuration with single instance DR and hence I have not mentioned 2nd node down cmd for DR.
DeleteThanks for your valuable feedback.
Nice creation Rupesh
ReplyDeleteNice blog rupesh..very helpful
ReplyDeleteVery helpful Sir
ReplyDeleteGreat article !
ReplyDeleteGreat
ReplyDeleteNice Rupesh sir
ReplyDeleteVery much helpful and in detailed one
ReplyDeleteVery Useful
ReplyDelete