Saturday, February 17, 2018

Impact of doing a flashback on data guard environment


In very rare case, We may end up in situation like some logical mistake has been happened to our production database and as a last resort to recover the data we may need to do a flashback operation at primary database. When we do a flashback operation on primary database subsequently redo apply will get stopped out on all standby databases.In this demonstration I will show how to bring our standby back in sync with our production database. 

Here SRCDB is my primary database and TRGDB is my standby database. Flashback should be enabled on both primary and standby database.

C:\Users\Mahi>set ORACLE_SID=SRCDB
C:\Users\Mahi>sqlplus / as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Fri Feb 16 19:38:05 2018
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL>select name ,open_mode, database_role,flashback_on from v$database;

NAME      OPEN_MODE            DATABASE_ROLE    FLASHBACK_ON
--------- -------------------- ---------------- ------------------
SRCDB     READ WRITE           PRIMARY          YES


SQL> select systimestamp from dual;

SYSTIMESTAMP
---------------------------------------------------------------------------
16-FEB-18 07.38.37.868000 PM +05:30


SQL> select count(*) from hr.JOBS;

  COUNT(*)
----------
        19

C:\Users\Mahi>sqlplus sys@trgdb as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Fri Feb 16 19:41:04 2018
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Enter password:
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select name ,open_mode, database_role,flashback_on from v$database;

NAME      OPEN_MODE            DATABASE_ROLE    FLASHBACK_ON
--------- -------------------- ---------------- ------------------
SRCDB     READ ONLY WITH APPLY PHYSICAL STANDBY YES


SQL> select count(*) from hr.JOBS;

  COUNT(*)
----------
        19
SQL>

Let's do some logical mistake-

C:\Users\Mahi>sqlplus sys@srcdb as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Fri Feb 16 19:45:20 2018
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Enter password:
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> drop user hr cascade ;

User dropped.

SQL>

As I have configured "Real Time Apply" the logical mistake that I have done on my primary database should have applied to my standby database as well.I will verify that on my standby.

C:\Users\Mahi>sqlplus sys@trgdb as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Fri Feb 16 19:46:18 2018
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Enter password:
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select count(*) from hr.JOBS;
select count(*) from hr.JOBS
                        *
ERROR at line 1:
ORA-00942: table or view does not exist

SQL> select process,status,sequence#,block#,blocks from v$managed_standby;

PROCESS   STATUS        SEQUENCE#     BLOCK#     BLOCKS
--------- ------------ ---------- ---------- ----------
ARCH      CONNECTED             0          0          0
ARCH      CONNECTED             0          0          0
ARCH      CLOSING              36       4096       1522
ARCH      CONNECTED             0          0          0
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0
MRP0      APPLYING_LOG         37       3982     102400
RFS       IDLE                 37       3982          1

9 rows selected.

SQL>

Now shutdown the primary database and do the flashback operation to recover the dropped schema. 
C:\Users\Mahi>sqlplus sys@srcdb as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Fri Feb 16 20:31:23 2018
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Enter password:
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
SQL> shut immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>

C:\Users\Mahi>set ORACLE_SID=SRCDB
C:\Users\Mahi>sqlplus / as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Fri Feb 16 20:38:20 2018
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup mount;
ORACLE instance started.
Total System Global Area  534462464 bytes
Fixed Size                  2404704 bytes
Variable Size             440405664 bytes
Database Buffers           83886080 bytes
Redo Buffers                7766016 bytes
Database mounted.
SQL>
SQL> FLASHBACK DATABASE TO TIMESTAMP to_timestamp('16-02-18 19:38:37', 'DD-MM-YY HH24:MI:SS');

Flashback complete.

SQL> alter database open resetlogs;

Database altered.

SQL>
SQL> select count(*) from hr.JOBS;

  COUNT(*)
----------
        19

SQL>

Once you do the flashback operation on the primary database, subsequently all standby databases will no longer able to apply the redo because my primary database has gone through a new incarnation.I Verified MRP process is no longer running in standby.

C:\Users\Mahi>sqlplus sys@trgdb as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Fri Feb 16 20:48:51 2018
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Enter password:
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL>  select process,status,sequence#,block#,blocks from v$managed_standby;

PROCESS   STATUS        SEQUENCE#     BLOCK#     BLOCKS
--------- ------------ ---------- ---------- ----------
ARCH      CONNECTED             0          0          0
ARCH      CONNECTED             0          0          0
ARCH      CLOSING               1          1          1
ARCH      CLOSING              37      10240       1726
RFS       IDLE                  2       1218          1
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0

8 rows selected.

SQL>

Output from standby alertlog file-

RFS[6]: New Archival REDO Branch(resetlogs_id): 968273102  Prior: 967810818
RFS[6]: Archival Activation ID: 0x23e32516 Current: 0x23dcca7c
RFS[6]: Effect of primary database OPEN RESETLOGS
RFS[6]: Managed Standby Recovery process is active
2018-02-16 20:46:25.742000 +05:30
Setting recovery target incarnation to 3
Archived Log entry 27 added for thread 1 sequence 37 ID 0x23dcca7c dest 1:
2018-02-16 20:46:27.820000 +05:30
Archived Log entry 28 added for thread 1 sequence 1 ID 0x23e32516 dest 1:
Media Recovery Waiting for thread 1 sequence 38
MRP0: Incarnation has changed! Retry recovery...
Errors in file E:\APP\MAHI\diag\rdbms\trgdb\trgdb\trace\trgdb_pr00_7600.trc:
ORA-19906: recovery target incarnation changed during recovery
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
2018-02-16 20:46:30.078000 +05:30
 Started logmerger process
Managed Standby Recovery starting Real Time Apply
Warning: Recovery target destination is in a sibling branch
of the controlfile checkpoint. Recovery will only recover
changes to datafiles.
Datafile 1 (ckpscn 2718127) is orphaned on incarnation#=2
MRP0: Detected orphaned datafiles!
2018-02-16 20:46:31.113000 +05:30
Recovery will possibly be retried after flashback...
Errors in file E:\APP\MAHI\diag\rdbms\trgdb\trgdb\trace\trgdb_pr00_7416.trc:
ORA-19909: datafile 1 belongs to an orphan incarnation
ORA-01110: data file 1: 'E:\ORADATA\TRGDB\DATAFILE\O1_MF_SYSTEM_F80M3JN4_.DBF'
Managed Standby Recovery not using Real Time Apply
2018-02-16 20:46:32.375000 +05:30
Recovery Slave PR00 previously exited with exception 19909

Now flashback the standby database to the present time of the primary database. 

C:\Users\Mahi>sqlplus sys@srcdb as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Fri Feb 16 20:59:52 2018
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Enter password:
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select resetlogs_change# from v$database;

RESETLOGS_CHANGE#
-----------------
          2709581

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

C:\Users\Mahi>sqlplus sys@trgdb as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Fri Feb 16 21:00:22 2018
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Enter password:
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL>  flashback database to scn 2709581;

Flashback complete.

SQL>

Note:- It is better to use a SCN substracted by 1 or 2 from the resetlogs_change# that we noted from the primary database.So we can
make sure that we get the Standby close before the present time of the Primary.

Output from standby alertlog file -

flashback database to scn 2709581
Stopping Emon pool
2018-02-16 21:01:16.646000 +05:30
Waiting for shared server 'S000' to die
All dispatchers and shared servers shutdown
2018-02-16 21:01:17.740000 +05:30
CLOSE: killing server sessions.
2018-02-16 21:01:34.966000 +05:30
Active process 2644 user 'SYSTEM' program 'ORACLE.EXE (SHAD)'
Active process 2644 user 'SYSTEM' program 'ORACLE.EXE (SHAD)'
Active process 2644 user 'SYSTEM' program 'ORACLE.EXE (SHAD)'
Active process 2644 user 'SYSTEM' program 'ORACLE.EXE (SHAD)'
Active process 2644 user 'SYSTEM' program 'ORACLE.EXE (SHAD)'
Active process 2644 user 'SYSTEM' program 'ORACLE.EXE (SHAD)'
Active process 2644 user 'SYSTEM' program 'ORACLE.EXE (SHAD)'
Active process 2644 user 'SYSTEM' program 'ORACLE.EXE (SHAD)'
Active process 2644 user 'SYSTEM' program 'ORACLE.EXE (SHAD)'
2018-02-16 21:01:36.006000 +05:30
Active process 2644 user 'SYSTEM' program 'ORACLE.EXE (SHAD)'
Active process 2644 user 'SYSTEM' program 'ORACLE.EXE (SHAD)'
CLOSE: all sessions shutdown successfully.
Stopping Emon pool
ARC1: Waiting for instance close to complete
SMON: disabling cache recovery
2018-02-16 21:01:39.226000 +05:30
ARC1: Wait for instance close completed
2018-02-16 21:01:41.779000 +05:30
Flashback Restore Start
2018-02-16 21:01:46.660000 +05:30
Flashback Restore Complete
Flashback Media Recovery Start
2018-02-16 21:01:47.718000 +05:30
 Started logmerger process
2018-02-16 21:01:48.983000 +05:30
Parallel Media Recovery started with 4 slaves
Media Recovery start incarnation depth : 1, target inc# : 3, irscn : 2709580
2018-02-16 21:01:50.465000 +05:30
Flashback Media Recovery Log E:\ARCHIVE\TRGDB\TRGDB_1_36_967810818.ARC
2018-02-16 21:01:52.447000 +05:30
Flashback Media Recovery Log E:\ARCHIVE\TRGDB\TRGDB_1_37_967810818.ARC
2018-02-16 21:01:54.697000 +05:30
Flashback Media Recovery Log E:\ARCHIVE\TRGDB\TRGDB_1_1_968273102.ARC
Incomplete Recovery applied until change 2709582 time 02/16/2018 20:45:28
Flashback Media Recovery Complete
2018-02-16 21:01:58.676000 +05:30
Completed:  flashback database to scn 2709581

Now we need to start managed recovery process-

SQL> recover managed standby database disconnect;
Media recovery complete.
SQL> select process,status,sequence#,block#,blocks from v$managed_standby;

PROCESS   STATUS        SEQUENCE#     BLOCK#     BLOCKS
--------- ------------ ---------- ---------- ----------
ARCH      CONNECTED             0          0          0
ARCH      CONNECTED             0          0          0
ARCH      CLOSING               2       6144         47
ARCH      CLOSING              37      10240       1726
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0
MRP0      WAIT_FOR_LOG          3          0          0
RFS       IDLE                  0          0          0

8 rows selected.

SQL> select count(*) from hr.JOBS;
select count(*) from hr.JOBS
                        *
ERROR at line 1:
ORA-01219: database or pluggable database not open: queries allowed on fixed
tables or views only

SQL> recover managed standby database cancel ;
Media recovery complete.
SQL> alter database open read only;

Database altered.

SQL> recover managed standby database disconnect;
Media recovery complete.
SQL> select process,status,sequence#,block#,blocks from v$managed_standby;

PROCESS   STATUS        SEQUENCE#     BLOCK#     BLOCKS
--------- ------------ ---------- ---------- ----------
ARCH      CLOSING               3          1        139
ARCH      CONNECTED             0          0          0
ARCH      CLOSING               4          1          5
ARCH      CLOSING              37      10240       1726
RFS       IDLE                  0          0          0
RFS       IDLE                  0          0          0
MRP0      APPLYING_LOG          5       1379     102400
RFS       IDLE                  5       1379          1
RFS       IDLE                  0          0          0

9 rows selected.

SQL> select count(*) from hr.JOBS;

  COUNT(*)
----------
        19
SQL>

My tables is back in stanby and is in sync with primary database too.

Ref:- https://uhesse.com/2010/08/06/using-flashback-in-a-data-guard-environment/

Thursday, February 15, 2018

Why we need standby redo log file?


If you configure your standby for Maximum Protection, then Standby Redo Logs are required. Most implementations are configured for Maximum Performance because they do not want the performance hit Max Protect may impart on their application. Even if you are using Max Performance, you still want to implement SRLs. To understand why, we first need to start by examining how redo transport works when SRLs do not exist.  We’ll start by exploring the diagram below. 

Redo_Transport_WO_SRLs.jpg



In the system above, SRLs are not configured on the standby database. The arrows show how redo transport flows through the system. Redo travels along this path:

  1. A transaction writes redo records into the Log Buffer in the System Global Area (SGA).
  2. The Log Writer process (LGWR) writes redo records from the Log Buffer to the Online Redo Logs (ORLs).
  3. When the ORL switches to the next log sequence (normally when the ORL fills up), the Archiver process (ARC0) will copy the ORL to the Archived Redo Log.
  4. Because a standby database exists, a second Archiver process (ARC1) will read from a completed Archived Redo Log and transmit the redo over the network to the Remote File Server (RFS) process running for the standby instance.
  5. RFS sends the redo stream to the local Archiver process (ARCn).
  6. ARCn then writes the redo to the archived redo log location on the standby server.
  7. Once the archived redo log is completed, the Managed Recovery Process (MRP0) sends the redo to the standby instance for applying the transaction.

The picture is more complex when we have Standby Redo Logs in place, as can be seen below. 

Redo_Transport_With_SRLs.jpg



With SRLs, not only do we have more items in the picture, we also have different choices, i.e. different paths to get from the primary to the standby. The first choice is to decide if we are configured for Max Protect or Max Performance as I will discuss its impact below.

  1. Just like without SRLs, a transaction generates redo in the Log Buffer in the SGA.
  2. The LGWR process writes the redo to the ORL.
  3. Are we in Max Protect/Max availability or Max Performance mode?
    1. If Max Protect/availability, then we are performing SYNC redo transport. The Network Server SYNC process (NSSn) is a slave process to LGWR. It ships redo to the RFS process on the standby server.
    2. If Max Performance mode, then we are performing ASYNC redo transport. The Network Server ASYNC process (NSAn) reads from the ORL and transports the redo to the RFS process on the standby server.
  4. RFS on the standby server simply writes the redo stream directly to the SRLs.
  5. How the redo gets applied depends if we are using Real Time Apply or not.
    1. If we are using Real Time Apply, MRP0 will read directly from the SRLs and apply the redo to the standby database.
    2. If we are not using Real Time Apply, MRP0 will wait for the SRL’s contents to be archived and then once archived and once the defined delay has elapsed, MRP0 will apply the redo to the standby database.

Step 3 above is the entire reason we want to use Standby Redo Logs. If we are in Max Protect (SYNC) mode, then SRLs are required otherwise this process will not work. If we are in Max Performance mode, will still want SRLs. Why? We want SRLs to be configured, even in Max Performance mode because they reduce data loss to seconds, rather than minutes or hours. Max Performance mode with SRLs often achieves a near-zero data loss solution. Take a minute to go back and re-read the underlined portions again. The underlined passage above is why you want to configure SRLs if you are in Max Performance mode.  The other big benefit to SRLs is when Real Time Apply is being performed. As soon as the redo is in the SRL, it can be replayed on the standby database. We do not have to wait for a log switch to occur. Real Time Apply, only possible with SRLs, means the recovery time to open the stnadby database in a failover operation is as low as it can be.

It is often find that people are operating under the misconception that if you configure for ASYNC, configure for Max Performance, then only ARCn can transport redo from the primary to the standby. This used to be true in much older versions, but in 10g (maybe 9i), ARCn is only used to transport redo only if SRLs are not configured.Also sometimes if the communication between primary and standby are interrupted(standby down for a while)ARCH process will send the archived redo logs to RFS process directly to resolve the archive log gap. If SRLs are configured, then for ASYNC, NSAn is used to transport redo. Furthermore, NSAn does this in near real time. 

Note: The NSSn and NSAn processes are new to Oracle 12c. In prior versions, a singular process, LNS performed this job.

Without SRLs, we must wait for a log switch to occur on the primary before the redo can be transported. If it takes one hour for the log switch to occur, then we can have one hour’s worth of data loss. If it takes six hours for that log switch to occur, then I can have six hour’s worth of data loss. This behavior was improved by implementing the ARCHIVE_LAG_TARGET initialization parameter in their primary configuration. If the DBA set this parameter to 3600 seconds, then a log switch would occur at most once per hour. Even with this parameter, one hour of data loss may seem like a lot to most companies, especially when you do better. 

Soource:- https://community.oracle.com/docs/DOC-1007036