Tuesday, August 30, 2016

Restoring After Losing One Member of the Multiplexed redo log Group - in a OMF and NON-OMF system


Restoring After Losing One Member of the Multiplexed Group

If you are seeing an ORA-00312 in your alert log it should be because of one of the member of the redo log group is experiencing media failure. You should see lines similar to these near the bottom of your alert.log file:


Mon Aug 29 18:28:02 2016
Errors in file /u01/app/oracle/diag/rdbms/ttrec/TTREC1/trace/TTREC1_arc1_1752.trc:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: '/data2/TTREC/onlinelog/o1_mf_2_cvf5qv5t_.log'
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3


If your online redo log file members are multiplexed, the log writer(lgwr) will continue to function as long as it can successfully write to one member of the current log group. If the problem is temporary, then as soon as the online redo log file becomes available, the log writer will start to write to the online redo log file as if there was never an issue.

If the media failure is permanent then you’ll need to replace the disk and drop and re-create the bad member to its original location. If you don’t have the option of
replacing the bad disk, then you’ll need to drop the bad member and re-create it in an alternate location.

For permanent media failures, here are the instructions for dropping and re-creating one member of an online redo log group:

1. Identify the online redo log file experiencing media failure.
2. Ensure that the online redo log file is not part of the current online log group.
3. Drop the damaged member.
4. Add a new member to the group.

Here I am going to delete one of the multiplexed redo log file and later I will recreate it in the same location.

SQL>select member, a.group#, a.status, b.status from v$log a, v$logfile b where  a.group# = b.group#  order by a.group#, member;

MEMBER                                                                     GROUP# STATUS           STATUS
---------------------------------------------------------------------- ---------- ---------------- -------
/data1/TTREC/onlinelog/o1_mf_1_cw7vjldp_.log                                    1 ACTIVE
/data2/TTREC/onlinelog/o1_mf_1_cw7vjljy_.log                                    1 ACTIVE
/data1/TTREC/onlinelog/o1_mf_2_cvf5qv2z_.log                                    2 CURRENT
/data2/TTREC/onlinelog/o1_mf_2_cvf5qv5t_.log                                    2 CURRENT
/data1/TTREC/onlinelog/o1_mf_3_cw7ogdtj_.log                                    3 ACTIVE
/data2/TTREC/onlinelog/o1_mf_3_cw7ogdyv_.log                                    3 ACTIVE

6 rows selected.

SQL>

I deleted one of the current redo log member.

[oracle@ehrsvpitoel02]:[TTREC1] $ rm /data2/TTREC/onlinelog/o1_mf_2_cvf5qv5t_.log

I did a log switch

SQL> alter system switch logfile;

System altered.

SQL> /

System altered.

SQL> exit

I could see following error message in my alert log file.

Mon Aug 29 18:28:02 2016
Errors in file /u01/app/oracle/diag/rdbms/ttrec/TTREC1/trace/TTREC1_arc1_1752.trc:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: '/data2/TTREC/onlinelog/o1_mf_2_cvf5qv5t_.log'
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

I just checked the status of my redo log file -

[oracle@ehrsvpitoel02]:[TTREC1] $ sqlplus / as sysdba
SQL*Plus: Release 12.1.0.2.0 Production on Mon Aug 29 18:28:17 2016
Copyright (c) 1982, 2014, Oracle.  All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Advanced Analytics
and Real Application Testing options

SQL> @redo

MEMBER                                                                     GROUP# STATUS           STATUS
---------------------------------------------------------------------- ---------- ---------------- -------
/data1/TTREC/onlinelog/o1_mf_1_cw7vjldp_.log                                    1 CURRENT
/data2/TTREC/onlinelog/o1_mf_1_cw7vjljy_.log                                    1 CURRENT
/data1/TTREC/onlinelog/o1_mf_2_cvf5qv2z_.log                                    2 ACTIVE
/data2/TTREC/onlinelog/o1_mf_2_cvf5qv5t_.log                                    2 ACTIVE
/data1/TTREC/onlinelog/o1_mf_3_cw7ogdtj_.log                                    3 ACTIVE
/data2/TTREC/onlinelog/o1_mf_3_cw7ogdyv_.log                                    3 ACTIVE

6 rows selected.

Still the  views is  not giving any information regarding lost member(I should expect a INVALID status on v$logfile.status column. So I tried a database bounce as I want to see whether the column is reflecting properly after a bounce – this is not really required though.

SQL> shut immediate ;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup;
ORACLE instance started.
Total System Global Area 2147483648 bytes
Fixed Size                  3712904 bytes
Variable Size            1996490872 bytes
Database Buffers          134217728 bytes
Redo Buffers               13062144 bytes
Database mounted.
Database opened.
SQL> @redo

MEMBER                                                                     GROUP# STATUS           STATUS
---------------------------------------------------------------------- ---------- ---------------- -------
/data1/TTREC/onlinelog/o1_mf_1_cw7vjldp_.log                                    1 CURRENT
/data2/TTREC/onlinelog/o1_mf_1_cw7vjljy_.log                                    1 CURRENT
/data1/TTREC/onlinelog/o1_mf_2_cvf5qv2z_.log                                    2 INACTIVE
/data2/TTREC/onlinelog/o1_mf_2_cvf5qv5t_.log                                    2 INACTIVE         INVALID
/data1/TTREC/onlinelog/o1_mf_3_cw7ogdtj_.log                                    3 INACTIVE
/data2/TTREC/onlinelog/o1_mf_3_cw7ogdyv_.log                                    3 INACTIVE

6 rows selected.

SQL>

After restart, its clear that database is complaining about missed redo log file – I am not sure why this view is not populating at run time. Now I will drop and recreate the INVALID redo log file.

SQL> alter database drop logfile member '/data2/TTREC/onlinelog/o1_mf_2_cvf5qv5t_.log';

Database altered.

Note:- You can drop and add online redo log file members while your database is in either a mounted state or an open state. If you attempt to drop a member of a current log group, Oracle will throw an ORA-01609 error specifying that the log is current and you cannot drop one of its members. If the failed member is in the current log group, then use the alter system switch logfile command to make the next group the current group.

SQL> @redo

MEMBER                                                                     GROUP# STATUS           STATUS
---------------------------------------------------------------------- ---------- ---------------- -------
/data1/TTREC/onlinelog/o1_mf_1_cw7vjldp_.log                                    1 CURRENT
/data2/TTREC/onlinelog/o1_mf_1_cw7vjljy_.log                                    1 CURRENT
/data1/TTREC/onlinelog/o1_mf_2_cvf5qv2z_.log                                    2 INACTIVE
/data1/TTREC/onlinelog/o1_mf_3_cw7ogdtj_.log                                    3 INACTIVE
/data2/TTREC/onlinelog/o1_mf_3_cw7ogdyv_.log                                    3 INACTIVE



SQL> alter database add logfile member  '/data2/TTREC/onlinelog/o1_mf_2_cvf5qv5t_.log' to group 2;
alter database add logfile member  '/data2/TTREC/onlinelog/o1_mf_2_cvf5qv5t_.log' to group 2
*
ERROR at line 1:
ORA-01276: Cannot add file /data2/TTREC/onlinelog/o1_mf_2_cvf5qv5t_.log.  File has an Oracle Managed Files file name.

SQL>



SO if your redo logs are configured using OMF,  you have to drop the entire redo log group and then recreate it .

SQL> ALTER DATABASE DROP LOGFILE GROUP 2;

Database altered.

SQL>  ALTER DATABASE add LOGFILE GROUP 2;

Database altered.

SQL> @redo

MEMBER                                                                     GROUP# STATUS           STATUS
---------------------------------------------------------------------- ---------- ---------------- -------
/data1/TTREC/onlinelog/o1_mf_1_cw7vjldp_.log                                    1 CURRENT
/data2/TTREC/onlinelog/o1_mf_1_cw7vjljy_.log                                    1 CURRENT
/data1/TTREC/onlinelog/o1_mf_2_cw7yyz4t_.log                                    2 UNUSED
/data2/TTREC/onlinelog/o1_mf_2_cw7yyz8w_.log                                    2 UNUSED
/data1/TTREC/onlinelog/o1_mf_3_cw7ogdtj_.log                                    3 INACTIVE
/data2/TTREC/onlinelog/o1_mf_3_cw7ogdyv_.log                                    3 INACTIVE

6 rows selected.

Once the newly created member becomes part of the CURRENT log group, its status should change to NULL. A NULL member status indicates that the database is using the online redo log file.

SQL> alter system switch logfile ;

System altered.

SQL>
SQL> /

System altered.

SQL>

SQL> @redo

MEMBER                                                                     GROUP# STATUS           STATUS
---------------------------------------------------------------------- ---------- ---------------- -------
/data1/TTREC/onlinelog/o1_mf_1_cw7vjldp_.log                                    1 ACTIVE
/data2/TTREC/onlinelog/o1_mf_1_cw7vjljy_.log                                    1 ACTIVE
/data1/TTREC/onlinelog/o1_mf_2_cw7yyz4t_.log                                    2 ACTIVE
/data2/TTREC/onlinelog/o1_mf_2_cw7yyz8w_.log                                    2 ACTIVE
/data1/TTREC/onlinelog/o1_mf_3_cw7ogdtj_.log                                    3 CURRENT
/data2/TTREC/onlinelog/o1_mf_3_cw7ogdyv_.log                                    3 CURRENT

6 rows selected.

SQL>

Recovering After Losing One Member of the Multiplexed Group (in non-omf)

Search for ORA-00312 in your database alert log file that identifies which member of the log group is experiencing media failure.

Once you’ve identified the bad online redo log file, execute the following query to check whether that online redo log file’s group has a CURRENT status:

SQL> select  a.group#, a.thread#,  a.status grp_status, b.member member, b.status mem_status from v$log a, v$logfile b where a.group# = b.group# order by a.group#, b.member;

If the failed member is in the current log group, then use the alter system switch
logfile command to make the next group the current group. Then drop the failed member as follows:
SQL> alter database drop logfile member '<\directory\member>';

Then re-create the online redo log file member:
SQL> alter database add logfile member '<\new directory\member>' to group <group#>;

If an unused log file already happens to exist in the target location, you can use the reuse parameter to overwrite and reuse that log file. The log file must be the same size as the other log files in the group.
SQL> alter database add logfile member '\directory\member>' reuse to group <group#>;


No comments:

Post a Comment