Saturday, March 26, 2016

Steps involved in node addition in oracle RAC

At present:
A two-node RAC with nodes sa-alpha and sa-beta, hosting a database MYDB with instances MYDB1 and MYDB2.
Aim:
Add a new node sa-gamma to this two-node cluster and create instance MYDB3 on the new node.
Prerequisites:-
 
1. OS certification, compatibility and patchset levels should match the other nodes in the cluster.
2. All packages required by Oracle should be installed on the new node.
3. If you are using ASMLIB, the appropriate binaries should be installed on the new node.
4. OS limits should match the other nodes (/etc/security/limits.conf).
5. OS kernel parameters should be in line with the other nodes (/etc/sysctl.conf).
6. DNS configuration should match the other nodes -
    Check the contents of /etc/resolv.conf and compare with the other nodes.
    If you don't have a DNS for name resolution, make sure you have an identical /etc/hosts file across all nodes.
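If you rely on /etc/hosts, every node needs the same entries for the public, private and VIP addresses of all three nodes. A minimal sketch (all IP addresses below are made-up examples; substitute your own):

```
# Public
192.168.1.101   sa-alpha
192.168.1.102   sa-beta
192.168.1.103   sa-gamma
# Private (interconnect)
10.0.0.101      sa-alpha-priv
10.0.0.102      sa-beta-priv
10.0.0.103      sa-gamma-priv
# Virtual (VIP)
192.168.1.111   sa-alpha-vip
192.168.1.112   sa-beta-vip
192.168.1.113   sa-gamma-vip
```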
7. Configure password-less SSH connectivity for the grid user among all 3 nodes.
8. Time should be synchronized across all nodes in the cluster; if there is a problem, try
    restarting ntpd on each node. From Oracle 11g onwards we can use CTSS
    instead of relying on operating system NTP. So if your cluster is configured
    with CTSS, you need to deconfigure ntpd on the new node:
    #service ntpd stop
    #chkconfig ntpd off
    #mv /etc/ntp.conf /etc/ntp.conf.original
    #rm /var/run/ntpd.pid
9. Create job role separation operating system privilege groups, users, and directories.
    While creating the new users, make sure the UID and GID of oracle/grid are
    identical to those of the other RAC nodes,
    where oracle --> RDBMS software owner
    grid --> Grid Infrastructure owner
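As a quick sanity check after creating the accounts, the snippet below compares a local account's UID against the value reported on the existing nodes (the IDs 54321/54322 are only placeholders; use whatever `id oracle` and `id grid` print on sa-alpha):

```shell
# Compare a local account's UID with the value used on the other RAC nodes.
check_uid() {
    user=$1 expected=$2
    actual=$(id -u "$user" 2>/dev/null)
    if [ -z "$actual" ]; then
        echo "$user: no such user"
    elif [ "$actual" = "$expected" ]; then
        echo "$user: UID $actual matches"
    else
        echo "$user: UID $actual differs from expected $expected"
    fi
}

# Placeholder IDs -- replace with the real ones from the existing nodes.
check_uid oracle 54321
check_uid grid 54322
```

Run the same check for the group IDs with `id -g` before installing any software as these users.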
10. Configure network components:-
     While configuring network components, make sure your public and private NICs
     have the same names as on the other nodes in the cluster.
     For example, if eth0 is configured for the public network on the first and second nodes,
     you have to choose the same NIC for the public network on the new node.
     Note:- The public network is a routable network -- you have to give the default
     gateway address while configuring the NIC for the public network, but the private network should
     not be routable, that is, we don't configure a default gateway for the private network.
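To confirm the interface names on the new node line up with the existing ones, listing them from /sys is enough (a sketch; run the same command on sa-alpha and compare):

```shell
# List the network interface names known to the kernel on this node.
for nic in /sys/class/net/*; do
    basename "$nic"
done
```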
             
11. Configure access to the shared storage:-
       Either ask your system administrator to get this task done, or accomplish it yourself.
       If you are using iSCSI (NAS), after configuring the storage part do the following from the new node:-
                As root, execute the following commands -->
                 service iscsi restart
                 lsscsi - should list all available LUNs
                 /etc/init.d/oracleasm scandisks
                 /etc/init.d/oracleasm listdisks
                 - should list all ASM-labeled disks
 
Verify the new node using cluvfy
Use CVU as the Oracle Grid Infrastructure owner to determine the integrity of the cluster and whether it is ready for the new Oracle RAC node to be added.
From the GRID_HOME, as the grid owner, execute the following:
$cluvfy stage -pre nodeadd -n sa-gamma -verbose
You can ignore the PRVF-5449 message if you have already configured and verified that the ASM shared disks are visible on the third node. The error is a result of having the voting disks stored in Oracle ASM storage, a new feature of Oracle 11g Release 2.
If you hit any ignorable errors above, do the following before actually invoking the addNode.sh script:
export IGNORE_PREADDNODE_CHECKS=Y
Extend the GRID_HOME to the new node
$export IGNORE_PREADDNODE_CHECKS=Y
$cd $GRID_HOME/oui/bin
$ ./addNode.sh -silent "CLUSTER_NEW_NODES={sa-gamma}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={sa-gamma-vip}"
Basically it will copy and configure your GRID software onto the remote node and end up asking you to execute the following scripts as root --> orainstRoot.sh and root.sh,
where root.sh is the main script: it will create the OLR, update the OCR, and start all the clusterware daemons on the third node along with ASM.

#$GRID_HOME/root.sh
Running Oracle 11g root script...
The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /u01/app/11.2.0.2/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...  
Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/11.2.0.2/grid/crs/install/crsconfig_params
Creating trace directory
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful. OLR initialization - successful
Adding daemon to inittab
ACFS-9200: Supported
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9307: Installing requested ADVM/ACFS software.

ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9309: ADVM/ACFS installation correctness verified.  

CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node ayu2, number 2, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster

clscfg: EXISTING configuration version 5 detected.
clscfg: version 5 is 11g Release 2.
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded  
Here you can see that the new node is joining the cluster.

Post verification :-
$cluvfy stage -post nodeadd -n sa-gamma
------------------------------------------------------------------------------------------
Extend the RDBMS ORACLE_HOME to include the new node
From an existing node, as the database software owner, run the following command to extend the Oracle database software to the new node "sa-gamma".
As the oracle user:
$cd $ORACLE_HOME/oui/bin
$./addNode.sh -silent "CLUSTER_NEW_NODES={sa-gamma}"

Note: remember that this time addNode.sh is invoked from ORACLE_HOME/oui/bin.
It will copy and configure the RDBMS software on the remote node and end up asking you to execute the following script as root:
    
 #$ORACLE_HOME/root.sh
   
 Add New Instance to the Clustered Database--
Invoke DBCA from any running instance, select Instance Management, and then choose the following:
Welcome: Select Oracle Real Application Clusters (RAC) database as the database type that you would like to create or administer.
Operations: Select Instance Management.
Instance Management: Select Add an instance.
List of cluster databases: Enter the sys username and password.
List of cluster database instances: Verify the existing database instances.
Instance naming and node selection: Verify the instance name (MYDB3) to be added and the node (sa-gamma) on which to add the instance.
Instance Storage: Verify the storage configuration.
Summary: Verify the setup and click OK.
  Basically it will do the following tasks for you -
  -> add instance-specific parameters to the spfile
  -> add an undo tablespace for the third instance
  -> add redo log groups for thread 3
  -> update the metadata of the new instance in the OCR
 
If you want to do it manually, execute the following commands from any of the running nodes:
   
SQL> alter database add logfile thread 3 group 5 ('+DATADG','+FRADG') size 50M, group 6 ('+DATADG','+FRADG') size 50M; 
 
Database altered. 
SQL> alter system set thread=3 scope=spfile sid='MYDB3'; 
System altered. 
SQL> alter database enable public thread 3; 
 
Database altered. 
SQL> create undo tablespace undotbs3 datafile '+DATADG' size 200M autoextend on; 
 
Tablespace created. 
  SQL> alter system set undo_tablespace='undotbs3' scope=spfile sid='MYDB3'; 
 
System altered. 
 
SQL> alter system set instance_number=3 scope=spfile sid='MYDB3'; 
 
System altered. 
 
SQL> alter system set cluster_database_instances=3 scope=spfile sid='*'; 
 
System altered.    


Update Oracle Cluster Registry (OCR)
The OCR will be updated to account for a new instance – "MYDB3" – being added to the "MYDB" cluster database. Add the "MYDB3" instance to the "MYDB" database and verify:
#srvctl config database -d MYDB
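DBCA registers the new instance in the OCR for you; if instead you created the instance manually with the SQL above, register it yourself with srvctl before verifying (11gR2 syntax, run as the database software owner):

```
$srvctl add instance -d MYDB -i MYDB3 -n sa-gamma
$srvctl start instance -d MYDB -i MYDB3
```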
             
 Reference:-  https://www.youtube.com/watch?v=ZV_P5-qpLhs

Saturday, March 12, 2016

Things to know about Block media recovery in Oracle

Block Media Recovery
Oracle9i introduced the concept of block media recovery (BMR) to the recovery process.
You can use block media recovery to recover one or more corrupted data blocks within data files. If you're using Oracle Database 10g or Oracle9i Database, use the BLOCKRECOVER command to perform block media recovery. In 11g, instead of the BLOCKRECOVER command, we can simply use the RECOVER command.

To perform BMR, the database must be open or mounted in archivelog mode and must have a current, usable control file. The database must not be a standby database. You must use level 0 or full backups for the restore.
All of the required archived redo logs must be available for the recovery process. If you have enabled Flashback Database, then RMAN will search the flashback logs for uncorrupted versions of the required blocks (from 11g onwards). So if your blocks are available in the flashback logs, RMAN won't look for any backups; it recovers the corrupt blocks straight from the flashback logs, which is usually faster.

Identification of Corrupt Blocks
The V$DATABASE_BLOCK_CORRUPTION view displays blocks marked corrupt by database components such as RMAN, ANALYZE, dbv, and SQL queries. So any SQL or RMAN session that encounters a corrupted block will update V$DATABASE_BLOCK_CORRUPTION. The following types of corruption result in rows added to this view:

Physical corruption (sometimes called media corruption)
The database does not recognize the block: the checksum is invalid, the block contains all zeros, or the block header is corrupt. BMR can completely recover physical corruption that happened on a block.

Logical corruption
The block has a valid checksum, the header and footer match, and so on, but the contents are logically inconsistent. Block media recovery may not be able to repair all logical block corruptions. In these cases, alternate recovery methods, such as tablespace point-in-time recovery, or dropping and re-creating the affected objects, may repair the corruption.

If you are trying to take an RMAN database backup and your database has some corrupted blocks, then your backup will fail. RMAN stops at the first corrupt block, logging just that single one. To get the complete list of corrupt blocks, you should issue a backup validate command:

So to check the datafiles for any physical or logical corruption, and to verify whether a datafile is in the correct directory, use the backup validate command. By default, RMAN checks for physical corruption. By using the check logical syntax, we can check for logical corruption as well.

To validate the tablespace, use the following command:  

RMAN> backup validate tablespace USERS;

The above command does nothing but validate the USERS tablespace for a backup operation; if it finds any corrupt blocks, it will update v$database_block_corruption.

To validate a specific datafile, use the following command:

RMAN> backup validate datafile 8;

To check the whole database along with the archivelogs, use the following command:

RMAN> backup validate database archivelog all;

The above commands will look only for physical corruption. Add the CHECK LOGICAL clause to include checks for logical corruption.

RMAN> backup validate check logical database archivelog all;

Note:- From 11g onwards, instead of backup validate we can use the simpler VALIDATE command.
RMAN> validate datafile 8;

Any block corruptions are visible in the V$DATABASE_BLOCK_CORRUPTION view. You can identify the objects containing a corrupt block using a query like this:
COLUMN owner FORMAT A20
COLUMN segment_name FORMAT A30

SELECT DISTINCT owner, segment_name
FROM   v$database_block_corruption dbc
JOIN dba_extents e ON dbc.file# = e.file_id AND dbc.block# BETWEEN e.block_id and e.block_id+e.blocks-1
ORDER BY 1,2;

Recover specific data block
rman> recover datafile 8 block 13;
rman> recover datafile 8 block 13 datafile 10 block 12;

Recovering All Blocks in V$DATABASE_BLOCK_CORRUPTION
The RECOVER CORRUPTION LIST command will recover all the blocks that are marked corrupt.
rman> recover corruption list;
From the following output you can see that RMAN recovered the corrupt block from the flashback logs.
RMAN> blockrecover corruption list;
Starting recover at 16-NOV-10
using channel ORA_DISK_1
searching flashback logs for block images
finished flashback log search, restored 1 blocks
starting media recovery
media recovery complete, elapsed time: 00:00:01
Finished recover at 16-NOV-10
Automatic Block Corruption Repair (in 11g Active Data Guard)
In addition to the real-time query capability of the 11g Active Data Guard feature, we can also extend our high-availability capability by using the Automatic Block Media Repair (ABMR) feature, whereby data block corruptions on the primary database can be repaired by obtaining those blocks from the standby site – all performed by a background process (ABMR) transparent to the application.

Basic concept:-
At the first encounter of corruption, Oracle marks the block as media corrupt, removes it from the buffer cache, and writes the corrupted block to disk. No subsequent read of the block is successful until the block is recovered. You perform block media recovery with the RECOVER ... BLOCK command. By default, RMAN searches the flashback logs for good copies of the blocks, and then searches for the blocks in full or level 0 incremental backups. When RMAN finds good copies, it restores them and performs media recovery on the blocks.

Advantages:

  1. With BMR, you can restore individual data blocks from your RMAN backups and recover those blocks to the point of the failure. During the block recovery process, Oracle will continue to allow access to all blocks that are not corrupted. The only user impact will be on those users who want to access the corrupt blocks, and they would have been impacted anyway.
  2. Enables affected data files to remain online during recovery; that is, there is no need to take the affected tablespace/datafile offline.
  3. Lowers the mean time to recover (MTTR) because only blocks needing recovery are restored and recovered.
  4. Without block media recovery, if even a single block is corrupt, then you must take the data file offline and restore a backup of the data file. You must apply all redo generated for the data file after the backup was created. The entire file is unavailable until media recovery completes. With block media recovery, only the blocks actually being recovered are unavailable during the recovery.
  5. Block media recovery is most useful for physical corruption problems that involve a small, well-known number of blocks. It is not intended for cases where the extent of data loss or corruption is unknown and the entire data file requires recovery. In such cases, data file media recovery is the best solution.


Wednesday, March 2, 2016

How to check the disk utilization where ASMLIB is configured


Get the list of all physical disks that are mapped to ASM disks by using the following shell script:

[root@todbex01 mp]# cat asm_to_dik.sh
#!/bin/bash
for asmlibdisk in `ls /dev/oracleasm/disks/*`
  do
    echo "ASMLIB disk name: $asmlibdisk"
    asmdisk=`kfed read $asmlibdisk | grep dskname | tr -s ' '| cut -f2 -d' '`
    echo "ASM disk name: $asmdisk"
    majorminor=`ls -l $asmlibdisk | tr -s ' ' | cut -f5,6 -d' '`
    device=`ls -l /dev | tr -s ' ' | grep -w "$majorminor" | cut -f10 -d' '`
    echo "Device path: /dev/$device"
  done
[root@todbex01 mp]# sh asm_to_dik.sh
ASMLIB disk name: /dev/oracleasm/disks/COR01
ASM disk name: OCR_VOTE_0000
Device path: /dev/dm-40
ASMLIB disk name: /dev/oracleasm/disks/COR02
ASM disk name: OCR_VOTE_0001
Device path: /dev/dm-37
ASMLIB disk name: /dev/oracleasm/disks/COR03
ASM disk name: OCR_VOTE_0002
Device path: /dev/dm-34
ASMLIB disk name: /dev/oracleasm/disks/DATA01
ASM disk name: ORADATA_0000
Device path: /dev/dm-24

You can get the same result by using the oracleasm utility too:

[root@todbex01 mp]# /usr/sbin/oracleasm querydisk -v DATA01
Disk "DATA01" is a valid ASM disk
[root@todbex01 mp]# /usr/sbin/oracleasm querydisk -v -d DATA01
Disk "DATA01" is a valid ASM disk on device [252,24]
[root@todbex01 mp]# ls -l /dev | grep 252,|grep 24
brw-rw----  1 root root     252,  24 May 26  2015 dm-24
[root@todbex01 mp]#

Once you have found the physical disk associated with your diskgroup, you can make use of iostat to get the I/O statistics for that particular disk.

Some information about iostat:-

iostat - Report Central Processing Unit (CPU) statistics and input/output statistics for devices and partitions.

Commonly used options:-
-c     The -c option is exclusive of the -d option and displays only the CPU usage report.
-d     The -d option is exclusive of the -c option and displays only the device utilization report.
-k     Display statistics in kilobytes per second instead of blocks per second.
-m     Display statistics in megabytes per second instead of blocks per second.
-n     Display only the device and NFS statistics.
-x     Display extended statistics.

Eg,
       iostat -d 2
              Display a continuous device report at two-second intervals (until you press Ctrl-C).
       iostat -d 2 6
              Display six reports at two second intervals for all devices.

       iostat -x hda hdb 2 6
              Display six reports of extended statistics at two second intervals for devices hda and hdb.

       iostat -p sda 2 6
              Display six reports at two second intervals for device sda and all its partitions (sda1, etc.)

   
[root@todbex01 mp]#iostat -dkx /dev/dm-24
Linux 2.6.39-400.214.4.el5uek (todbex01)       03/02/2016

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
dm-24             0.00     0.00 174.69  9.00 16721.00    41.19   182.50     0.16    0.88   0.12   2.20

The main figure to consider here is the %util field.

%util: When this figure is consistently above 80%, you will need to take one of the following actions -

    increase RAM so dependence on disk reduces
    increase RAID controller cache so disk dependence decreases
    increase the number of disks so disk throughput increases (more spindles working in parallel)

[root@todbex01 mp]#

To get the input/output statistics for all disks at two-second intervals, use the following command:

[root@todbex01 mp]#iostat -dkx 2
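If you want to watch just the %util number, it is the last field of each device line in the -dkx report, so a small awk filter is enough (a sketch against the column layout shown above; field order can vary between sysstat versions):

```shell
# Extract the %util (last) column from an extended iostat device line.
# The sample line below is copied from the iostat -dkx output above.
sample="dm-24   0.00  0.00 174.69  9.00 16721.00  41.19  182.50  0.16  0.88  0.12  2.20"
util=$(echo "$sample" | awk '{print $NF}')
echo "dm-24 %util: $util"

# Live usage against a running system would look like:
#   iostat -dkx 2 | awk '/^dm-24/ {print $NF}'
```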