Monday, May 15, 2017

Why supplemental logging is required for golden gate ?


There are two level of supplemental logging mentioned in oracle documentation

1. Database level supplemental logging
2. Table level supplemental logging

Why supplemental logging is required for golden gate?

One of the main reason to enable supplemental logging is to generate logs 
for update statement. As insert and delete involves change to whole row (either addition or deletion of complete row) and all columns change in it where as update can happen on a column of a row. Supplemental logging make sure that enough information is captured during this update that can be used by any method based on logminer technology.

Other reasons could be to capture chained rows etc.

MINIMUM LEVEL OF SUPPLEMENTAL LOGGING

Minimum level of supplemental logging at database level is required for golden and can be achieved by

ALTER DATABASE ADD SUPPLEMENTAL LOG DATA

By default, 
Oracle only logs changed columns for update operations. Normally,this means that primary key columns are not logged during an update operation.However, Replicat requires the primary key columns in order to apply the update on the target system. The ADD TRANDATA command in GGSCI is used to cause Oracle to log primary key columns for all updates

For Golden gate replication minimum level of supplemental logging require is at primary key level to uniquely identify each row for replication purpose.

Following Golden gate command

ADD TRANDATA scott.DEPT 

Actually enable supplemental logging at table level on primary key and running this command in the background
SQL ALTER TABLE "SCOTT"."DEPT" ADD SUPPLEMENTAL LOG GROUP "GGS_DEPT_1668166" ("DEPT_ID") ALWAYS 

ADD TRANDATA < table name > automatically adds supplemental logging for the table using the table's primary key or if that's missing, using any unique key constraints defined for that table.
* minimum level of supplemental logging at database level is required before enabling supplemental logging at table level

* Golden gate require minimum primary/unique key supplemental logging which is expensive if enabled at database level when only one schema or few tables are configured for replication. 

Hence basic supplemental logging enable at database level and specific primary/unique key level on table level via Golden gate.

If you enable only table level supplemental logging without database level then oracle will not capture all changes.

What will happen if table does'nt have any primary/unique key constraint?

If the table does not have either a primary key or a unique key constraint defined ( there are applications out there that control uniqueness entirely within the application logic and the tables are not constrained in any way at the database level) , if this is the case then ADD TRANDATA will automatically add supplemental logging using every column in the table.

This is not nice as it will increase the size of the redo logs and the trail files unnecessarily.  More importantly if you are applying a change on the target database using all the columns as the key then the DML ( be it an update or a delete operation) will not be efficient and will cause your REPLICAT to lag behind.

So the idea is to find out from the application owner what the "theoretical" key for that table is.  Once you know what columns are used by the application to define uniqueness, then you can define those keys in the parameter files using 
"KEYCOLS".

As for ADD TRANDATA, then you can use the following syntax to ensure only those columns are used to add supplemental logging.

ADD TRANDATA <table name>, COLS (list of column names), NOKEY

The important bit here is the "NOKEY" , usually if you don't use "NOKEY" and you try using only "COLS" then you will get an error saying that the columns you define in "COLS" are already used because by default ADD TRANDATA adds supplemental logging for all the columns if there is no primary key defined as stated earlier.

One more thing worth noting here, you want to make sure the target database has indexes defined on the columns in question.