Reference: |
Change Data Capture loads data targets using only the rows that have changed since the last load. This approach is useful when dealing with large databases that would take too much time and resources to reload completely.
The source table used in this tutorial is an inventory table called dminv that is created by running Create Sample Procedures and Data. The source table is in DB2 or UDB.
The tutorial has the following steps:
The first step is to create a data flow that creates the target table and loads it with data from the current contents of the source table.
The user ID to run flows using CDC should have the role of Administrator on the DataMigrator server.
Note: Key columns are required for loading a table to uniquely identify each row so that Insert/Update/Delete processing can be performed.
A synonym for a log table allows reading from the database log records for a selected table. The synonym contains the same columns as the table itself plus three additional columns that identify changes to the table.
This synonym can be used as a source in a data flow with the special load type of IUD Processing. IUD is an abbreviation for Insert/Update/Delete.
Note: A synonym for a log table is not a substitute for a synonym for the table itself. A synonym is still needed for the table to be able to read or write from the table directly.
To create a synonym for a log table, in the DMC:
The Select Adapter window opens.
The Select Synonym Candidates window opens.
Optionally, check Filter by owner/schema check box and for object name enter dminv.
Under the Default synonym name, click DMINV and change the name to DMINV_LOG, as seen in the following image.
Tip: Another way to change these configuration settings to the synonym is as follows. Leave the variables for the parameters values as: &&CDC_START, &&CDC_CHKPT_SAVE, and &CDC_CHKPT_FILE. Then, expand the Variables folder, right-click a variable, and click Properties. Enter the DEFAULT value for the variable.
The window now looks like this:
In this example, we will create a checkpoint file that will be used by a CDC flow. The checkpoint file should be created on a z/OS platform in advance, prior to CDC flow runs.
-SET &&CDC_POLLING=1; -SET &&CDC_TIMEOUT=60; -SET &&CDC_START=CUR_TRAN; -SET &&CDC_CHKPT_SAVE=YES; -SET &&CDC_MAXLUWS=0; -SET &&CDC_COMMIT_MODE=OFF; -SET &&CDC_CHKPT_FILE='baseapp/mycheck1.chp' ; TABLE FILE LOG_MTB7 PRINT CDC_OPER FIELD2 FIELD3 END -RUN
SQL DB2 UPDATE MYETL.MYTABLE SET FIELD3 = FIELD3 +1 WHERE FIELD2 =2 ; END -RUN SQL DB2 COMMIT; END -RUN
In this step, we will create a DataMigrator direct load flow to copy data from the log table to the target table.
Note: Using a direct load flow with the checkpoint option ensures that all rows read from the log are written to the target table before the checkpoint file is updated. This ensures that the flow can be re-run from the original checkpoint if it cannot write to the target table (for example, if the database is not running).
While a data flow can also be used, it does not provide this assurance. The read and write operations are separate so that the checkpoint file is updated when the log is read.
The target properties page opens.
Click the X to close the properties page.
The Transformations window opens.
Click OK to close the window.
To test the Change Data Capture process, make changes to the dminv source table. In this example, three rows are inserted, one row is updated, and one is deleted.
For SQL Server, enter the line:
begin transaction;
insert into dminv values ('2001','Compact Flash',1000,50,25) ; insert into dminv values ('2002','Memory Stick',1000,80,40) ; insert into dminv values ('2003','Memory Stick Pro', 1000,200,100) ; update dminv set QTY_IN_STOCK=500 where prod_num = '2002' ; delete from dminv where prod_num = '2001' ;
commit transaction;
For Oracle or DB2, enter the command:
commit work;
Note that the CDC configuration saves a checkpoint of the last transaction processed. However, when you run a Sample Data report using the synonym for the log table, the checkpoint is not saved.
The following image shows a Sample Data report for the log.
Note that the statistics show the number of rows inserted, updated, and deleted.
Close the report.
Note: To return the dminv table back to its original state, enter the following lines in a stored procedure, where db is the engine used, and run them:
SQL db delete from dminv where prod_num > '2001' ; END SQL db commit; END
The final step in setting up Change Data Capture processing is scheduling the flow so that it runs periodically and processes transactions.
When you request sample data for a log table, a dialog box opens so that you can supply values to override the parameters in the synonym.
For example, you can sample data from a log table without creating a checkpoint by checking the check box in front of &&CDC_CHKPT_SAVE and entering the value NO.
You can also specify where to start reading the log by specifying a value for &&CDC_START. The values are the same as those specified in the synonym.
iWay Software |