Use of Table Comparison Transform
Introduction:-
Table_Comparison transform helps to compare two data sets and generates the difference between them as a resultant data set with rows flagged as INSERT, UPDATE, or DELETE.
This transform can be used to ensure rows are not duplicated in a target table, or to compare the changed records of a data warehouse dimension table.
It helps to detect and forward all changes or the latest ones that have occurred since the last time the comparison table was updated.
We will be using this transform frequently while implementing slowing changing dimensions and while designing dataflow for recovery.
Scenario:-
We are doing a scenario where we are transferring a data from one database table to another.
Target table contains previously loaded data, at the source table 2 new records are added & other records are updated , now we want to insert the new records & update the existing records based on conditions.
1) Create project, job, workflow & dataflow as usual.
2) Drag a source table to dataflow. Its contents are as follows.
3)Drag a target table to dataflow. Its contents are as follows.
4) Drag a Query & Table_Comparison transform to dataflow & connect them as shown below.
5) Double click on Query & do the mapping.
6)Double click on Table_Comparison & provide the details as shown below.
- Table Name:- Select Target Table from the dropdown box.
- Generated Key Column:- Specify key column
- Select the "EMP_ID" node from the tree on LHS & drag into "Input primary key columns" list box. Now the comparison of the target table will take place based on whether the source EMP_ID is present in the target or not & comparison will be made based on the column s given under "Compare columns" list box.
- Similarly select the columns that are to be compared while transferring the data & drag it to "Compare Columns" list box.
- Select "Cached comparison table" radio button.
7) Save & Validate the job.
8) Execute the job & check the output.
- new entries 6 & 7 got added in the target table
- At 5 the row Country got changed from INDIA to IND.
- At 3 row LNAME got changed to Jone & CITY got changed to NewYork.
- In Source table at row 3 FNAME was changed to Janeee but it didn't get reflected in the target table as we have have kept LNAME, CITY & COUNTRY as the options for comparing the tables.
By this way you can use table comparison depending upon your requirement.