
How to use Validate Transform


Introduction:-

 

 

The Validation transform is used to filter or replace records in the source dataset, based on validation rules, to produce the desired output dataset.

It lets you create validation rules on the input dataset and generates the output based on whether each record has passed or failed the validation condition.


In this scenario we validate the data from a database table against the correct format of the zip code.

If the zip code has fewer than 5 digits, we filter that record out & pass it to another table.

 

The Validation transform can generate three output datasets: Pass, Fail, and RuleViolation.


  1. The Pass output schema is identical to the input schema.
  2. The Fail output schema has three more columns: DI_ERRORACTION, DI_ERRORCOLUMNS and DI_ROWID.
  3. The RuleViolation output has three columns: DI_ROWID, DI_RULENAME and DI_COLUMNNAME.
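
To make the three schemas concrete, here is a minimal Python sketch that imitates the split outside of Data Services. It is not how the tool works internally. The function name split_outputs, the columns CUST_ID and ZIP, the rule name ZipRule, the row numbering and the values written into DI_ERRORACTION and DI_ERRORCOLUMNS are all assumptions made for illustration; only the column names of the three outputs come from the list above.

```python
def split_outputs(rows, rules):
    """Split rows into Pass, Fail and RuleViolation style lists.

    rules is a list of (rule_name, column_name, check) tuples, where
    check is a function that returns True when the value is valid.
    """
    passed, failed, violations = [], [], []
    for row_id, row in enumerate(rows, start=1):  # row numbering is assumed
        broken = [(name, col) for name, col, check in rules
                  if not check(row[col])]
        if not broken:
            # Pass: same columns as the input
            passed.append(dict(row))
            continue
        # Fail: input columns plus the three DI_ columns
        failed.append({
            **row,
            "DI_ERRORACTION": "F",  # placeholder value, not taken from the tool
            "DI_ERRORCOLUMNS": ", ".join(col for _, col in broken),
            "DI_ROWID": row_id,
        })
        # RuleViolation: one row per broken rule, three columns only
        for name, col in broken:
            violations.append({
                "DI_ROWID": row_id,
                "DI_RULENAME": name,
                "DI_COLUMNNAME": col,
            })
    return passed, failed, violations


# Hypothetical sample data: the second zip code is too short.
rows = [{"CUST_ID": 1, "ZIP": "41101"}, {"CUST_ID": 2, "ZIP": "4110"}]
rules = [("ZipRule", "ZIP", lambda v: len(v) == 5 and v.isdigit())]
passed, failed, violations = split_outputs(rows, rules)
```

In this sketch the Fail row and the RuleViolation row share the same DI_ROWID, so the two lists can be matched against each other later.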


Steps:-


1) Create a project, job, workflow and dataflow as usual.


2) Drag the source table and the Validation transform into the dataflow & provide the details.


1.png


  • Double-click the Validation transform to provide the details. You can see the three output datasets described above.

 

2.png

 

  • Add a validation rule.

 

3.png

 

  • Click Add & fill in the details of the rule as follows.

 

4.png

 

Action on Fail:-

                1) Send to Fail:- if the rule fails, the record is sent to a separate target that holds the "Fail" records.

 

                2) Send to Pass:- even if the rule fails, the record is passed to the normal target.

 

                3) Send to Both:- if the rule fails, the record is sent to both targets.
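
The three actions can be summarised as a small routing table. The Python sketch below is only an illustration of the behaviour described above; the function name route and the use of the option labels as dictionary keys are assumptions, not part of Data Services.

```python
def route(rule_passed, action_on_fail):
    """Return the set of outputs a record is written to."""
    if rule_passed:
        # A record that satisfies the rule always goes to Pass.
        return {"Pass"}
    targets = {
        "Send to Fail": {"Fail"},
        "Send to Pass": {"Pass"},
        "Send to Both": {"Pass", "Fail"},
    }
    return targets[action_on_fail]


# Show where a failing record ends up for each action.
for action in ("Send to Fail", "Send to Pass", "Send to Both"):
    print(action, "->", sorted(route(False, action)))
```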

 

Column Validation:-

                Select the column to be validated, then decide the condition.

 

                We have selected "Match Pattern" as the condition, with the pattern '99999'.

 

                So it will check whether the zip code consists of 5 digits or not.
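
As a rough way to reason about the '99999' pattern outside the tool, the check can be approximated with a regular expression. This is a sketch under one assumption: each '9' in the pattern stands for exactly one digit and every other character is taken literally. The helper names pattern_to_regex and matches_pattern are hypothetical, and the real Match Pattern condition may support more placeholders than this.

```python
import re


def pattern_to_regex(pattern):
    # Assumption: '9' means "any single digit"; all other characters are literal.
    return "".join("[0-9]" if ch == "9" else re.escape(ch) for ch in pattern)


def matches_pattern(value, pattern):
    # fullmatch requires the whole value to fit the pattern,
    # so values that are too short or too long are both rejected.
    return re.fullmatch(pattern_to_regex(pattern), value) is not None


print(matches_pattern("41101", "99999"))  # five digits
print(matches_pattern("4110", "99999"))   # only four digits
```

Under this reading, a value with a letter in it or with six digits would also fail the rule, not only values that are too short.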

 

  • Press OK. You can then see that the rule has been added as follows.

 

5.png

 

3) Add a target table to the dataflow & link the Validation transform to it.

 

1.png

 

  • Choose the validation condition as "Pass".

 

2.png

 

  • Similarly, make the connections for the "Fail" & "RuleViolation" conditions.

 

3.png

 

4) Validate the job & execute it.

 

5) Check the input & output.

 

  • Input:-

 

4.png

 

  • In the input shown in the above figure, the last row has a zip code of fewer than 5 digits. Now view the output.

 

  • Output for Pass condition:-

 

5.png

 

  • Output for Fail condition:-

 

6.png

 

     You can see that the invalid record from the input has been transferred to the "CUST_Fail" table as shown above.

     The three additional columns "DI_ERRORACTION", "DI_ERRORCOLUMNS" and "DI_ROWID" can also be seen.

 

  • Output for "RuleViolation" condition:-

7.png

 

Summary:-

 

So in this way the Validation transform is useful for validating records against rules & for sending the bad records to separate targets, where they can be analysed later.

 

 

Thanks & Regards,

 

Rahul More

 

(Project Lead)


