SAP BO Data Integrator / Data Services
Data Services integrates with SAP BI, SAP R/3, other SAP applications, and non-SAP warehouses.
Purpose: It performs ETL via batch jobs and an online (real-time) method, using bulk and delta load processing of both structured and unstructured data to build a warehouse (SAP or non-SAP).
Data Services is the combination of Data Integrator and Data Quality. Previously these were separate tools: Data Integrator handled the ETL part, while Data Quality handled data profiling and data cleansing.
With Data Services, DI and DQ are combined into one interface, so it provides the complete solution (data integration and quality) under one platform.
It even merges the previously separate job servers and repositories of DI and DQ into one.
Data Federator: The output of Data Federator is virtual data. Federator provides data as input to Data Services, and with it we can project data from multiple sources as a single source.
Data Services Scenarios:

| Source    | ETL | Target Warehouse |
|-----------|-----|------------------|
| SQL       | DS  | DB               |
| Flat File | DS  | DB               |
| Flat File | DS  | BI               |
| R/3       | DS  | BI               |
| R/3       | DS  | DB               |
| SQL       | DS  | BI               |
We can move data from any source to any target database using Data Services.
Data Services is a utility for the ETL process; it is not a warehouse itself, so it does not stage any data of its own.
Data Services can create the ETL process and can generate a warehouse (SAP or non-SAP).
DS is used mainly for three sorts of projects:
1) Migration
2) Warehouse or database building
3) Data Quality
Data Profiling: Pre-processing of data before the ETL to check its health, i.e., whether the data is good or bad.
Advantages of Data Services over SAP BI/BW ETL process
It is a GUI-based framework
It has built-in configurations for multiple data sources
It has numerous built-in transformations (Platform, Quality, Integrator)
It supports data profiling
It connects easily to external systems
It supports the Export Execution Command to load data into the warehouse as a batch-mode process
It generates ABAP code automatically
It handles both structured and unstructured data
It can generate a warehouse (SAP or non-SAP)
It supports large-scale data cleansing, consolidation, and transformation
It can do real-time data loads, full data loads, and incremental data loads
Data Integrator / Data Services Architecture
![intro1.png]()
There is no concept of process chains, DTPs, or InfoPackages when you use Data Services to load the data.
Data Integrator Components
Designer
![intro2.png]()
It creates the ETL process
It has a wide set of transformations
It includes all the artifacts of the project (work flows, data flows, datastores, tables)
It is a gateway for profiling
All Designer objects are reusable
Management Console (URL-based / web-based tool)
![intro3.png]()
It is used to configure the repositories
It allows us to assign user profiles to specific environments
It allows us to create users and user groups, and to assign users to groups with privileges
It allows us to schedule jobs automatically or execute them on demand
We can execute jobs from any geographic location, as this is a web-based tool
It allows us to connect the repositories to connections for each environment (Dev / Qual / Prod)
It allows us to customize the datastores
Access Server
It runs the real-time jobs
It receives XML input (real-time data); see the sample message after this list
XML inputs can be loaded into the warehouse using the Access Server
It is responsible for the execution of online / real-time jobs
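For illustration, here is a minimal, hypothetical XML message that a real-time job might receive through the Access Server; the element names are assumptions, not a fixed Data Services schema:

```
<!-- Hypothetical inbound real-time message; element names are illustrative only -->
<CustomerRequest>
  <CustomerID>10045</CustomerID>
  <Name>ACME Corp</Name>
  <Country>DE</Country>
</CustomerRequest>
```

The real-time job maps such a message through a data flow and loads or returns the processed result.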
Repository Manager
![intro4.png]()
It allows us to create the repositories (Local, Central, and Profiler)
Repositories are created on top of a standard database
The Data Services system tables are available here
Job Server
This is the server responsible for executing the jobs. Without assigning the local / central repository to a job server, we cannot execute a job.
Data Integrator Objects
Projects:
A project is a folder where you store all the related jobs in one place; we can call it a folder to organize jobs.
Jobs:
Jobs are the executable part of Data Services. A job sits under a project. There are two types:
Batch jobs
Online (real-time) jobs
Work Flows:
A work flow acts as a folder containing related data flows. Work flows are reusable.
Conditionals:
A conditional contains work flows or data flows, and a script-driven condition controls whether they are triggered or not (see the sketch below).
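For illustration, a minimal sketch of how a conditional is typically driven; the variable $G_LOAD_TYPE and the work flow names are assumptions:

```
# Hypothetical IF expression evaluated by a conditional object
# ($G_LOAD_TYPE would be set earlier, e.g. in an initialization script)
$G_LOAD_TYPE = 'DELTA'
# Then branch:  run WF_Delta_Load
# Else branch:  run WF_Full_Load
```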
Scripts:
Scripts are sets of statements used to define or initialize global variables, control whether conditionals trigger, control the flow of execution, print statements at runtime, and assign specific default values to variables.
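For illustration, a minimal initialization script in the Data Services scripting language; the variable names, datastore name, and table name are assumptions:

```
# Hypothetical job-initialization script
# Default the load type, then fetch the last load date from the target
$G_LOAD_TYPE = 'DELTA';
$G_LAST_LOAD = sql('DS_TARGET', 'SELECT MAX(LOAD_DATE) FROM SALES_FACT');
print('Job started, load type: ' || $G_LOAD_TYPE);
```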
Data Flow:
The actual data processing happens here.
Source Data Store:
It is the place through which data is imported from the database / SAP system into the Data Services local repository.
Target Data Store:
It is the collection of dimension and fact tables that make up the data warehouse.
Transformations:
These are the transforms used to carry out the ETL process. They are broadly categorized into three groups (Platform, Quality, and Integrator).
File Format:
It contains various legacy-system file formats.
Variables:
We can create local and global variables and use them in the project. Variable names start with the “$” symbol (see the example below).
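For illustration, a short sketch of local and global variable usage; the names follow the common $G_ / $L_ convention but are themselves assumptions:

```
# A global variable ($G_...) is typically set once per job
$G_REGION = 'EMEA';
# A local variable ($L_...) is scoped to a job or work flow
$L_FILE_NAME = 'sales_' || $G_REGION || '.csv';
print('Processing file: ' || $L_FILE_NAME);
```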
Functions:
We have numerous inbuilt functions (string, math, lookup, enrich, and so on); a few are sketched below.
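For illustration, a few of the built-in functions used inside a script; the values are made up and exact signatures can vary by version:

```
# String functions: trim leading blanks, then uppercase
$L_NAME = upper(ltrim_blanks('  acme corp'));      # -> 'ACME CORP'
# Math function: round to two decimal places
$L_AMOUNT = round(123.456, 2);                     # -> 123.46
# Conversion function: string to date
$L_DATE = to_date('2013-01-31', 'YYYY-MM-DD');
```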
Template Table:
These are temporary tables used to hold intermediate or final data.
Data Store:
A datastore acts as a port through which you define connections to the source or target systems. You can create multiple configurations in one datastore to connect it to different systems.
ATL:
ATL files are like BIAR files. The name comes from a company; unlike BIAR, ATL does not stand for anything.
A project / job / work flow / data flow / table can be exported to ATL so that it can be moved between environments, from Dev to Qual and from Qual to Prod.
Similarly, you can import objects that were exported to ATL back into Data Services.
Thanks
Rakesh