1) To understand how the DAC handles extract and load processes, you must
understand the DAC repository objects.
2) For a description of these objects, see "About DAC Repository Objects Held
in Source System Containers".
3) When you run an execution plan, data is extracted from one or more tables
in the source system database, dumped into staging tables, and then loaded
into tables in the data warehouse.
4) Each task in the DAC is mapped to a full load or an incremental load
workflow in Informatica.
5) The DAC uses refresh dates (indicated in the Refresh Dates child tab of the
Physical Data Sources tab) to determine whether to invoke a full or
incremental workflow.
6) If the refresh date of the source or target table for a task is null,
then the DAC invokes a full load workflow (command).
7) If both the source and target tables have a refresh date, then the DAC
invokes the incremental workflow (command).
For a detailed description of refresh dates, see "About Refresh Dates".
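
The decision rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not DAC code: the function name choose_workflow and the use of None to represent a null refresh date are assumptions made for this example.

```python
from datetime import datetime
from typing import Optional


def choose_workflow(source_refresh: Optional[datetime],
                    target_refresh: Optional[datetime]) -> str:
    """Pick the workflow type for one task (hypothetical helper, not a DAC API).

    A refresh date of None stands in for a null refresh date.
    """
    # A null refresh date on either table means a full load workflow.
    if source_refresh is None or target_refresh is None:
        return "full"
    # Both tables have a refresh date, so the incremental workflow is used.
    return "incremental"
```

For example, a task whose target table has never been loaded (target_refresh is None) gets the full load workflow, while a task whose source and target tables both carry a refresh date gets the incremental one.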
The DAC supports the following extract and load combinations:
Full extract and full load
1) This extract and load combination is used for the very first
extract and load.
2) All data is extracted from the source system and loaded into the
data warehouse.
3) The DAC performs a full extract for a task if the source and
staging tables have null refresh dates.
4) The DAC performs a full load for a task if the staging and target tables
have null refresh dates.
Full extract and incremental load
1) This extract and load combination loads existing data warehouse tables
with data from new sources.
2) Data is extracted from the source system through a full extract command.
3) When the source or staging table has a null refresh date, the DAC invokes
the full extract workflow.
4) Data is loaded from the staging table into the target table through an
incremental load command.
5) When the staging and target tables have refresh dates, the DAC
invokes an incremental load command.
6) This situation arises when data is loaded into an existing data
warehouse from a new source connection.
7) The incremental load process requires additional logic to
determine whether a record should be inserted or updated.
8) Therefore, if you add a new source connection to populate an
existing data warehouse, you should expect the incremental load to be slower
than a full load.
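
To illustrate why the incremental load needs the extra insert-or-update logic, here is a minimal sketch in Python. It is not how Informatica or the DAC implements the load: the in-memory dictionary standing in for the target table, the function name incremental_load, and the "id" key column are all assumptions made for this example.

```python
from typing import Dict, Iterable, Tuple

Row = Dict[str, object]


def incremental_load(target: Dict[object, Row],
                     staging_rows: Iterable[Row],
                     key: str = "id") -> Tuple[int, int]:
    """Apply staging rows to a target table held as a dict keyed by `key`.

    Returns (inserted, updated). Hypothetical sketch, not a DAC API.
    """
    inserted = 0
    updated = 0
    for row in staging_rows:
        row_key = row[key]
        # This lookup is the extra work an incremental load does per record:
        # it must decide whether the record already exists in the target.
        if row_key in target:
            target[row_key] = dict(row)  # existing record: update it
            updated += 1
        else:
            target[row_key] = dict(row)  # new record: insert it
            inserted += 1
    return inserted, updated
```

A full load has no existing records to compare against, so it can skip this per-record check, which is the reason the text gives for expecting the incremental load to be slower.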
Incremental extract and incremental load
1) This extract and load combination is used for regular nightly or
weekly ETL processes.
2) New or changed records are extracted from the source system and
loaded into the data warehouse.
3) The DAC performs an incremental extract for a task if the source
and staging tables have refresh dates, and performs an incremental load for a
task if the staging and target tables have refresh dates.
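
The three combinations above can be summarized in one small Python sketch. It assumes, for illustration only, that each table's refresh date is passed in as a datetime or None (for null); the function name classify_combination is hypothetical and not part of the DAC.

```python
from datetime import datetime
from typing import Optional, Tuple


def classify_combination(source: Optional[datetime],
                         staging: Optional[datetime],
                         target: Optional[datetime]) -> Tuple[str, str]:
    """Return (extract_mode, load_mode) from three refresh dates.

    Hypothetical helper: None represents a null refresh date.
    """
    # Extract is incremental only when source and staging both have dates.
    extract = "incremental" if source is not None and staging is not None else "full"
    # Load is incremental only when staging and target both have dates.
    load = "incremental" if staging is not None and target is not None else "full"
    return extract, load
```

Under these assumptions, a first-ever run (all three dates null) yields a full extract and full load, a new source feeding an existing warehouse (only the source date null) yields a full extract and incremental load, and a regular nightly run (all dates set) yields an incremental extract and incremental load.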