Help to resolve this ETL problem.
Scenario: As part of the initial setup of a DW, we have to extract 3 years of historical data from a source system and load it into a data warehouse fact table.
On average, each year has 1.2 crore (12 million) rows, and each row is 1024 bytes.
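To put the volume in perspective, here is a rough sizing sketch using only the figures above. The function name history_volume is just something I made up for illustration, and it assumes every row is exactly 1024 bytes with no file or block overhead.

```python
# Figures taken from the scenario; 1 crore = 10 million.
YEARS = 3
ROWS_PER_YEAR = 12_000_000   # 1.2 crore
ROW_SIZE_BYTES = 1024


def history_volume(years=YEARS, rows_per_year=ROWS_PER_YEAR, row_size=ROW_SIZE_BYTES):
    """Return (total_rows, total_bytes) for the full history load."""
    total_rows = years * rows_per_year
    return total_rows, total_rows * row_size


total_rows, total_bytes = history_volume()
print(f"rows: {total_rows:,}")
print(f"size: {total_bytes / 10**9:.2f} GB ({total_bytes / 2**30:.2f} GiB)")
```

That works out to 36 million rows and about 36.9 GB (34.3 GiB) of raw data for the 3 years.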
Processing 2 months of data gives the following statistics:
- 4 hours to extract from source system
- 30 min for data transfer from the source system to the target system (file copy)
Assumption:
· You have the latest hardware with multiple processors and a parallelism of 4 on both the source (mainframe) and the data warehouse (UNIX) systems.
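A back-of-the-envelope extrapolation of the 2-month timings to the full 3 years might look like the sketch below. It rests on assumptions that are mine, not part of the scenario: the 2-month figures were measured on a single stream, each 2-month batch is extracted and then copied within its own stream, and 4 parallel streams scale linearly with no contention. The function name estimate_hours is hypothetical, and load time on the warehouse side is not included.

```python
import math


def estimate_hours(total_months=36, batch_months=2,
                   extract_h=4.0, transfer_h=0.5, streams=1):
    """Estimate elapsed hours for extract + file copy of the whole history.

    Assumes (not stated in the scenario) that batches run in waves of
    `streams` at a time and that each stream keeps the single-stream timings.
    """
    batches = math.ceil(total_months / batch_months)   # 18 two-month batches
    waves = math.ceil(batches / streams)               # batches run `streams` at a time
    return waves * (extract_h + transfer_h)


print("serial   :", estimate_hours(streams=1), "hours")
print("4 streams:", estimate_hours(streams=4), "hours")
```

Under those assumptions the serial run is 81 hours and 4 streams bring it down to 22.5 hours (5 waves of 4.5 hours, since 18 batches do not divide evenly by 4). Real numbers will depend on how much the mainframe extract and the file copy actually benefit from running in parallel.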