This content has been marked as final. Show 5 replies
Integrating different operational sources is one of the most complex, but challenging, tasks!
First of all: NEVER run a full load on a production system.
Your assumption about duplicate rows when running an incremental load is semi-correct:
- YES you will have loaded all products twice
- NO it will not be (technical) duplicates because the datasource_num will differ
How you want to 'solve' this depends on the functional requirement.
You want to use the Product dimension (amongst others) as a common dimension to relate data loaded from eBS to data loaded from Siebel. Various RPD/ETL customizations are required.
Data loaded from eBS and data loaded from Siebel are two 'information SILOS' and there is no need to combine both sets. No ETL customization is required. RPD has to be customized.
Please inform us which scenario is valid for you and maybe we can help you out..
Thank you for all your explanation! It helped us a lot. We understand that we'll have same product duplicate with different datasource_num. Then I'll study a way to not duplicate them, following scenario 1, right?
We are planning follow the second scenario (lower effort), but I don't know if I understood correctly the 2nd scenario. On Dac, we will have 2 "SILOS" for Product to run (one for EBS and one for Siebel, on differents subject areas). Then, on RPD we will have filter them by datasource_num. Is that right?
Regarding the RPD:
I am not completely sure if you need to change the RPD..
If you can guarantee that users will always report on a combination of dimensions with facts, there is no need to filter on datasource_num. This is because all facts loaded from Siebel will be related to dimension values loaded from Siebel and all facts loaded from eBS will be related to dimension values loaded from eBS. At least, this is how it is supposed to work, please allocate enough time to test this really good.
However, 'just to be sure' you can create business model filters in the security groups in such a way that all 'Siebel-users' will only see the Siebel data and vice versa.
Regarding the DAC:
I think the most ideal way would be to combine all the to-be-loaded Subject Areas in one Execution Plan. This can be done if extraction from both eBS and Siebel can be performed in the same timeslot. This way, the staging tables will be populated from 1 or 2 sources, but the SILOS mappings will only have to be run once.
Regarding the most interesting stuff (scenario 1 ;) ):
You will require an unique identifier of a Product/other that is present in both source systems to be able to match them in the BAW. Please do not resort to literal comparissons on names etc. Once you have this identifier, you can modify the LKP operators in the fact mapplets and/or reusable LKP operators to use this identifier to retrieve the wid from the dimension, instead of the integration_id and datasource_num.
Edited by: m.siliakus on Feb 22, 2011 8:58 AM
We tried combing subject areas from EBS and Siebel and get the error:
MESSAGE:::The following tasks where present more than once belonging to different task groups:
EXCEPTION CLASS::: com.siebel.analytics.etl.execution.ExecutionPlanInitializationException