I am new to CDC in ODI. Please help me on this.
Actual Requirement: Integrate the data from 3 different source system and compose a 'Sales Account' record in Fusion.CDC feature of ODI to be used for incremental updates.
Information: The three source systems are loosely coupled. Each source system is either joined or not joined with the other source systems based on the business units under processing.
Mock Example: Lets take a favorite example to better explain the actual requirement.
EMPLOYEE and DEPARTMENT table.
The organization has a requirement to migrate EMPLOYEE and DEPARTMENT from the source system and create a flat table combining EMPLOYEE and DEPARTMENT data in the target system. (Lets assume the EMPLOYEE and DEPARTMENT table are maintained in two different source system).
1. If the Employee is of type 'BOD'(assume it as business unit) then Employee may not have corresponding link in DEPARTMENT table.
2. If the Employee is not of type 'BOD' then Employee will have corresponding link in DEPARTMENT table.
No Key constraints or referential integrity is maintained.
Please advise me, which CDC to use, Simple or Consistent.
Thanks in advance!!!!
I recommand you to use consistant CDC.
In you example, you will have 1 interface that use the EMPLOYEE journal, will left outer join on DEPARTMENT table.
If there is new/modified records in the table employee, the CDC will capture them and your interface will load these data into your flat table. With or without department, depending on the employee type.
Then, another interface that use the DEPARTMENT journal, joined with employee table.
If there is new/modified records in the table department, the CDC will capture them and load them into your flat table.
The 2 interfaces are quite identical. But one of them use the journal of employee, and the other one the journal of department.
Thanks for the clarification.
Could you please help me with this:
Our client advised not to use trigger based CDC mechanism as it is expensive. So we do not have choice, have to go ahead with the LOGMINER / Data Streams. But, LogMiner and Data Streams are available only for consistent set and not for simple set.
But we have a requirement, where there are multiple source systems. Lets say source system A, B and C having table A1, B1 and C1 respectively. For certain cases, i have to look into table A1 and and if it has data, i have to pull additional information from B1 and C1 and push it to target where as for some other cases, i have to start looking into table B1 and if it has data, i have to pull additional information from A1 and C1. Same holds true for certain business team where I have to start from Source System C.
So, please advise me how can I create a consistent set as the order of the tables changes with the change in business unit being involved.
I assume that since A, B and C are differents "system" source, then A1, B1 and C1 are on different database, and different physical schemas. We call them As, Bs, Cs.
You can set the CDC in your 3 systems. It will create three "CDC Set" : each "CDC Set" refers to 1 schema : As, or Bs, or Cs.
By activating CDC on these schema, you will generate 1 journal for each table you want to journalize.
J$_A1 ; J$_A2 and J$_A3
In ODI, you must create 3 interfaces : first interface :
use the journalisation on A1, and link this journalized datastore to B1 and C1 to add additional information.
==> if there is new data on A1, these new/modified data will be extracted and loaded in your target with B1 and C1 additional attributes second interface :
use the journalisation on B1, and link this journalized datastore to A1 and C1 to add additional information.
==> if there is new data on B1, these new/modified data will be extracted and loaded in your target with A1 and C1 additional attributes third interface :
use the journalisation on C1, and link this journalized datastore to A1 and B1 to add additional information.
==> if there is new data on C1, these new/modified data will be extracted and loaded in your target with B1 and A1 additional attributes
In your package, you will have to extend the three CDC sets.
Once again thanks for the clarification.
I understand the no. of interfaces will be equivalent to no. of tables involved in the interface at the source side if CDC is enabled in all the tables irrespective of source system data stores lies in the same schema or different schema.
Please confirm my understanding.
but only the tables where new data or modified data are relevant.
You can have some tables that you only want to join to get an additional info, but you don't want to load your entire data if one row of this table is new or modified.