The answer is "it depends".
If the source and target databases are in the same database instance, and you set things up correctly, the data will not go over the network.
If the source and target databases are in different instances on the same host, you can set things up so that the ETL uses a DB link, which can be configured (along with the host's network configuration) so that data never leaves the host (and therefore never touches the network).
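As a rough sketch of that setup, the DB link would be created on the target instance and pointed at the source instance through an alias that resolves locally. All object names, credentials, and the TNS alias below are hypothetical placeholders, not anything ODI generates for you:

```sql
-- Run on the TARGET database (all names here are hypothetical).
-- The TNS alias should resolve to the source instance on the same host
-- (e.g. a listener bound to localhost) so traffic never leaves the machine.
CREATE DATABASE LINK src_link
  CONNECT TO src_schema IDENTIFIED BY src_password
  USING 'SOURCE_TNS_ALIAS';

-- Quick sanity check that the link resolves:
SELECT * FROM dual@src_link;
```

You would then point ODI's topology/knowledge modules at this link so the generated code uses it instead of routing rows through the agent.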
In general though, data flows from the Source to the ODI Agent, then from the ODI Agent to the Target. So in the scenario you describe above, data will be exposed on the network when it goes from the Source to the Agent ("ODI server", as you say), and then again when it goes from the Agent back to the Target.
However, you can install another ODI agent on the host where your databases are installed, and in this case (again, depending on the hosts network config and how you configure topology) data does not have to be exposed on the network, since it doesn't have to hit the network to go from Source to Agent, and Agent to Target.
Thanks for your reply. It was really helpful.
An agent is a Java process, usually located on the server, that listens on a port for incoming requests. It runs requested scenarios, reverse-engineers requested datastores, and so on.
When a job is submitted through the ODI Studio GUI or through startscen.sh, the agent gets the scenario from the work repository and the topology definitions from the master repository, combines them, and converts them into a runnable job, usually consisting of more than one code block. It then sends the code blocks to the destination environments, which may be DB servers, file servers, Hadoop name nodes, etc. Finally, the agent gets job statuses from these environments and writes them into the work repository tables so we can see them in the Operator tab of ODI Studio.
Now coming to my scenario: my source and target databases are two different database instances (two pluggable databases, PDBs, under the same container database, CDB) on the same server, let's say server A. The ODI metadata (master and work repositories) is hosted on my target database instance. My ODI software is installed on a different server, let's say server B. I am doing an ELT process rather than ETL.
Please advise if the below understanding is correct:
1) My data will not move from server A to server B where ODI agent is located.
2) Only metadata and statuses will move between server A and server B. My data movement will be within the same host/node/server and will not be exposed over the network.
Unfortunately, the answer is still "it depends". But without any further information, I would say that both statements are incorrect. Data will move from Server A (source DB) to Server B (ODI Agent) and back to Server A (target DB), unless you set up a DB link in the Target DB to pull from the Source DB, and you configure ODI to use this DB link.
A way to think about it is, in general, "ODI Agent asks Source DB for data. Source DB sends ODI Agent the data. ODI Agent sends the data over to Target DB." It is not "ODI Agent asks Source DB to send Target DB the data."
However, if there is a DB link on the Target DB that points to the Source DB, and you have configured ODI to use the DB link, ODI Agent can ask the Target DB to pull data across the DB link from the Source DB. In this case, you can configure the DB link and/or the network interface on Server A (where both DBs live) so data does not leave Server A.
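To make the pull pattern concrete, the code executed on the target in that case looks roughly like the following. This is a hand-written illustration of the pattern, not literal ODI-generated code, and the table, schema, and link names are hypothetical:

```sql
-- Hypothetical ELT-style load, executed ON the target DB.
-- The target pulls rows across the DB link, so the data moves
-- directly between the two instances on Server A and never
-- passes through the ODI Agent on Server B.
INSERT /*+ APPEND */ INTO tgt_schema.customers
SELECT * FROM src_schema.customers@src_link;
COMMIT;
```

Contrast this with the default path, where the agent itself would run a SELECT against the source, hold the rows in its own JDBC session, and issue INSERTs against the target.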
There may be something special for PDBs that live in the same CDB (other than a standard DB link), but in general, if ODI sees different physical data servers, data will flow through the agent, even in "ELT".
Thanks for your help, really helpful.
May I get your advice on the following?
My ODI load ran for 6 hours, invoking multiple mappings, packages, and scenarios.
However, CPU utilization is only 3 to 5%.
There is no other application or process running on the server apart from the ODI load run.
I want to improve CPU utilization to 80%.
Please advise what can possibly be done to achieve this.