4 Replies Latest reply on Jul 12, 2019 5:44 AM by 3988035

    ODI: DB Linked Based Load Strategy: Data Security and Data Governance

    3988035

      Hi All,

       

      Can you please advise on data security risk associated with DB link based data load strategy (if any) in ODI.

      What is advantages and disadvantages of DB link based load strategy with respect to other load strategy - agent based .

      I read in one doc available in internet that  DB link based load strategy is  good for reasonably huge volume of data, any idea why it is better for huge volume of data than comare to agent based load strategy.

      Do you see any data governance (data security) issue and other kind of issue if DB link based load strategy is followed in ODI and solution/product is on cloud.

       

      Thanks,

      Rajneesh

        • 1. Re: ODI: DB Linked Based Load Strategy: Data Security and Data Governance
          Rodrigo Radtke Souza

          Not sure about security reasons on having a DB Link (I never had a problem with that). What I may comment is on the reason why DB Link is generally prefered over Agent load.

           

          Agent load is a two steps operation that uses JDBC: ODI agent needs to fetch data from source DB, go over the network, bring the records into agent memory and then send it over the network again to the target DB. Your ODI agent may be in one of the source/target databases servers (which would minimize the network effect), but it is a two step operation nonetheless (passing data throught the agent memory).

           

          DB Link load is a more straight forward approach since ODI will only be responsible to call the selects/inserts and the data will be transfered from source to target DBs without any ODI agent interference. Generally this approach is faster.

           

          Thanks

          • 2. Re: ODI: DB Linked Based Load Strategy: Data Security and Data Governance
            3988035

            Thanks Rodrigo, really helpful!!

             

            Here agent memory means RAM?

             

            I was wondering about scenario, if ODI agent is installed in source DB or target DB server and both (source and target PDBs) are hosted by same node/server and hence data is not exposed over network, in that case data movement will be 2 step process?

             

            From source pdb to agent memory and then again agent memory to target pdb?

            In this scenario - is there a separate memory(RAM) allocated to ODI agent?

             

            I was thinking that all 3 (Source PDB , ODI agent and Target PDB) will share same memory -system's RAM on which all 3 are located.

             

            Not sure how internally memory is being allocated when all are in same machine/node/server.

             

            Thanks,

            Rajneesh

            • 3. Re: ODI: DB Linked Based Load Strategy: Data Security and Data Governance
              RZGiampaoli

              By Agent memory he means the java memory that you set in the agent when you set it in the odiparams.bat. That is the memory that the agent has available to use during the load.

               

              The thing is, when you go to the topology and create a "Data Server", for Oracle dbs, that means a DB instance.

              This is what ODI uses to figure out how to generate the code and if the data will pass trough the agent or not. 

              For example, if you have just 1 DB instance, and inside it you have 2 schema:

              If you set 1 data server and add 1 physical schema under it;

              and then you set another data server and the other schema under it...

              ODI will understand that you have 2 DB instances instead of one and the code that he will generates will use a LKM and a IKM, meaning:

              The agent will first select the data from schema 1 and put in the agent memory, then it'll load the data to you stage table.

              Then it'll do a insert into target select * from your stage table.

               

              If you set the topology with only one data server and put both schema under it (you need a user that has access to both schema to set the topology this way, normally something like ODI_RUN), then when you load something the code will be just insert into target select * from source, and the data will not pass through the agent at all.

              The agent will just send the code to the database and the database will do everything for you.

               

              Then, with the DB link, even if you have 2 db instances, ODI will create the DB link and he will understand the both are linked, and by that he'll just send the insert into target select * from source to the DB and the DB will do the rest, that is why it not uses any agent memory.

               

              Now, about the agent thing, you can only have one agent (when you execute it you need to choose which agent you will going to run, then I don't think this will going to work as you describe.

               

              And also, you always want to use the agent just as the maestro of you orchestra, never as the hard worker.

               

              Thanks!

              • 4. Re: ODI: DB Linked Based Load Strategy: Data Security and Data Governance
                3988035

                Thank you, well articulated and elaborated.