9 Replies Latest reply on Mar 17, 2014 11:10 PM by Arturo Gutierrez

    GG Architecture questions

    Arturo Gutierrez
      Hello,
I'm starting to use GG and I have some questions about its architecture.
GG has the ability to extract only committed transactions, which is very good for avoiding overloading the network or storing information that is not useful.
My question is about what GG does when it encounters a large, batch-type transaction that can generate dozens of online redo logs and therefore dozens of archived logs.
How does GG manage such transactions? That is the case of a transaction that is large in volume. It could also happen that a transaction is long in time: for example, one that begins at 9 in the morning and finishes at 11:30, by which time the system has generated dozens of redo logs.
I also wanted to know some details of the mechanism GG uses for extraction from the redo logs. Does it use LogMiner?


Many thanks for the help.
Regards,
      Arturo

       

        • 1. Re: GG Architecture questions
          onkar.nath

GG does not care about the volume of redo; it reads small and long-running transactions the same way. For small transactions, replication will be almost real-time; for long-running transactions there can be some lag, i.e. a delay in replicating the transactions, and hence a queue builds up. I would suggest you go through the GG documentation at least once.
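To watch that lag, a quick sketch from GGSCI (EXT1 is a placeholder group name):

GGSCI> INFO EXTRACT EXT1            (status and lag at the last checkpoint)
GGSCI> LAG EXTRACT EXT1             (reports current lag for the group)
GGSCI> SEND EXTRACT EXT1, GETLAG    (same, via a direct request to the running process)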

           

          Onkar

          • 2. Re: GG Architecture questions
            Javier Morales

            Hi Arturo,

             

All big transactions are retained until the commit is done. Consider that all locks are held in the source database, so a transaction spanning dozens of redo logs can take a while to appear on the target (causing a corresponding delay in the replicat process).

             

No, GG doesn't use LogMiner. I've run GG extract processes in databases where LogMiner was disabled.

GG reads the redo log files directly, extracting the full content of the transaction (remember that you have to enable supplemental logging for replication to work without issues).
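In case it helps, a minimal sketch of enabling that supplemental logging (gg_user and scott.emp are placeholder names):

SQL> ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;      -- minimal database-level supplemental logging

GGSCI> DBLOGIN USERID gg_user, PASSWORD ********
GGSCI> ADD TRANDATA scott.emp                        -- table-level supplemental log group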

             

            Hope this helps!

            Javier

            • 3. Re: GG Architecture questions
              Arturo Gutierrez

              Hello,

Browsing the details of the CACHEMGR parameter, I now understand how GG manages large transactions.

               

              CACHEMGR

               

It can generate a lot of files in C:\GGS\dirtmp (4 KB file size) if a large transaction runs and the virtual memory manager is restricted.

I've simulated memory stress with this parameter set in the extract parameter file:

              CACHEMGR CACHESIZE 100MB
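In case anyone tests the same thing, a sketch with the spill directory made explicit (path and sizes are only examples for testing; CACHEDIRECTORY syntax as I understand it from the reference guide):

-- extract parameter file: cap memory at 100 MB, spill to ./dirtmp (up to 2 GB)
CACHEMGR CACHESIZE 100MB, CACHEDIRECTORY ./dirtmp 2GB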

               

              Thanks

              Arturo

              • 4. Re: GG Architecture questions
                MikeN

It's true that for periodic 'batch' processing you may want to handle replication differently; it depends on the type of loads you are dealing with. In general, go with the simple setup: turn on replication and let it run. However, if large batch-processing loads are causing undesirable latencies, you do have the option of simply applying the batch to both the source and target manually. (In that case you may also choose to disable logging for the batch processing.)

                 

For long-running transactions: this is very configurable in GG. You can choose to be notified of long-running tx's, skip them, or force-commit them. (Sometimes it's just a stupid interactive sql session that isn't really doing work, and stays open for days on end while someone goes on vacation.) There is also a mechanism called "bounded recovery" that you may want to review. Extract may be registered with RMAN to ensure logs are available to be processed in the event of a backlog (be sure to unregister extract when deleting the extract).
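To make that concrete, a rough sketch of the knobs I mean (EXT1, the thresholds, and the tx id are placeholders; check the reference guide for exact syntax):

-- extract parameter file: warn about transactions open longer than 1 hour
WARNLONGTRANS 1h, CHECKINTERVAL 10m
-- bounded recovery: periodically persist long-running transaction state
BR BRINTERVAL 4h

GGSCI> SEND EXTRACT EXT1, SHOWTRANS             (list currently open transactions)
GGSCI> SEND EXTRACT EXT1, SKIPTRANS <txid>      (discard one)
GGSCI> SEND EXTRACT EXT1, FORCETRANS <txid>     (write it to the trail as if committed)
GGSCI> REGISTER EXTRACT EXT1 LOGRETENTION       (have RMAN retain the logs extract needs)
GGSCI> UNREGISTER EXTRACT EXT1 LOGRETENTION     (do this before deleting the extract)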

                 

For details on extraction (aka "capture") of data: it depends. GG uses a variety of methods to get the data to replicate. See the GG for Oracle reference, admin & install guides ("5 Choosing Capture and Apply Modes"); this talks about "classic" vs "integrated" capture. More generally: most of the time the data comes from the redo/archive logs; sometimes it comes from the DB (known as "fetching", which uses flashback to get a consistent column value for the row currently being processed); and sometimes (with "integrated" capture) the logmining server is used -- the service built into the DB, which LogMiner also uses. Hence the term "integrated" capture: it uses internal features of the database to get certain datatypes that can't be retrieved from redo directly in a meaningful way.
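And a rough sketch of how the two capture modes are created (group names and credentials are placeholders):

-- classic capture: extract reads the redo/archive logs itself
GGSCI> ADD EXTRACT EXT1, TRANLOG, BEGIN NOW

-- integrated capture: extract receives LCRs from the database logmining server
GGSCI> DBLOGIN USERID gg_user, PASSWORD ********
GGSCI> REGISTER EXTRACT EXT2 DATABASE
GGSCI> ADD EXTRACT EXT2, INTEGRATED TRANLOG, BEGIN NOW

-- optional, in the extract parameter file: cap the logmining server's memory (MB)
TRANLOGOPTIONS INTEGRATEDPARAMS (MAX_SGA_SIZE 256)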

                 


                • 5. Re: GG Architecture questions
                  Arturo Gutierrez

                  Mike,

I agree that disabling capture for large transactions is an important option.

In the Oracle Streams environment there is an option to disable change data capture for a specific batch-type session, using:

EXEC DBMS_STREAMS.SET_TAG(tag => 'NO_replicar');

I think an option for GG to handle LRTs (Long Running Transactions) in the most optimal way would be worth evaluating.
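For completeness, from what I've read GG's integrated mode has a rough analogue of the Streams tag (hedged; parameter names as I understand them from the 12c docs, and 00 is just the default tag value):

-- integrated replicat: tag its own changes in the redo (default tag is 00)
DBOPTIONS SETTAG 00

-- integrated extract: skip changes carrying that tag (e.g. bidirectional setups,
-- or a batch session that called DBMS_STREAMS.SET_TAG itself)
TRANLOGOPTIONS EXCLUDETAG 00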

Also, from what I've seen, an LRT could generate significant buffer overflows on the source server, producing large RAM allocations. I think integrated capture changes the memory model to store the captured LCRs in the SGA (streams pool), limiting the size.


I have more experience with Oracle Streams, but now I need to do an upgrade to use GG, so I'm trying to understand the architecture of GG, and of course the comparisons are inevitable.
Regarding the integrated extract process, I think it is an approach close to Oracle Streams technology, generating LCRs (Logical Change Records) to enable the capture of changes on compressed tables, Exadata HCC, and other data types not supported by classic extract. But I think this option may not be as optimal as classic capture, since there is more overhead and the logmining server is used.

                  Thanks for the hints.


                  • 6. Re: GG Architecture questions
                    MikeN
> Also, from what I've seen, an LRT could generate significant buffer overflows on the source server, producing large RAM allocations. I think integrated capture changes the memory model to store the captured LCRs in the SGA (streams pool), limiting the size.


                    GoldenGate won't have a problem with large transactions & memory usage; to avoid running out of memory, GG automatically spills to disk (that's what dirtmp is for) until the data can be written to the output trail.

                    • 7. Re: GG Architecture questions
                      Arturo Gutierrez

                      Mike,

                       

About:

> sometimes it comes from the DB (known as "fetching", and uses flashback to get a consistent column value for the row currently being processed)
> ...

                       

According to the documentation, the GG user requires the privilege:

                      EXECUTE on package DBMS_FLASHBACK

Could you explain why the GG user performs flashback operations?
Isn't all the information necessary for replicat in the redo vectors?

                      Thanks
                      Arturo

                       

                      • 8. Re: GG Architecture questions
                        MikeN

> About:
> sometimes it comes from the DB (known as "fetching", and uses flashback to get a consistent column value for the row currently being processed)
> ...
> According to the documentation, the GG user requires the privilege:
> EXECUTE on package DBMS_FLASHBACK
>
> Could you explain why the GG user performs flashback operations?
> Isn't all the information necessary for replicat in the redo vectors?

                         

                         

                         

So, this has nothing to do with replicat... it just has to do with getting consistent info in extract. (Again, "fetch" is optional, just to be clear. For example: you have a table with columns (a, b, c); a is the pk, yet column c is also required by your target system (for some reason) on all updates... but it is not force-logged, so it wouldn't be in the redo logs during an update when its value did not change.)

Recall that GG is capturing changes from redo, and it's maintaining checkpoints so that no data is lost even if GG isn't running. Let's say you're currently up-to-date, no changes happening in the DB. Then you stop the gg primary capture (extract) process... and then, on a single row, do insert/update (commit), update/update (commit), update (commit), etc. Then start extract. When reading the changes in the redo, you see the first insert (fine), and then the first update... if you query (fetch) the DB for additional columns at this point, you'll get the latest value, not the first updated value.
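The fetch behavior itself is tunable; a sketch of the relevant extract parameter (option names as I recall them from the reference guide):

-- fetch using a flashback query at the operation's SCN; if the undo has aged
-- out, fall back to the row's latest version instead of abending, and report
-- rows that cannot be fetched at all
FETCHOPTIONS USESNAPSHOT, USELATESTVERSION, MISSINGROW REPORT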

                         

                        Regards,

                        -m

                         


                        • 9. Re: GG Architecture questions
                          Arturo Gutierrez
                          Mike,
                          Thanks for the reply.
Although honestly I still cannot understand the use of flashback by GoldenGate.
As you know, flashback is based on UNDO blocks, and those blocks cannot ensure read consistency if the retention time is small or there is heavy DML activity in the database. And to my knowledge there is no special requirement for UNDO_RETENTION or RETENTION GUARANTEE.
What could happen if GG cannot perform the flashback operation? ORA-01555 snapshot too old?

When you say:
> a is the pk, yet column c is also required by your target system (for some reason) on all updates
If we use the COLS option to include some columns at the extract level, GG creates the appropriate supplemental logging so these columns are also written to the redo logs (see the sketch below).
What other conditions can generate a situation where GG needs a column without supplemental logging enabled?
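For example, what I mean with COLS (scott.emp and column c are placeholders):

GGSCI> ADD TRANDATA scott.emp, COLS (c)     (supplemental log group including column c)
GGSCI> INFO TRANDATA scott.emp              (verify which columns are being logged)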

                          Many thanks
                          Best regards
                          Arturo