Bulk load in GG

Annamalai A Member Posts: 640
edited Oct 16, 2010 1:04AM in GoldenGate
Hi,

I have set up full database replication between one source and three target databases and it is all working fine (the source db contains data for three countries). I now need to add one more country's data to the source db; the data will be around 300 million rows. Is GG capable of handling this activity simultaneously? The 300 million row insert operation needs to be replicated to the three target dbs, and apart from this, other DML will also be happening on the source that must also be replicated to the target dbs.


Can anyone give your view on this: is GG able to pull this much data to the targets, or are there other options in GG?


Thanks in Advance.

Answers

  • -joe Member Posts: 226
    annamalai,

    First, nice work on getting your 1:3 setup going. Regarding the 300M rows, is this done in a single operation, and if so, can it be broken up? Number of rows is a little misleading, as each row could be 100 bytes or 1 GB, so the volume of redo log produced by the transaction is a better indicator. We have some customers replicating more than 100 GB of redo per hour on top-end hardware.
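
    As a rough illustration (assuming an average row size of around 200 bytes, which is purely a placeholder figure):

        300,000,000 rows x 200 bytes/row = ~60 GB of row data

    and the redo generated for the load will typically be somewhat larger than that.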

    What I propose you do to see if your system can handle this is to run a test using archives from your source system. Copy a few hours' worth of archive logs to a test system that is as close to production as possible. Set up a SQL*Net connection from the test system back to the production source to get the metadata (this will have little to zero impact on the source system, as >99% of the resources used in replication are spent on parsing redo, converting LCRs into the OGG universal portable format, and writing them to trails). Running off the archives only is documented as Archive Log Only mode (ALO).
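
    A minimal Extract parameter file for that kind of test might look something like this (the group name, user, and paths are placeholders; check the reference guide for the exact ALTARCHIVELOGDEST syntax for your version):

        EXTRACT etest
        -- SQL*Net connection back to the production source for metadata
        USERID gguser@proddb, PASSWORD ********
        -- read from archived logs only (ALO mode)
        TRANLOGOPTIONS ARCHIVEDLOGONLY
        -- directory on the test system holding the copied archives
        TRANLOGOPTIONS ALTARCHIVELOGDEST /u01/test_arch
        EXTTRAIL ./dirdat/et
        TABLE SCOTT.*;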

    Depending on the test system's capabilities - especially the IO system where the archive logs are stored - and how that compares with production, you can get a feel for how this will perform. Use the Unix dd command to test the maximum read throughput on both the production archives and the archives copied to the test system to help understand the delta. IO will likely be your bottleneck, not CPU. However, when reading archive logs only, with several of them queued up, there is no break in processing, so CPU usage will be higher than on production, where transactions (and hence the parsing of them) come in spurts.
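
    For example, to time a raw sequential read of a representative archive log on each system (the file path is a placeholder):

        time dd if=/u01/arch/1_12345_678901234.arc of=/dev/null bs=1M

    Running the same command against the production archive destination and against the copied archives on the test box gives a rough read-throughput comparison between the two.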

    Good luck,
    -joe
  • Annamalai A Member Posts: 640
    Thanks Joe. This is a single load operation only, but there is a commit every 5,000 rows.
This discussion has been closed.