5 Replies Latest reply: Mar 8, 2013 9:14 AM by Chrisjenkins-Oracle RSS

    Transaction log building.

    930542
      Hi All:

      Replication is working correctly but we are seeing that Timesten is building logs on disks steadily.
      Please can somebody explain and confirm that the logs generated are normal or does it indicate any problem.

      Below I've provided values from

      - connection attribute
      - call ttLogHolds()
      - listing from transaction log dir.


      connection attribute
      #Set to 0, which enables parallel propagation
      ReplicationApplyOrdering=0

      #Should be no greater than half the value of the LogBufParallelism attribute.
      ReplicationParallelism=12

      # Should be euqal to the number of CPU or cores present in the machine.
      LogBufParallelism=24

      # should be 8 * (LogBufParallelism) larger value helps parallelism
      LogBufMB=192

      # should be >= LogBufMB
      LogFileSize=192

      #0 - Write data to the transaction log files using the previously used value.
      #1 (default) - Write data to transaction log files using buffered writes and use explicit sync operations as needed to sync log data to disk (for example with durable commits).
      #2 - Write data to transaction log files using synchronous writes such that explicit sync operations are not needed.
      LogFlushMethod=2

      #The permanent memory region - this is where your data is stored.
      PermSize = 500

      #If left unspecified, its value is determined from PermSize as follows:Tempsize = 40 MB + ceiling(PermSize / 8 MB)
      TempSize=103



      call ttLogHolds();
      < 1, 5230592, Checkpoint , TT.ds1 >
      < 1, 16435200, Checkpoint , TT.ds0 >
      < 1, 16632552, Replication , TTNODE5:TT:8 >
      < 1, 16632552, Replication , TTNODE3:TT:8 >
      < 1, 16632552, Replication , TTNODE4:TT:1 >
      < 1, 16632552, Replication , TTNODE4:TT:4 >
      < 1, 16632552, Replication , TTNODE4:TT:7 >
      < 1, 16632552, Replication , TTNODE4:TT:10 >
      < 1, 16632552, Replication , TTNODE1:TT:1 >
      < 1, 16632552, Replication , TTNODE1:TT:4 >
      < 1, 16632552, Replication , TTNODE1:TT:7 >
      < 1, 16632552, Replication , TTNODE1:TT:9 >
      < 1, 16632552, Replication , TTNODE1:TT:10 >
      < 1, 16632552, Replication , TTNODE1:TT:11 >
      < 1, 16632552, Replication , TTNODE2:TT:2 >
      < 1, 16632552, Replication , TTNODE2:TT:6 >
      < 1, 16632552, Replication , TTNODE2:TT:8 >
      < 1, 16722664, Replication , TTNODE5:TT:2 >
      < 1, 16722664, Replication , TTNODE5:TT:6 >
      < 1, 16722664, Replication , TTNODE3:TT:2 >
      < 1, 16722664, Replication , TTNODE3:TT:3 >
      < 1, 16722664, Replication , TTNODE3:TT:5 >
      < 1, 16722664, Replication , TTNODE3:TT:6 >
      < 1, 16722664, Replication , TTNODE4:TT:9 >
      < 1, 16722664, Replication , TTNODE4:TT:11 >
      < 1, 16722664, Replication , TTNODE1:TT:8 >
      < 1, 16722664, Replication , TPTTP1:TT:2 >
      < 1, 16722664, Replication , TPTTP1:TT:5 >
      < 1, 16722664, Replication , TPTTP1:TT:9 >
      < 1, 16722664, Replication , TTNODE2:TT:1 >
      < 1, 16722664, Replication , TTNODE2:TT:3 >
      < 1, 16722664, Replication , TTNODE2:TT:4 >
      < 1, 16722664, Replication , TTNODE2:TT:5 >
      < 1, 16722664, Replication , TTNODE2:TT:7 >
      < 1, 16722664, Replication , TTNODE2:TT:9 >
      < 1, 16722664, Replication , TTNODE2:TT:10 >
      < 1, 16722664, Replication , TTNODE2:TT:11 >
      < 1, 16827112, Replication , TTNODE1:TT:0 >
      < 1, 16827112, Replication , TTNODE3:TT:0 >
      < 1, 16827112, Replication , TTNODE4:TT:0 >
      < 1, 16827112, Replication , TTNODE5:TT:0 >
      < 1, 16827112, Replication , TPTTP1:TT:0 >
      < 1, 16827112, Replication , TTNODE2:TT:0 >
      < 1, 16827112, Replication , TTNODE5:TT:1 >
      < 1, 16827112, Replication , TTNODE5:TT:3 >
      < 1, 16827112, Replication , TTNODE5:TT:4 >
      < 1, 16827112, Replication , TTNODE5:TT:5 >
      < 1, 16827112, Replication , TTNODE5:TT:7 >
      < 1, 16827112, Replication , TTNODE5:TT:9 >
      < 1, 16827112, Replication , TTNODE5:TT:10 >
      < 1, 16827112, Replication , TTNODE5:TT:11 >
      < 1, 16827112, Replication , TTNODE3:TT:1 >
      < 1, 16827112, Replication , TTNODE3:TT:4 >
      < 1, 16827112, Replication , TTNODE3:TT:7 >
      < 1, 16827112, Replication , TTNODE3:TT:9 >
      < 1, 16827112, Replication , TTNODE3:TT:10 >
      < 1, 16827112, Replication , TTNODE3:TT:11 >
      < 1, 16827112, Replication , TTNODE4:TT:2 >
      < 1, 16827112, Replication , TTNODE4:TT:3 >
      < 1, 16827112, Replication , TTNODE4:TT:5 >
      < 1, 16827112, Replication , TTNODE4:TT:6 >
      < 1, 16827112, Replication , TTNODE4:TT:8 >
      < 1, 16827112, Replication , TTNODE1:TT:2 >
      < 1, 16827112, Replication , TTNODE1:TT:3 >
      < 1, 16827112, Replication , TTNODE1:TT:5 >
      < 1, 16827112, Replication , TTNODE1:TT:6 >
      < 1, 16827112, Replication , TPTTP1:TT:1 >
      < 1, 16827112, Replication , TPTTP1:TT:3 >
      < 1, 16827112, Replication , TPTTP1:TT:4 >
      < 1, 16827112, Replication , TPTTP1:TT:6 >
      < 1, 16827112, Replication , TPTTP1:TT:7 >
      < 1, 16827112, Replication , TPTTP1:TT:8 >
      < 1, 16827112, Replication , TPTTP1:TT:10 >
      < 1, 16827112, Replication , TPTTP1:TT:11 >
      74 rows found.

      Here's the listing from transaction log dir
      total 4169608
      -rw-rw---- 1 oracle oinstall 201326592 Feb 22 22:44 TT.res0
      -rw-rw---- 1 oracle oinstall 201326592 Feb 22 22:45 TT.res1
      -rw-rw---- 1 oracle oinstall 201326592 Feb 22 22:45 TT.res2
      -rw-rw---- 1 oracle oinstall 201326592 Feb 25 23:24 TT.log8
      -rw-rw---- 1 oracle oinstall 201326592 Feb 26 02:57 TT.log9
      -rw-rw---- 1 oracle oinstall 201326592 Feb 26 06:30 TT.log10
      -rw-rw---- 1 oracle oinstall 201326592 Feb 26 10:02 TT.log11
      -rw-rw---- 1 oracle oinstall 201326592 Feb 26 13:34 TT.log12
      -rw-rw---- 1 oracle oinstall 201326592 Feb 26 17:06 TT.log13
      -rw-rw---- 1 oracle oinstall 201326592 Feb 26 20:38 TT.log14
      -rw-rw---- 1 oracle oinstall 201326592 Feb 27 00:24 TT.log15
      -rw-rw---- 1 oracle oinstall 201326592 Feb 27 04:49 TT.log16
      -rw-rw---- 1 oracle oinstall 201326592 Feb 27 09:13 TT.log17
      -rw-rw---- 1 oracle oinstall 201326592 Feb 27 13:39 TT.log18
      -rw-rw---- 1 oracle oinstall 201326592 Feb 27 18:03 TT.log19
      -rw-rw---- 1 oracle oinstall 201326592 Feb 27 22:30 TT.log20
      -rw-rw---- 1 oracle oinstall 201326592 Feb 28 03:00 TT.log21
      -rw-rw---- 1 oracle oinstall 201326592 Feb 28 07:30 TT.log22
      -rw-rw---- 1 oracle oinstall 201326592 Feb 28 12:01 TT.log23
      -rw-rw---- 1 oracle oinstall 201326592 Feb 28 16:35 TT.log24
      -rw-rw---- 1 oracle oinstall 24639488 Feb 28 17:10 TT.log25
      -rw-rw---- 1 oracle oinstall 201326592 Feb 28 20:05 TT.log0
      -rw-rw---- 1 oracle oinstall 12912640 Feb 28 20:16 TT.log1


      Thanks
      Ved

      Edited by: vedgsingh on Feb 28, 2013 2:08 PM
        • 1. Re: Transaction log building.
          Chrisjenkins-Oracle
          Can you please provide the following:

          1. Full replication scheme definition and DSN definition

          2. Details on workload (how many updates / second etc.)

          3. Info on hardware spec. (number, type and speed of CPU cores, details of disk storage type / layout)

          Thanks,

          Chris
          • 2. Re: Transaction log building.
            930542
            Hi Chris:
            Here's the info that you have asked for.

            Also i have a another contextual question.
            There's lot of ttJdbc-*.so files are getting created under /tmp location on linux. What are those and what can be the cause for these files. They might eat up the file descriptor limit on the node.


            *1. Full replication scheme definition and DSN definition*

            Create replication REQUEST_REP
            element E1 table REQUEST_STORE
            master TT on "TTNODE1"
            SUBSCRIBER TT on "TTNODE2",
            TT on "TTNODE3",
            TT on "TTNODE4",
                                TT on "TTNODE5",
                                TT on "TPTTP1",
            TT on "TPTTP2"
            element E2 table REQUEST_STORE
            master TT on "TTNODE2"
            SUBSCRIBER TT on "TTNODE1",
            TT on "TTNODE3",
            TT on "TTNODE4",
                                TT on "TTNODE5",
                                TT on "TPTTP1",
            TT on "TPTTP2"
            element E3 table REQUEST_STORE
            master TT on "TTNODE3"
            SUBSCRIBER TT on "TTNODE2",
            TT on "TTNODE1",
            TT on "TTNODE4",
                                TT on "TTNODE5",
                                TT on "TPTTP1",
            TT on "TPTTP2"
            element E4 table REQUEST_STORE
            master TT on "TTNODE4"
            SUBSCRIBER TT on "TTNODE2",
            TT on "TTNODE3",
            TT on "TTNODE1",
                                TT on "TTNODE5",
                                TT on "TPTTP1",
            TT on "TPTTP2"
            element E5 table REQUEST_STORE
            master TT on "TTNODE5"
            SUBSCRIBER TT on "TTNODE2",
            TT on "TTNODE3",
            TT on "TTNODE4",
                                TT on "TTNODE1",
                                TT on "TPTTP1",
            TT on "TPTTP2"
            store TT on "TTNODE1"
            port 53589
            store TT on "TTNODE2"
            port 53589
            store TT on "TTNODE3"
            port 53589
            store TT on "TTNODE4"
            port 53589
            store TT on "TTNODE5"
            port 53589
            store TT on "TPTTP1"
            port 53589
            store TT on "TPTTP2"
            port 53589;

            DSN Definition

            [TT]
            Driver=/home/oracle/product/11.2.2.4/TimesTen/tt11224/lib/libtten.so
            DataStore=/home/oracle/product/11.2.2/TimesTen/tt/datastores/TT
            LogDir=/home/oracle/product/11.2.2.4/TimesTen/tt11224/logs

            *2. Details on workload (how many updates / second etc.)*

            We are currently testing with around 30 inserts per second, which we expect to replicate.


            *3. Info on hardware spec. (number, type and speed of CPU cores, details of disk storage type / layout)*

            Architecture - x86_64 x86_64 x86_64 GNU/Linux
            Node Type - Physical
            Processor - 64bit, 24 core
            Model (clock speed) - Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
            RAM - 32GB
            • 3. Re: Transaction log building.
              Chrisjenkins-Oracle
              I don't know what those ttJdbc-*.so files are nor why they are being created. Can you provide and 'ls -l' of the files and also what does the 'file' command say about one of them?

              The DSN definition you provided does not look like the full definition. Many attributes are missing. Is this really the definition you have? Also, I see that you are storing the database transaction log files within the product installation tree. This is a very bad idea. AN uninstall or upgrade of the TimesTen software may well remove those files without warning rendering your database unusable and unrecoverable (unless you have a backup). NEVER store database files within the product install tree.

              The issue here I think is that you have 'over configured' replication. You have 7 datastores involved in the replication scheme and you say that you have specified ReplicationParallelism=12 (though the DSN definition does not show this) This means that each machine will be running 144 replication threads (12 senders and 12 receivers for each peer store). That's a lot of threads to support a replicated throughput of 30 TPS. For that level of throughput you don't need any parallelism and i would suggest reducing the setting to 1 (you will have to drop and re-create the database to change this unfortunately). Also, a log buffer of 192 MB is quite small especially for a replicated system. You should try increasing that to 256 MB or even 512 MB. Then test again and see how things behave.

              Note that if any of the peer stores become unavailable then TimesTen will accumulate logs as the log data is needed to resync the unavailable peer(s) when they become available again. You can prevent this by specifying the LOGTHRESHOLD option on each STORE clause. In that case once the threshold is exceeded that peer is marked as 'failed' and must then be recovered/resync'd by a full 'duplicate' from an authoritative copy.

              Chris

              Edited by: ChrisJenkins on Mar 7, 2013 10:44 AM
              • 4. Re: Transaction log building.
                930542
                Here's the complete DSN definition

                [TT]
                Driver=/home/oracle/product/11.2.2.4/TimesTen/tt11224/lib/libtten.so
                DataStore=/home/oracle/product/11.2.2/TimesTen/tt/datastores/TT
                LogDir=/home/oracle/product/11.2.2.4/TimesTen/tt11224/logs
                DatabaseCharacterSet=AL32UTF8
                #Set to 0, which enables parallel propagation
                ReplicationApplyOrdering=0
                #Should be no greater than half the value of the LogBufParallelism attribute.
                ReplicationParallelism=12
                # Should be euqal to the number of CPU or cores present in the machine.
                LogBufParallelism=24
                # should be 8 * (LogBufParallelism) larger value helps parallelism
                LogBufMB=192
                # should be >= LogBufMB
                LogFileSize=192
                #0 - Write data to the transaction log files using the previously used value.
                #1 (default) - Write data to transaction log files using buffered writes and use explicit sync operations as needed to sync log data to disk (for example with durable commits).
                #2 - Write data to transaction log files using synchronous writes such that explicit sync operations are not needed.
                LogFlushMethod=2
                #The permanent memory region - this is where your data is stored.
                PermSize = 500
                #If left unspecified, its value is determined from PermSize as follows:Tempsize = 40 MB + ceiling(PermSize / 8 MB)
                TempSize=103


                You mentioned that the replicaiton is 'over configured' for the load that we are currently testing. I think we are fine with the 'over-configured' setting provided it is not wrong.

                Our main concern is regarding the transaction log files getting accumulated and filling the disk space, which you said is because of the unavailability of peer and which can changed by using LOGTHRESHOLD option.

                Im going to try that. Also, i will correct the issue you pointed out regarding logs directory.


                Thanks
                Ved
                • 5. Re: Transaction log building.
                  Chrisjenkins-Oracle
                  I would contend that the over configured parallel replication is 'wrong' or at least inadvisable. There are not enough cores in the machine to properly service all these threads and under heavier load you will just be adding unnecessary contention and synchronisation between the threads. This is overhead which brings no benefit and may even actually hurt.

                  When tuning the LogBufMB the objective is that under normal conditions (all peers are available and replication is fully operational) the values of SYS.MONITOR.LOG_BUFFER_WAITS and SYS.MONITOR.LOG_FS_READS are zero and stay at zero. It is acceptable to get a see an occasional, small increment when under heavy load but any significant/increasing non-zero values for these metrics indicate a performance issue.

                  Chris