14 Replies Latest reply: Mar 7, 2012 7:35 AM by 742833

    Crazy cleaner.

    742833
      Hi all,

      I'm having problems with the disk space used by one of our replicated environments (which had been working fine until a few days ago) :-(

      DbSpace returns the following results:
      Server  Size (KB)  % Used  # files with 0% usage
      ------  ---------  ------  ---------------------
      ---01   124521372      19                   1359
      ---02    51682403      39                     18 (MASTER)
      ---03    71227808      29                    454
      ---04    42343268      45                      5
      Why is utilization below the 50% minUtilization setting in every case?
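One way to read the table above: multiplying each node's size by its "% Used" figure gives a rough estimate of the live data it holds. The sketch below does that arithmetic using the four rows reported above. The function names are illustrative and not part of JE, and DbSpace rounds the percentages, so the results are only approximate.

```python
# Rows copied from the DbSpace summary above: (server, size in KB, % used).
DBSPACE_ROWS = [
    ("---01", 124521372, 19),
    ("---02", 51682403, 39),
    ("---03", 71227808, 29),
    ("---04", 42343268, 45),
]

def live_kb(size_kb, pct_used):
    """Approximate live (non-obsolete) data in KB."""
    return size_kb * pct_used / 100

def excess_kb(size_kb, pct_used, min_utilization=50):
    """Approximate KB that would have to be reclaimed for the log to reach
    min_utilization, assuming the amount of live data stays constant."""
    target_size = live_kb(size_kb, pct_used) * 100 / min_utilization
    return max(0.0, size_kb - target_size)

for server, size_kb, pct in DBSPACE_ROWS:
    print(f"{server}: live ~{live_kb(size_kb, pct) / 1024**2:.1f} GiB, "
          f"excess ~{excess_kb(size_kb, pct) / 1024**2:.1f} GiB")
```

Under these assumptions each node holds roughly 18 to 23 GiB of live data, so the nodes differ mainly in how much obsolete data they still keep on disk.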
      Also, the cleaner seems to be running constantly on server 01: each time DbSpace is executed, the following trace is added to je.info.0:
      120227 13:54:51:118 WARNING [apebe01] Cleaner has 12 files not deleted because of read-only processes.
      120227 13:54:56:690 WARNING [apebe01] Cleaner has 19 files not deleted because of read-only processes.
      120227 13:55:08:447 WARNING [apebe01] Cleaner has 42 files not deleted because of read-only processes.
      120227 13:55:18:833 WARNING [apebe01] Cleaner has 56 files not deleted because of read-only processes.
      120227 16:00:01:812 WARNING [apebe01] Cleaner has 1 files not deleted because of read-only processes.
      120227 16:00:19:532 WARNING [apebe01] Cleaner has 4 files not deleted because of read-only processes.
      120227 17:24:44:916 WARNING [apebe01] Cleaner has 2 files not deleted because of read-only processes.
      120227 17:24:55:416 WARNING [apebe01] Cleaner has 4 files not deleted because of read-only processes.
      120227 18:33:21:580 WARNING [apebe01] Cleaner has 5 files not deleted because of read-only processes.
      120227 19:34:33:132 WARNING [apebe01] Cleaner has 3 files not deleted because of read-only processes.
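To see how the number of undeleted files changes over time, the warnings can be pulled out of je.info.0 with a small script. This is a minimal sketch: it assumes the timestamp layout is yyMMdd HH:mm:ss:SSS, as the lines above suggest, and the pattern and function names are illustrative only.

```python
import re
from datetime import datetime

# Matches lines such as:
# 120227 13:54:56:690 WARNING [apebe01] Cleaner has 19 files not deleted because of read-only processes.
PATTERN = re.compile(
    r"^\s*(\d{6} \d{2}:\d{2}:\d{2}:\d{3}) WARNING \[(\S+)\] "
    r"Cleaner has (\d+) files not deleted because of read-only processes\."
)

def parse_cleaner_warnings(lines):
    """Return (timestamp, node, file_count) tuples for matching lines."""
    results = []
    for line in lines:
        m = PATTERN.match(line)
        if m is None:
            continue  # skip anything that is not a cleaner warning
        # Assumed layout: yyMMdd HH:mm:ss:SSS (milliseconds parsed via %f).
        when = datetime.strptime(m.group(1), "%y%m%d %H:%M:%S:%f")
        results.append((when, m.group(2), int(m.group(3))))
    return results

# Three of the lines quoted above, used as sample input.
SAMPLE = [
    "120227 13:54:56:690 WARNING [apebe01] Cleaner has 19 files not deleted because of read-only processes.",
    "120227 13:55:18:833 WARNING [apebe01] Cleaner has 56 files not deleted because of read-only processes.",
    "120227 16:00:01:812 WARNING [apebe01] Cleaner has 1 files not deleted because of read-only processes.",
]

for when, node, count in parse_cleaner_warnings(SAMPLE):
    print(when.isoformat(), node, count)
```

In practice the input would be the lines of je.info.0 rather than the hard-coded sample.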
      The execution environment on all servers is the same:
      h4. je.properties:
      je.cleaner.minUtilization=50
      je.cleaner.expunge=true
      je.cleaner.minFileUtilization=15
      je.maxMemory=536870912
      je.sharedCache=true
      je.rep.feederTimeout=7 s
      .maxClockDelta=60 s
      h4. Java
      java version "1.6.0_30"
      Java(TM) SE Runtime Environment (build 1.6.0_30-b12)
      Java HotSpot(TM) 64-Bit Server VM (build 20.5-b03, mixed mode)
      h5. JVM modifiers
      -XX:ParallelGCThreads=2 -XX:-ReduceInitialCardMarks -d64 -server -XX:+UseCompressedOops -XX:NewSize=131m
      -Xms300m -Xmx2g -XX:PermSize=64m -XX:MaxPermSize=128m -XX:-OmitStackTraceInFastThrow
      OS: Red Hat Enterprise Linux Server release 5.6 (Tikanga)

      The stats are incomprehensible to me, but I have noticed that "cacheTotalBytes" is very low on server 01.
      In addition to that, "cleanerBackLog=0" except for 01 where "cleanerBackLog=1,280".
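To put the cacheTotalBytes observation in numbers, the sketch below compares the value reported for this environment on servers 01 and 02 (taken from the stats posted in this thread) against the configured je.maxMemory. The names are illustrative. Since je.sharedCache=true, the cache is shared with the other environments in the same process, so a small share is not necessarily an error by itself.

```python
JE_MAX_MEMORY = 536_870_912  # je.maxMemory from je.properties above

# cacheTotalBytes as reported in the stats posted for each server.
CACHE_TOTAL_BYTES = {
    "01": 54_678_455,
    "02": 379_845_782,
}

def cache_share(cache_total_bytes, max_memory=JE_MAX_MEMORY):
    """Fraction of the configured JE cache used by this environment."""
    return cache_total_bytes / max_memory

for server, used in CACHE_TOTAL_BYTES.items():
    print(f"server {server}: {cache_share(used):.1%} of je.maxMemory")
```

With these figures the environment uses about 10% of the configured cache on server 01 and about 71% on server 02.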

      Any help?

      BR,
      /César.
        • 1. Re: Crazy cleaner.
          742833
          h2. Server 01 STATS
          Mon Feb 27 15:51:23 CET 2012
          Stats DB_AUDITING_BATCH
          I/O: Log file opens, fsyncs, reads, writes, cache misses.
               bufferBytes=3,145,728
                    Total memory currently consumed by log buffers, in bytes.
               endOfLog=0xe66d3/0x757d5a
                    The location of the next entry to be written to the log.
               nBytesReadFromWriteQueue=0
                    Number of bytes read to fulfill file read operations by reading out of the pending write queue.
               nBytesWrittenFromWriteQueue=864,218,816
                    Number of bytes written from the pending write queue.
               nCacheMiss=1,414,120,243
                    Total number of requests for database objects which were not in memory.
               nFSyncRequests=942,382
                    Number of fsyncs requested through the group commit manager for actions such as transaction commits and checkpoints.
               nFSyncTimeouts=0
                    Number of fsyncs requests submitted to the group commit manager for actions such as transaction commmits and checkpoints which timed out.
               nFSyncs=942,382
                    Number of fsyncs issued through the group commit manager for actions such as transaction commits and checkpoints. A subset of nLogFsyncs.
               nFileOpens=690,404,265
                    Number of times a log file has been opened.
               nLogBuffers=3
                    Number of log buffers currently instantiated.
               nLogFSyncs=1,068,296
                    Total number of fsyncs of the JE log. This includes those fsyncs recorded under the nFsyncs stat
               nNotResident=1,449,517,715
                    Number of request for database objects not contained within the in memory data structure.
               nOpenFiles=100
                    Number of files currently open in the file cache.
               nRandomReadBytes=4,215,305,916,747
                    Number of bytes read which required respositioning the disk head more than 1MB from the previous file position.
               nRandomReads=1,915,374,312
                    Number of disk reads which required respositioning the disk head more than 1MB from the previous file position.
               nRandomWriteBytes=1,082,799,311,212
                    Number of bytes written which required respositioning the disk head more than 1MB from the previous file position.
               nRandomWrites=2,005,856
                    Number of disk writes which required respositioning the disk head by more than 1MB from the previous file position.
               nReadsFromWriteQueue=0
                    Number of file read operations which were fulfilled by reading out of the pending write queue.
               nRepeatFaultReads=120,143,516
                    Number of reads which had to be repeated when faulting in an object from disk because the read chunk size controlled by je.log.faultReadSize is too small.
               nSequentialReadBytes=1,552,661,895,876
                    Number of bytes read which did not require respositioning the disk head more than 1MB from the previous file position.
               nSequentialReads=395,500,298
                    Number of disk reads which did not require respositioning the disk head more than 1MB from the previous file position.
               nSequentialWriteBytes=165,422,173,481
                    Number of bytes written which did not require respositioning the disk head more than 1MB from the previous file position.
               nSequentialWrites=240,807
                    Number of disk writes which did not require respositioning the disk head by more than 1MB from the previous file position.
               nTempBufferWrites=0
                    Number of writes which had to be completed using the temporary marshalling buffer because the fixed size log buffers specified by je.log.totalBufferBytes and je.log.numBuffers were not large enough.
               nWriteQueueOverflow=13
                    Number of write operations which would overflow the Write Queue.
               nWriteQueueOverflowFailures=0
                    Number of write operations which would overflow the Write Queue and could not be queued.
               nWritesFromWriteQueue=4,659
                    Number of file write operations executed from the pending write queue.
          Cache: Current size, allocations, and eviction activity.
               adminBytes=2,691,127
                    Number of bytes of JE cache used for log cleaning metadata and other administrative structure, in bytes.
               avgBatchCACHEMODE=0
                    Average units of work done by one eviction pass. Along with the number of  batch size, it serves as an indicator of what part of the system is doing eviction work.
               avgBatchCRITICAL=0
                    Average units of work done by one eviction pass. Along with the number of  batch size, it serves as an indicator of what part of the system is doing eviction work.
               avgBatchDAEMON=0
                    Average units of work done by one eviction pass. Along with the number of  batch size, it serves as an indicator of what part of the system is doing eviction work.
               avgBatchEVICTORTHREAD=0
                    Average units of work done by one eviction pass. Along with the number of  batch size, it serves as an indicator of what part of the system is doing eviction work.
               avgBatchMANUAL=0
                    Average units of work done by one eviction pass. Along with the number of  batch size, it serves as an indicator of what part of the system is doing eviction work.
               cacheTotalBytes=54,678,455
                    Total amount of JE cache in use, in bytes.
               dataBytes=48,840,928
                    Amount of JE cache used for holding data, keys and internal Btree nodes, in bytes.
               lockBytes=672
                    Number of bytes of JE cache used for holding locks and transactions, in bytes.
               nBINsEvictedCACHEMODE=0
                    Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
               nBINsEvictedCRITICAL=6,001,025
                    Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
               nBINsEvictedDAEMON=214,700,177
                    Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
               nBINsEvictedEVICTORTHREAD=509,456,291
                    Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
               nBINsEvictedMANUAL=0
                    Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
               nBINsFetch=5,529,384,037
                    Number of BINs (bottom internal nodes) requested by btree operations. Can be used to gauge cache hit/miss ratios.
               nBINsFetchMiss=729,692,390
                    Number of BINs (bottom internal nodes) requested by btree operations that were not in cache. Can be used to gauge cache hit/miss ratios.
               nBINsStripped=193,442,670
                    The number of BINs for which the child LNs have been removed (stripped) and are no longer in the cache. BIN stripping is the most efficient form of eviction.
               nBatchesCACHEMODE=0
                    Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
               nBatchesCRITICAL=0
                    Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
               nBatchesDAEMON=0
                    Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
               nBatchesEVICTORTHREAD=0
                    Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
               nBatchesMANUAL=0
                    Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
               nCachedBINs=81,019
                    Number of BINs (bottom internal nodes) in cache. The cache holds INs and BINS, so this indicates the proportion used by each type of node. When used on shared environment caches, will only be visible via StatConfig.setFast(false)
               nCachedUpperINs=28,732
                    Number of upper INs (non-bottom internal nodes) in cache. The cache holds INs and BINS, so this indicates the proportion used by each type of node. When used on shared environment caches, will only be visible via StatConfig.setFast(false)
               nEvictPasses=37,303,328
                    Number of eviction passes, an indicator of the eviction activity level.
               nINCompactKey=70,054
                    Number of INs that use a compact key representation to minimize the key object representation overhead.
               nINNoTarget=12,007
                    Number of INs that use a compact representation when none of its child nodes arein the cache.
               nINSparseTarget=80,313
                    Number of INs that use a compact sparse array representation to point to child nodes in the cache.
               nLNsFetch=4,782,010,129
                    Number of LNs (data records) requested by btree operations. Can be used to gauge cache hit/miss ratios.
               nLNsFetchMiss=84,311,176
                    Number of LNs (data records) requested by btree operations that were not in cache. Can be used to gauge cache hit/miss ratios.
               nNodesEvicted=1,407,436,862
                    Number of nodes selected and removed from the cache.
               nNodesScanned=37,600,911,969
                    Number of nodes scanned in order to select the eviction set, an indicator of eviction overhead.
               nNodesSelected=1,602,055,463
                    Number of nodes which pass the first criteria for eviction, an indicator of eviction efficiency. nNodesExplicitlyEvicted plus nBINsStripped will roughly equal nNodesSelected.  nNodesSelected will be somewhat larger than the sum because some selected nodes don't pass a final screening.
               nRootNodesEvicted=296,896
                    Number of database root nodes evicted.
               nSharedCacheEnvironments=6
                    Number of Environments sharing the cache.
               nThreadUnavailable=392,132,162
                    Number of eviction tasks that were submitted to the background evictor pool, but were refused because all eviction threads were busy. The user may want to change the size of the evictor pool through the je.evictor.*threads properties.
               nUpperINsEvictedCACHEMODE=0
                    Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
               nUpperINsEvictedCRITICAL=5,315,035
                    Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
               nUpperINsEvictedDAEMON=198,722,920
                    Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
               nUpperINsEvictedEVICTORTHREAD=473,896,543
                    Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
               nUpperINsEvictedMANUAL=0
                    Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
               nUpperINsFetch=7,199,693,730
                    Number of Upper INs (non-bottom internal nodes) requested by btree operations. Can be used to gauge cache hit/miss ratios.
               nUpperINsFetchMiss=677,547,101
                    Number of Upper INs (non-bottom internal nodes) requested by btree operations that were not in cache. Can be used to gauge cache hit/miss ratios.
               requiredEvictBytes=0
                    Number of bytes we need to evict in order to get under budget.
               sharedCacheTotalBytes=536,497,201
                    Total amount of the shared JE cache in use, in bytes.
          Cleaning: Frequency and extent of log file cleaning activity.
               cleanerBackLog=1,280
                    Number of files to be cleaned to reach the target utilization.
               fileDeletionBacklog=0
                    Number of files that are ready to be deleted.
               nCleanerDeletions=117,386
                    Number of cleaner file deletions this session.
               nCleanerEntriesRead=3,099,288,490
                    Accumulated number of log entries read by the cleaner.
               nCleanerRuns=2,181,139
                    Number of cleaner runs this session.
               nClusterLNsProcessed=0
                    Accumulated number of LNs processed because they qualify for clustering.
               nINsCleaned=365,756,719
                    Accumulated number of INs cleaned.
               nINsDead=6,051,426
                    Accumulated number of INs that were not found in the tree anymore (deleted).
               nINsMigrated=359,705,293
                    Accumulated number of INs migrated.
               nINsObsolete=1,017,933,003
                    Accumulated number of INs obsolete.
               nLNQueueHits=1,071,160,191
                    Accumulated number of LNs processed without a tree lookup.
               nLNsCleaned=1,665,858,842
                    Accumulated number of LNs cleaned.
               nLNsDead=649
                    Accumulated number of LNs that were not found in the tree anymore (deleted).
               nLNsLocked=44
                    Accumulated number of LNs encountered that were locked.
               nLNsMarked=1,665,858,745
                    Accumulated number of LNs that were marked for migration during cleaning.
               nLNsMigrated=1,665,872,695
                    Accumulated number of LNs that were marked for migration during cleaning.
               nLNsObsolete=45,955,771
                    Accumulated number of LNs obsolete.
               nMarkLNsProcessed=1,665,759,016
                    Accumulated number of LNs processed because they were previously marked for migration.
               nPendingLNsLocked=32
                    Accumulated number of pending LNs that could not be locked for migration because of a long duration application lock.
               nPendingLNsProcessed=76
                    Accumulated number of LNs processed because they were previously locked.
               nRepeatIteratorReads=0
                    Number of attempts to read a log entry larger than the read buffer size during which the log buffer couldn't be grown enough to accommodate the object.
               nToBeCleanedLNsProcessed=71,871
                    Accumulated number of LNs processed because they are soon to be cleaned.
               totalLogSize=114,961,255,513
                    Approximation of the total log size in bytes.
          Node Compression: Removal and compression of internal btree nodes.
               cursorsBins=379
                    Number of BINs encountered by the INComprssor that had cursors referring to them when the compresor ran.
               dbClosedBins=0
                    Number of BINs encountered by the INCompressor that had their database closed between the time they were put on the compressor queue and when the compressor ran.
               inCompQueueSize=0
                    Number of entries in the INCompressor queue when the getStats() call was made.
               nonEmptyBins=0
                    Number of BINs encountered by the INCompressor that were not actually empty when the compressor ran.
               processedBins=333,101
                    Number of BINs that were successfully processed by the INCompressor.
               splitBins=1,868
                    Number of BINs encountered by the INCompressor that were split between the time they were put on the comprssor queue and when the compressor ran.
          Checkpoints: Frequency and extent of checkpointing activity.
               lastCheckpointEnd=0xe66d2/0x6ac22
                    Location in the log of the last checkpoint end.
               lastCheckpointId=223,395
                    Id of the last checkpoint.
               lastCheckpointStart=0xe66d0/0x842ea6
                    Location in the log of the last checkpont start.
               nCheckpoints=17,240
                    Total number of checkpints run so far.
               nDeltaINFlush=222,435,462
                    Accumulated number of Delta INs flushed to the log.
               nFullBINFlush=124,985,010
                    Accumulated number of full BINs flushed to the log.
               nFullINFlush=271,383,073
                    Accumulated number of full INs flushed to the log.
          Environment: General environment wide statistics.
               btreeRelatchesRequired=18,759,022
                    Returns the number of btree latch upgrades required while operating on this Environment. A measurement of contention.
          Locks: Locks held by data operations, latching contention on lock table.
               nLatchAcquireNoWaitUnsuccessful=0
                    Number of unsuccessful acquireNoWait() calls.
               nLatchAcquiresNoWaitSuccessful=0
                    Number of times acquireNoWait() was called when the latch was successfully acquired.
               nLatchAcquiresNoWaiters=0
                    Number of times the latch was acquired without contention.
               nLatchAcquiresSelfOwned=0
                    Number of times the latch was acquired it was already owned by the caller.
               nLatchAcquiresWithContention=0
                    Number of times the latch was acquired when it was already owned by another thread.
               nLatchReleases=0
                    Number of latch releases.
               nOwners=6
                    Number of lock owners in lock table.
               nReadLocks=6
                    Number of read locks currently held.
               nRequests=6,142,322,748
                    Number of times a lock request was made.
               nTotalLocks=6
                    Number of locks current in lock table.
               nWaiters=0
                    Number of transactions waiting for a lock.
               nWaits=78,972
                    Number of times a lock request blocked.
               nWriteLocks=0
                    Number of write locks currently held.
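The fetch counters in the dump above are described as usable "to gauge cache hit/miss ratios". A minimal sketch of that calculation follows, using a few name=value lines copied from the server 01 stats. The parsing helper and its names are assumptions for illustration, not a JE API.

```python
import re

# A few "name=value" lines copied from the server 01 dump above.
STATS_TEXT = """
nBINsFetch=5,529,384,037
nBINsFetchMiss=729,692,390
nLNsFetch=4,782,010,129
nLNsFetchMiss=84,311,176
nUpperINsFetch=7,199,693,730
nUpperINsFetchMiss=677,547,101
"""

def parse_stats(text):
    """Collect integer stats, removing the thousands separators.
    Non-numeric values such as endOfLog=0x... are ignored."""
    stats = {}
    for match in re.finditer(r"^\s*(\w+)=([\d,]+)\s*$", text, re.MULTILINE):
        stats[match.group(1)] = int(match.group(2).replace(",", ""))
    return stats

def miss_ratio(stats, kind):
    """Fraction of fetches of the given node kind that missed the cache."""
    return stats[f"n{kind}FetchMiss"] / stats[f"n{kind}Fetch"]

stats = parse_stats(STATS_TEXT)
for kind in ("BINs", "UpperINs", "LNs"):
    print(f"{kind}: {miss_ratio(stats, kind):.1%} of fetches missed the cache")
```

For server 01 this gives miss ratios of roughly 13% for BINs, 9% for upper INs and 2% for LNs. The same helper can be applied to the other servers' dumps for comparison.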
          • 2. Re: Crazy cleaner.
            742833
            h2. Server 02 STATS
            Mon Feb 27 15:51:52 CET 2012
            Stats DB_AUDITING_BATCH
            I/O: Log file opens, fsyncs, reads, writes, cache misses.
                 bufferBytes=3,145,728
                      Total memory currently consumed by log buffers, in bytes.
                 endOfLog=0xd391e/0x1b6096
                      The location of the next entry to be written to the log.
                 nBytesReadFromWriteQueue=183,513
                      Number of bytes read to fulfill file read operations by reading out of the pending write queue.
                 nBytesWrittenFromWriteQueue=9,885,909,739
                      Number of bytes written from the pending write queue.
                 nCacheMiss=1,781,584,330
                      Total number of requests for database objects which were not in memory.
                 nFSyncRequests=1,326,346
                      Number of fsyncs requested through the group commit manager for actions such as transaction commits and checkpoints.
                 nFSyncTimeouts=14
                      Number of fsyncs requests submitted to the group commit manager for actions such as transaction commmits and checkpoints which timed out.
                 nFSyncs=1,267,106
                      Number of fsyncs issued through the group commit manager for actions such as transaction commits and checkpoints. A subset of nLogFsyncs.
                 nFileOpens=870,803,482
                      Number of times a log file has been opened.
                 nLogBuffers=3
                      Number of log buffers currently instantiated.
                 nLogFSyncs=1,409,716
                      Total number of fsyncs of the JE log. This includes those fsyncs recorded under the nFsyncs stat
                 nNotResident=1,806,109,250
                      Number of request for database objects not contained within the in memory data structure.
                 nOpenFiles=100
                      Number of files currently open in the file cache.
                 nRandomReadBytes=5,249,041,827,686
                      Number of bytes read which required respositioning the disk head more than 1MB from the previous file position.
                 nRandomReads=2,407,476,613
                      Number of disk reads which required respositioning the disk head more than 1MB from the previous file position.
                 nRandomWriteBytes=1,215,780,175,957
                      Number of bytes written which required respositioning the disk head more than 1MB from the previous file position.
                 nRandomWrites=2,421,891
                      Number of disk writes which required respositioning the disk head by more than 1MB from the previous file position.
                 nReadsFromWriteQueue=36
                      Number of file read operations which were fulfilled by reading out of the pending write queue.
                 nRepeatFaultReads=149,208,594
                      Number of reads which had to be repeated when faulting in an object from disk because the read chunk size controlled by je.log.faultReadSize is too small.
                 nSequentialReadBytes=1,959,745,934,637
                      Number of bytes read which did not require respositioning the disk head more than 1MB from the previous file position.
                 nSequentialReads=509,859,300
                      Number of disk reads which did not require respositioning the disk head more than 1MB from the previous file position.
                 nSequentialWriteBytes=203,961,005,641
                      Number of bytes written which did not require respositioning the disk head more than 1MB from the previous file position.
                 nSequentialWrites=380,090
                      Number of disk writes which did not require respositioning the disk head by more than 1MB from the previous file position.
                 nTempBufferWrites=0
                      Number of writes which had to be completed using the temporary marshalling buffer because the fixed size log buffers specified by je.log.totalBufferBytes and je.log.numBuffers were not large enough.
                 nWriteQueueOverflow=147
                      Number of write operations which would overflow the Write Queue.
                 nWriteQueueOverflowFailures=0
                      Number of write operations which would overflow the Write Queue and could not be queued.
                 nWritesFromWriteQueue=320,291
                      Number of file write operations executed from the pending write queue.
            Cache: Current size, allocations, and eviction activity.
                 adminBytes=1,223,838
                      Number of bytes of JE cache used for log cleaning metadata and other administrative structure, in bytes.
                 avgBatchCACHEMODE=0
                      Average units of work done by one eviction pass. Along with the number of  batch size, it serves as an indicator of what part of the system is doing eviction work.
                 avgBatchCRITICAL=67
                      Average units of work done by one eviction pass. Along with the number of  batch size, it serves as an indicator of what part of the system is doing eviction work.
                 avgBatchDAEMON=51
                      Average units of work done by one eviction pass. Along with the number of  batch size, it serves as an indicator of what part of the system is doing eviction work.
                 avgBatchEVICTORTHREAD=40
                      Average units of work done by one eviction pass. Along with the number of  batch size, it serves as an indicator of what part of the system is doing eviction work.
                 avgBatchMANUAL=0
                      Average units of work done by one eviction pass. Along with the number of  batch size, it serves as an indicator of what part of the system is doing eviction work.
                 cacheTotalBytes=379,845,782
                      Total amount of JE cache in use, in bytes.
                 dataBytes=375,475,432
                      Amount of JE cache used for holding data, keys and internal Btree nodes, in bytes.
                 lockBytes=784
                      Number of bytes of JE cache used for holding locks and transactions, in bytes.
                 nBINsEvictedCACHEMODE=0
                      Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                 nBINsEvictedCRITICAL=7,811,227
                      Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                 nBINsEvictedDAEMON=269,119,658
                      Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                 nBINsEvictedEVICTORTHREAD=646,281,903
                      Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                 nBINsEvictedMANUAL=0
                      Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                 nBINsFetch=7,215,199,293
                      Number of BINs (bottom internal nodes) requested by btree operations. Can be used to gauge cache hit/miss ratios.
                 nBINsFetchMiss=922,602,512
                      Number of BINs (bottom internal nodes) requested by btree operations that were not in cache. Can be used to gauge cache hit/miss ratios.
                 nBINsStripped=272,808,371
                      The number of BINs for which the child LNs have been removed (stripped) and are no longer in the cache. BIN stripping is the most efficient form of eviction.
                 nBatchesCACHEMODE=0
                      Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
                 nBatchesCRITICAL=19,024
                      Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
                 nBatchesDAEMON=555,310
                      Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
                 nBatchesEVICTORTHREAD=1,891,528
                      Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
                 nBatchesMANUAL=26
                      Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
                 nCachedBINs=114,998
                      Number of BINs (bottom internal nodes) in cache. The cache holds INs and BINS, so this indicates the proportion used by each type of node. When used on shared environment caches, will only be visible via StatConfig.setFast(false)
                 nCachedUpperINs=64,783
                      Number of upper INs (non-bottom internal nodes) in cache. The cache holds INs and BINS, so this indicates the proportion used by each type of node. When used on shared environment caches, will only be visible via StatConfig.setFast(false)
                 nEvictPasses=47,183,656
                      Number of eviction passes, an indicator of the eviction activity level.
                 nINCompactKey=71,382
                      Number of INs that use a compact key representation to minimize the key object representation overhead.
                 nINNoTarget=7,737
                      Number of INs that use a compact representation when none of their child nodes are in the cache.
                 nINSparseTarget=155,754
                      Number of INs that use a compact sparse array representation to point to child nodes in the cache.
                 nLNsFetch=6,395,324,786
                      Number of LNs (data records) requested by btree operations. Can be used to gauge cache hit/miss ratios.
                 nLNsFetchMiss=93,327,185
                      Number of LNs (data records) requested by btree operations that were not in cache. Can be used to gauge cache hit/miss ratios.
                 nNodesEvicted=1,765,016,019
                      Number of nodes selected and removed from the cache.
                 nNodesScanned=47,941,123,823
                      Number of nodes scanned in order to select the eviction set, an indicator of eviction overhead.
                 nNodesSelected=2,038,593,039
                      Number of nodes which pass the first criteria for eviction, an indicator of eviction efficiency. nNodesExplicitlyEvicted plus nBINsStripped will roughly equal nNodesSelected.  nNodesSelected will be somewhat larger than the sum because some selected nodes don't pass a final screening.
                 nRootNodesEvicted=116,376
                      Number of database root nodes evicted.
                 nSharedCacheEnvironments=6
                      Number of Environments sharing the cache.
                 nThreadUnavailable=496,497,482
                      Number of eviction tasks that were submitted to the background evictor pool, but were refused because all eviction threads were busy. The user may want to change the size of the evictor pool through the je.evictor.*threads properties.
                 nUpperINsEvictedCACHEMODE=0
                      Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                 nUpperINsEvictedCRITICAL=6,703,355
                      Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                 nUpperINsEvictedDAEMON=244,788,403
                      Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                 nUpperINsEvictedEVICTORTHREAD=591,335,785
                      Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                 nUpperINsEvictedMANUAL=0
                      Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                 nUpperINsFetch=9,078,086,877
                      Number of Upper INs (non-bottom internal nodes) requested by btree operations. Can be used to gauge cache hit/miss ratios.
                 nUpperINsFetchMiss=842,299,134
                      Number of Upper INs (non-bottom internal nodes) requested by btree operations that were not in cache. Can be used to gauge cache hit/miss ratios.
                 requiredEvictBytes=0
                      Number of bytes we need to evict in order to get under budget.
                 sharedCacheTotalBytes=536,708,282
                      Total amount of the shared JE cache in use, in bytes.
            Cleaning: Frequency and extent of log file cleaning activity.
                 cleanerBackLog=0
                      Number of files to be cleaned to reach the target utilization.
                 fileDeletionBacklog=0
                      Number of files that are ready to be deleted.
                 nCleanerDeletions=142,052
                      Number of cleaner file deletions this session.
                 nCleanerEntriesRead=3,844,023,699
                      Accumulated number of log entries read by the cleaner.
                 nCleanerRuns=3,246,063
                      Number of cleaner runs this session.
                 nClusterLNsProcessed=0
                      Accumulated number of LNs processed because they qualify for clustering.
                 nINsCleaned=472,083,030
                      Accumulated number of INs cleaned.
                 nINsDead=7,571,954
                      Accumulated number of INs that were not found in the tree anymore (deleted).
                 nINsMigrated=464,511,076
                      Accumulated number of INs migrated.
                 nINsObsolete=1,300,924,876
                      Accumulated number of INs obsolete.
                 nLNQueueHits=1,246,324,115
                      Accumulated number of LNs processed without a tree lookup.
                 nLNsCleaned=2,001,852,660
                      Accumulated number of LNs cleaned.
                 nLNsDead=1,214
                      Accumulated number of LNs that were not found in the tree anymore (deleted).
                 nLNsLocked=14
                      Accumulated number of LNs encountered that were locked.
                 nLNsMarked=2,001,851,973
                      Accumulated number of LNs that were marked for migration during cleaning.
                 nLNsMigrated=2,001,838,059
                      Accumulated number of LNs encountered that were migrated forward in the log.
                 nLNsObsolete=63,880,966
                      Accumulated number of LNs obsolete.
                 nMarkLNsProcessed=2,001,596,325
                      Accumulated number of LNs processed because they were previously marked for migration.
                 nPendingLNsLocked=12
                      Accumulated number of pending LNs that could not be locked for migration because of a long duration application lock.
                 nPendingLNsProcessed=26
                      Accumulated number of LNs processed because they were previously locked.
                 nRepeatIteratorReads=0
                      Number of attempts to read a log entry larger than the read buffer size during which the log buffer couldn't be grown enough to accommodate the object.
                 nToBeCleanedLNsProcessed=171,611
                      Accumulated number of LNs processed because they are soon to be cleaned.
                 totalLogSize=52,840,619,677
                      Approximation of the total log size in bytes.
            Node Compression: Removal and compression of internal btree nodes.
                 cursorsBins=605
                      Number of BINs encountered by the INCompressor that had cursors referring to them when the compressor ran.
                 dbClosedBins=0
                      Number of BINs encountered by the INCompressor that had their database closed between the time they were put on the compressor queue and when the compressor ran.
                 inCompQueueSize=0
                      Number of entries in the INCompressor queue when the getStats() call was made.
                 nonEmptyBins=0
                      Number of BINs encountered by the INCompressor that were not actually empty when the compressor ran.
                 processedBins=395,546
                      Number of BINs that were successfully processed by the INCompressor.
                 splitBins=6,236
                      Number of BINs encountered by the INCompressor that were split between the time they were put on the compressor queue and when the compressor ran.
            Checkpoints: Frequency and extent of checkpointing activity.
                 lastCheckpointEnd=0xd391d/0x67ff11
                      Location in the log of the last checkpoint end.
                 lastCheckpointId=209,233
                      Id of the last checkpoint.
                 lastCheckpointStart=0xd3918/0x908c5
                      Location in the log of the last checkpoint start.
                 nCheckpoints=15,237
                      Total number of checkpoints run so far.
                 nDeltaINFlush=380,892,544
                      Accumulated number of Delta INs flushed to the log.
                 nFullBINFlush=168,122,414
                      Accumulated number of full BINs flushed to the log.
                 nFullINFlush=369,769,094
                      Accumulated number of full INs flushed to the log.
            Environment: General environment wide statistics.
                 btreeRelatchesRequired=11,081,066
                      Returns the number of btree latch upgrades required while operating on this Environment. A measurement of contention.
            Locks: Locks held by data operations, latching contention on lock table.
                 nLatchAcquireNoWaitUnsuccessful=0
                      Number of unsuccessful acquireNoWait() calls.
                 nLatchAcquiresNoWaitSuccessful=0
                      Number of times acquireNoWait() was called when the latch was successfully acquired.
                 nLatchAcquiresNoWaiters=0
                      Number of times the latch was acquired without contention.
                 nLatchAcquiresSelfOwned=0
                      Number of times the latch was acquired when it was already owned by the caller.
                 nLatchAcquiresWithContention=0
                      Number of times the latch was acquired when it was already owned by another thread.
                 nLatchReleases=0
                      Number of latch releases.
                 nOwners=7
                      Number of lock owners in lock table.
                 nReadLocks=7
                      Number of read locks currently held.
                 nRequests=7,927,052,486
                      Number of times a lock request was made.
                 nTotalLocks=7
                      Number of locks currently in the lock table.
                 nWaiters=0
                      Number of transactions waiting for a lock.
                 nWaits=775,221
                      Number of times a lock request blocked.
                 nWriteLocks=0
                      Number of write locks currently held.
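
Several of the descriptions above say the fetch counters "can be used to gauge cache hit/miss ratios". As a rough sketch of that arithmetic, the snippet below divides each miss counter by its fetch counter, using the values copied from the dump above. The class and method names are hypothetical and are not part of the JE API; the numbers are simply pasted in as literals.

```java
public class CacheMissRatios {

    /** Fraction of fetches that missed the cache; 0.0 when nothing was fetched. */
    public static double missRatio(long misses, long fetches) {
        return fetches == 0 ? 0.0 : (double) misses / fetches;
    }

    public static void main(String[] args) {
        // Counter values copied from the stats dump above (commas removed).
        long nUpperINsFetch = 9_078_086_877L, nUpperINsFetchMiss = 842_299_134L;
        long nBINsFetch     = 7_215_199_293L, nBINsFetchMiss     = 922_602_512L;
        long nLNsFetch      = 6_395_324_786L, nLNsFetchMiss      = 93_327_185L;

        System.out.printf("Upper IN miss ratio: %.2f%%%n",
                100 * missRatio(nUpperINsFetchMiss, nUpperINsFetch));
        System.out.printf("BIN miss ratio:      %.2f%%%n",
                100 * missRatio(nBINsFetchMiss, nBINsFetch));
        System.out.printf("LN miss ratio:       %.2f%%%n",
                100 * missRatio(nLNsFetchMiss, nLNsFetch));
    }
}
```

These counters are cumulative, so the ratios describe the whole period covered by the stats rather than the current moment. Comparing two snapshots taken some time apart would give a more recent picture.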
            • 3. Re: Crazy cleaner.
              742833
              Server 03 STATS
              Mon Feb 27 15:52:01 CET 2012
              Stats DB_AUDITING_BATCH
              I/O: Log file opens, fsyncs, reads, writes, cache misses.
                   bufferBytes=3,145,728
                        Total memory currently consumed by log buffers, in bytes.
                   endOfLog=0xe5459/0x137a0
                        The location of the next entry to be written to the log.
                   nBytesReadFromWriteQueue=0
                        Number of bytes read to fulfill file read operations by reading out of the pending write queue.
                   nBytesWrittenFromWriteQueue=1,069,159,423
                        Number of bytes written from the pending write queue.
                   nCacheMiss=1,811,883,381
                        Total number of requests for database objects which were not in memory.
                   nFSyncRequests=1,334,489
                        Number of fsyncs requested through the group commit manager for actions such as transaction commits and checkpoints.
                   nFSyncTimeouts=0
                         Number of fsync requests submitted to the group commit manager for actions such as transaction commits and checkpoints which timed out.
                    nFSyncs=1,334,489
                         Number of fsyncs issued through the group commit manager for actions such as transaction commits and checkpoints. A subset of nLogFSyncs.
                   nFileOpens=858,034,717
                        Number of times a log file has been opened.
                   nLogBuffers=3
                        Number of log buffers currently instantiated.
                   nLogFSyncs=1,491,920
                         Total number of fsyncs of the JE log. This includes those fsyncs recorded under the nFSyncs stat.
                   nNotResident=1,858,384,540
                         Number of requests for database objects not contained within the in-memory data structure.
                   nOpenFiles=100
                        Number of files currently open in the file cache.
                   nRandomReadBytes=5,280,665,634,500
                         Number of bytes read which required repositioning the disk head more than 1MB from the previous file position.
                    nRandomReads=2,417,361,042
                         Number of disk reads which required repositioning the disk head more than 1MB from the previous file position.
                    nRandomWriteBytes=1,344,913,823,518
                         Number of bytes written which required repositioning the disk head more than 1MB from the previous file position.
                    nRandomWrites=2,587,346
                         Number of disk writes which required repositioning the disk head by more than 1MB from the previous file position.
                   nReadsFromWriteQueue=0
                        Number of file read operations which were fulfilled by reading out of the pending write queue.
                   nRepeatFaultReads=157,696,487
                        Number of reads which had to be repeated when faulting in an object from disk because the read chunk size controlled by je.log.faultReadSize is too small.
                   nSequentialReadBytes=2,109,135,886,379
                         Number of bytes read which did not require repositioning the disk head more than 1MB from the previous file position.
                    nSequentialReads=527,317,811
                         Number of disk reads which did not require repositioning the disk head more than 1MB from the previous file position.
                    nSequentialWriteBytes=215,856,082,761
                         Number of bytes written which did not require repositioning the disk head more than 1MB from the previous file position.
                    nSequentialWrites=363,889
                         Number of disk writes which did not require repositioning the disk head by more than 1MB from the previous file position.
                   nTempBufferWrites=0
                        Number of writes which had to be completed using the temporary marshalling buffer because the fixed size log buffers specified by je.log.totalBufferBytes and je.log.numBuffers were not large enough.
                   nWriteQueueOverflow=20
                        Number of write operations which would overflow the Write Queue.
                   nWriteQueueOverflowFailures=0
                        Number of write operations which would overflow the Write Queue and could not be queued.
                   nWritesFromWriteQueue=6,096
                        Number of file write operations executed from the pending write queue.
              Cache: Current size, allocations, and eviction activity.
                   adminBytes=2,485,841
                        Number of bytes of JE cache used for log cleaning metadata and other administrative structure, in bytes.
                   avgBatchCACHEMODE=0
                         Average units of work done by one eviction pass. Along with the number of batches, it serves as an indicator of what part of the system is doing eviction work.
                    avgBatchCRITICAL=61
                         Average units of work done by one eviction pass. Along with the number of batches, it serves as an indicator of what part of the system is doing eviction work.
                    avgBatchDAEMON=57
                         Average units of work done by one eviction pass. Along with the number of batches, it serves as an indicator of what part of the system is doing eviction work.
                    avgBatchEVICTORTHREAD=41
                         Average units of work done by one eviction pass. Along with the number of batches, it serves as an indicator of what part of the system is doing eviction work.
                    avgBatchMANUAL=0
                         Average units of work done by one eviction pass. Along with the number of batches, it serves as an indicator of what part of the system is doing eviction work.
                   cacheTotalBytes=382,171,441
                        Total amount of JE cache in use, in bytes.
                   dataBytes=376,539,200
                        Amount of JE cache used for holding data, keys and internal Btree nodes, in bytes.
                   lockBytes=672
                        Number of bytes of JE cache used for holding locks and transactions, in bytes.
                   nBINsEvictedCACHEMODE=0
                        Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                   nBINsEvictedCRITICAL=7,863,081
                        Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                   nBINsEvictedDAEMON=269,527,417
                        Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                   nBINsEvictedEVICTORTHREAD=645,272,865
                        Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                   nBINsEvictedMANUAL=0
                        Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                   nBINsFetch=7,122,772,767
                        Number of BINs (bottom internal nodes) requested by btree operations. Can be used to gauge cache hit/miss ratios.
                   nBINsFetchMiss=922,057,384
                        Number of BINs (bottom internal nodes) requested by btree operations that were not in cache. Can be used to gauge cache hit/miss ratios.
                   nBINsStripped=269,273,325
                        The number of BINs for which the child LNs have been removed (stripped) and are no longer in the cache. BIN stripping is the most efficient form of eviction.
                   nBatchesCACHEMODE=0
                        Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
                   nBatchesCRITICAL=40,809
                        Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
                   nBatchesDAEMON=617,189
                        Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
                   nBatchesEVICTORTHREAD=2,004,084
                        Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
                   nBatchesMANUAL=10,025
                        Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
                   nCachedBINs=110,492
                        Number of BINs (bottom internal nodes) in cache. The cache holds INs and BINS, so this indicates the proportion used by each type of node. When used on shared environment caches, will only be visible via StatConfig.setFast(false)
                   nCachedUpperINs=58,418
                        Number of upper INs (non-bottom internal nodes) in cache. The cache holds INs and BINS, so this indicates the proportion used by each type of node. When used on shared environment caches, will only be visible via StatConfig.setFast(false)
                   nEvictPasses=45,915,136
                        Number of eviction passes, an indicator of the eviction activity level.
                   nINCompactKey=73,563
                        Number of INs that use a compact key representation to minimize the key object representation overhead.
                   nINNoTarget=7,854
                         Number of INs that use a compact representation when none of their child nodes are in the cache.
                   nINSparseTarget=145,426
                        Number of INs that use a compact sparse array representation to point to child nodes in the cache.
                   nLNsFetch=6,514,138,635
                        Number of LNs (data records) requested by btree operations. Can be used to gauge cache hit/miss ratios.
                   nLNsFetchMiss=152,765,426
                        Number of LNs (data records) requested by btree operations that were not in cache. Can be used to gauge cache hit/miss ratios.
                   nNodesEvicted=1,768,012,592
                        Number of nodes selected and removed from the cache.
                   nNodesScanned=47,421,547,353
                        Number of nodes scanned in order to select the eviction set, an indicator of eviction overhead.
                   nNodesSelected=2,038,180,948
                        Number of nodes which pass the first criteria for eviction, an indicator of eviction efficiency. nNodesExplicitlyEvicted plus nBINsStripped will roughly equal nNodesSelected.  nNodesSelected will be somewhat larger than the sum because some selected nodes don't pass a final screening.
                   nRootNodesEvicted=160,736
                        Number of database root nodes evicted.
                   nSharedCacheEnvironments=6
                        Number of Environments sharing the cache.
                   nThreadUnavailable=476,106,613
                        Number of eviction tasks that were submitted to the background evictor pool, but were refused because all eviction threads were busy. The user may want to change the size of the evictor pool through the je.evictor.*threads properties.
                   nUpperINsEvictedCACHEMODE=0
                        Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                   nUpperINsEvictedCRITICAL=6,765,033
                        Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                   nUpperINsEvictedDAEMON=246,015,855
                        Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                   nUpperINsEvictedEVICTORTHREAD=593,170,452
                        Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                   nUpperINsEvictedMANUAL=0
                        Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                   nUpperINsFetch=9,040,056,083
                        Number of Upper INs (non-bottom internal nodes) requested by btree operations. Can be used to gauge cache hit/miss ratios.
                   nUpperINsFetchMiss=845,438,951
                        Number of Upper INs (non-bottom internal nodes) requested by btree operations that were not in cache. Can be used to gauge cache hit/miss ratios.
                   requiredEvictBytes=0
                        Number of bytes we need to evict in order to get under budget.
                   sharedCacheTotalBytes=536,528,039
                        Total amount of the shared JE cache in use, in bytes.
              Cleaning: Frequency and extent of log file cleaning activity.
                   cleanerBackLog=0
                        Number of files to be cleaned to reach the target utilization.
                   fileDeletionBacklog=0
                        Number of files that are ready to be deleted.
                   nCleanerDeletions=155,155
                        Number of cleaner file deletions this session.
                   nCleanerEntriesRead=3,773,433,389
                        Accumulated number of log entries read by the cleaner.
                   nCleanerRuns=3,386,090
                        Number of cleaner runs this session.
                   nClusterLNsProcessed=0
                        Accumulated number of LNs processed because they qualify for clustering.
                   nINsCleaned=470,188,903
                        Accumulated number of INs cleaned.
                   nINsDead=7,165,528
                        Accumulated number of INs that were not found in the tree anymore (deleted).
                   nINsMigrated=463,023,375
                        Accumulated number of INs migrated.
                   nINsObsolete=1,313,995,211
                        Accumulated number of INs obsolete.
                   nLNQueueHits=1,163,077,722
                        Accumulated number of LNs processed without a tree lookup.
                   nLNsCleaned=1,892,202,142
                        Accumulated number of LNs cleaned.
                   nLNsDead=1,852
                        Accumulated number of LNs that were not found in the tree anymore (deleted).
                   nLNsLocked=52
                        Accumulated number of LNs encountered that were locked.
                   nLNsMarked=1,892,201,349
                        Accumulated number of LNs that were marked for migration during cleaning.
                   nLNsMigrated=1,892,514,329
                         Accumulated number of LNs encountered that were migrated forward in the log.
                   nLNsObsolete=91,518,548
                        Accumulated number of LNs obsolete.
                   nMarkLNsProcessed=1,892,046,718
                        Accumulated number of LNs processed because they were previously marked for migration.
                   nPendingLNsLocked=24
                        Accumulated number of pending LNs that could not be locked for migration because of a long duration application lock.
                   nPendingLNsProcessed=76
                        Accumulated number of LNs processed because they were previously locked.
                   nRepeatIteratorReads=0
                        Number of attempts to read a log entry larger than the read buffer size during which the log buffer couldn't be grown enough to accommodate the object.
                   nToBeCleanedLNsProcessed=419,244
                        Accumulated number of LNs processed because they are soon to be cleaned.
                   totalLogSize=79,491,112,456
                        Approximation of the total log size in bytes.
              Node Compression: Removal and compression of internal btree nodes.
                   cursorsBins=638
                         Number of BINs encountered by the INCompressor that had cursors referring to them when the compressor ran.
                   dbClosedBins=0
                        Number of BINs encountered by the INCompressor that had their database closed between the time they were put on the compressor queue and when the compressor ran.
                   inCompQueueSize=0
                        Number of entries in the INCompressor queue when the getStats() call was made.
                   nonEmptyBins=0
                        Number of BINs encountered by the INCompressor that were not actually empty when the compressor ran.
                   processedBins=595,218
                        Number of BINs that were successfully processed by the INCompressor.
                   splitBins=6,851
                         Number of BINs encountered by the INCompressor that were split between the time they were put on the compressor queue and when the compressor ran.
              Checkpoints: Frequency and extent of checkpointing activity.
                   lastCheckpointEnd=0xe5457/0x168a2
                        Location in the log of the last checkpoint end.
                   lastCheckpointId=227,949
                        Id of the last checkpoint.
                   lastCheckpointStart=0xe5452/0x68753d
                         Location in the log of the last checkpoint start.
                    nCheckpoints=23,593
                         Total number of checkpoints run so far.
                   nDeltaINFlush=408,818,421
                        Accumulated number of Delta INs flushed to the log.
                   nFullBINFlush=164,969,053
                        Accumulated number of full BINs flushed to the log.
                   nFullINFlush=368,253,179
                        Accumulated number of full INs flushed to the log.
              Environment: General environment wide statistics.
                   btreeRelatchesRequired=21,373,308
                        Returns the number of btree latch upgrades required while operating on this Environment. A measurement of contention.
              Locks: Locks held by data operations, latching contention on lock table.
                   nLatchAcquireNoWaitUnsuccessful=0
                        Number of unsuccessful acquireNoWait() calls.
                   nLatchAcquiresNoWaitSuccessful=0
                        Number of times acquireNoWait() was called when the latch was successfully acquired.
                   nLatchAcquiresNoWaiters=0
                        Number of times the latch was acquired without contention.
                   nLatchAcquiresSelfOwned=0
                        Number of times the latch was acquired when it was already owned by the caller.
                   nLatchAcquiresWithContention=0
                        Number of times the latch was acquired when it was already owned by another thread.
                   nLatchReleases=0
                        Number of latch releases.
                   nOwners=6
                        Number of lock owners in lock table.
                   nReadLocks=6
                        Number of read locks currently held.
                   nRequests=7,912,292,009
                        Number of times a lock request was made.
                   nTotalLocks=6
                        Number of locks currently in lock table.
                   nWaiters=0
                        Number of transactions waiting for a lock.
                   nWaits=77,530
                        Number of times a lock request blocked.
                   nWriteLocks=0
                        Number of write locks currently held.
              • 4. Re: Crazy cleaner.
                742833
                h3. Server 04 STATS
                Mon Feb 27 15:52:14 CET 2012
                Stats DB_AUDITING_BATCH
                I/O: Log file opens, fsyncs, reads, writes, cache misses.
                     bufferBytes=3,145,728
                          Total memory currently consumed by log buffers, in bytes.
                     endOfLog=0xdefa6/0x413b2e
                          The location of the next entry to be written to the log.
                     nBytesReadFromWriteQueue=0
                          Number of bytes read to fulfill file read operations by reading out of the pending write queue.
                     nBytesWrittenFromWriteQueue=484,838,374
                          Number of bytes written from the pending write queue.
                     nCacheMiss=1,683,469,403
                          Total number of requests for database objects which were not in memory.
                     nFSyncRequests=1,323,886
                          Number of fsyncs requested through the group commit manager for actions such as transaction commits and checkpoints.
                     nFSyncTimeouts=0
                          Number of fsync requests submitted to the group commit manager for actions such as transaction commits and checkpoints which timed out.
                     nFSyncs=1,323,886
                          Number of fsyncs issued through the group commit manager for actions such as transaction commits and checkpoints. A subset of nLogFsyncs.
                     nFileOpens=826,050,490
                          Number of times a log file has been opened.
                     nLogBuffers=3
                          Number of log buffers currently instantiated.
                     nLogFSyncs=1,454,976
                          Total number of fsyncs of the JE log. This includes those fsyncs recorded under the nFsyncs stat
                     nNotResident=1,699,547,893
                          Number of requests for database objects not contained within the in memory data structure.
                     nOpenFiles=100
                          Number of files currently open in the file cache.
                     nRandomReadBytes=4,952,713,760,359
                          Number of bytes read which required repositioning the disk head more than 1MB from the previous file position.
                     nRandomReads=2,272,938,670
                          Number of disk reads which required repositioning the disk head more than 1MB from the previous file position.
                     nRandomWriteBytes=1,106,436,506,119
                          Number of bytes written which required repositioning the disk head more than 1MB from the previous file position.
                     nRandomWrites=2,307,119
                          Number of disk writes which required repositioning the disk head by more than 1MB from the previous file position.
                     nReadsFromWriteQueue=0
                          Number of file read operations which were fulfilled by reading out of the pending write queue.
                     nRepeatFaultReads=134,292,606
                          Number of reads which had to be repeated when faulting in an object from disk because the read chunk size controlled by je.log.faultReadSize is too small.
                     nSequentialReadBytes=1,824,718,286,374
                          Number of bytes read which did not require repositioning the disk head more than 1MB from the previous file position.
                     nSequentialReads=480,606,919
                          Number of disk reads which did not require repositioning the disk head more than 1MB from the previous file position.
                     nSequentialWriteBytes=191,523,080,198
                          Number of bytes written which did not require repositioning the disk head more than 1MB from the previous file position.
                     nSequentialWrites=361,465
                          Number of disk writes which did not require repositioning the disk head by more than 1MB from the previous file position.
                     nTempBufferWrites=0
                          Number of writes which had to be completed using the temporary marshalling buffer because the fixed size log buffers specified by je.log.totalBufferBytes and je.log.numBuffers were not large enough.
                     nWriteQueueOverflow=16
                          Number of write operations which would overflow the Write Queue.
                     nWriteQueueOverflowFailures=0
                          Number of write operations which would overflow the Write Queue and could not be queued.
                     nWritesFromWriteQueue=3,463
                          Number of file write operations executed from the pending write queue.
                Cache: Current size, allocations, and eviction activity.
                     adminBytes=1,651,995
                          Number of bytes of JE cache used for log cleaning metadata and other administrative structure, in bytes.
                     avgBatchCACHEMODE=0
                          Average units of work done by one eviction pass. Along with the number of  batch size, it serves as an indicator of what part of the system is doing eviction work.
                     avgBatchCRITICAL=70
                          Average units of work done by one eviction pass. Along with the number of  batch size, it serves as an indicator of what part of the system is doing eviction work.
                     avgBatchDAEMON=58
                          Average units of work done by one eviction pass. Along with the number of  batch size, it serves as an indicator of what part of the system is doing eviction work.
                     avgBatchEVICTORTHREAD=42
                          Average units of work done by one eviction pass. Along with the number of  batch size, it serves as an indicator of what part of the system is doing eviction work.
                     avgBatchMANUAL=0
                          Average units of work done by one eviction pass. Along with the number of  batch size, it serves as an indicator of what part of the system is doing eviction work.
                     cacheTotalBytes=467,488,395
                          Total amount of JE cache in use, in bytes.
                     dataBytes=462,690,000
                          Amount of JE cache used for holding data, keys and internal Btree nodes, in bytes.
                     lockBytes=672
                          Number of bytes of JE cache used for holding locks and transactions, in bytes.
                     nBINsEvictedCACHEMODE=0
                          Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                     nBINsEvictedCRITICAL=5,467,070
                          Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                     nBINsEvictedDAEMON=265,929,388
                          Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                     nBINsEvictedEVICTORTHREAD=592,285,739
                          Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                     nBINsEvictedMANUAL=0
                          Number of BINs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                     nBINsFetch=6,907,486,593
                          Number of BINs (bottom internal nodes) requested by btree operations. Can be used to gauge cache hit/miss ratios.
                     nBINsFetchMiss=863,085,702
                          Number of BINs (bottom internal nodes) requested by btree operations that were not in cache. Can be used to gauge cache hit/miss ratios.
                     nBINsStripped=270,120,849
                          The number of BINs for which the child LNs have been removed (stripped) and are no longer in the cache. BIN stripping is the most efficient form of eviction.
                     nBatchesCACHEMODE=0
                          Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
                     nBatchesCRITICAL=15,959
                          Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
                     nBatchesDAEMON=543,386
                          Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
                     nBatchesEVICTORTHREAD=1,847,061
                          Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
                     nBatchesMANUAL=42,460
                          Number of attempts to evict, by type of evictor. Along with average batch size, it serves as an indicator of what part of the system is doing eviction work.
                     nCachedBINs=108,845
                          Number of BINs (bottom internal nodes) in cache. The cache holds INs and BINS, so this indicates the proportion used by each type of node. When used on shared environment caches, will only be visible via StatConfig.setFast(false)
                     nCachedUpperINs=69,408
                          Number of upper INs (non-bottom internal nodes) in cache. The cache holds INs and BINS, so this indicates the proportion used by each type of node. When used on shared environment caches, will only be visible via StatConfig.setFast(false)
                     nEvictPasses=41,518,332
                          Number of eviction passes, an indicator of the eviction activity level.
                     nINCompactKey=63,598
                          Number of INs that use a compact key representation to minimize the key object representation overhead.
                     nINNoTarget=7,541
                          Number of INs that use a compact representation when none of its child nodes are in the cache.
                     nINSparseTarget=153,213
                          Number of INs that use a compact sparse array representation to point to child nodes in the cache.
                     nLNsFetch=6,167,763,923
                          Number of LNs (data records) requested by btree operations. Can be used to gauge cache hit/miss ratios.
                     nLNsFetchMiss=73,377,405
                          Number of LNs (data records) requested by btree operations that were not in cache. Can be used to gauge cache hit/miss ratios.
                     nNodesEvicted=1,660,987,900
                          Number of nodes selected and removed from the cache.
                     nNodesScanned=45,338,293,510
                          Number of nodes scanned in order to select the eviction set, an indicator of eviction overhead.
                     nNodesSelected=1,931,875,355
                          Number of nodes which pass the first criteria for eviction, an indicator of eviction efficiency. nNodesExplicitlyEvicted plus nBINsStripped will roughly equal nNodesSelected.  nNodesSelected will be somewhat larger than the sum because some selected nodes don't pass a final screening.
                     nRootNodesEvicted=83,630
                          Number of database root nodes evicted.
                     nSharedCacheEnvironments=6
                          Number of Environments sharing the cache.
                     nThreadUnavailable=492,142,472
                          Number of eviction tasks that were submitted to the background evictor pool, but were refused because all eviction threads were busy. The user may want to change the size of the evictor pool through the je.evictor.*threads properties.
                     nUpperINsEvictedCACHEMODE=0
                          Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                     nUpperINsEvictedCRITICAL=4,663,191
                          Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                     nUpperINsEvictedDAEMON=245,062,939
                          Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                     nUpperINsEvictedEVICTORTHREAD=548,339,957
                          Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                     nUpperINsEvictedMANUAL=0
                          Number of upper INs evicted from the cache, using the specified eviction source. As a subset of nNodesEvicted, it is an indicator of what eviction is targeting and the activity that is instigating eviction
                     nUpperINsFetch=8,539,296,451
                          Number of Upper INs (non-bottom internal nodes) requested by btree operations. Can be used to gauge cache hit/miss ratios.
                     nUpperINsFetchMiss=797,529,112
                          Number of Upper INs (non-bottom internal nodes) requested by btree operations that were not in cache. Can be used to gauge cache hit/miss ratios.
                     requiredEvictBytes=0
                          Number of bytes we need to evict in order to get under budget.
                     sharedCacheTotalBytes=536,825,368
                          Total amount of the shared JE cache in use, in bytes.
                Cleaning: Frequency and extent of log file cleaning activity.
                     cleanerBackLog=0
                          Number of files to be cleaned to reach the target utilization.
                     fileDeletionBacklog=0
                          Number of files that are ready to be deleted.
                     nCleanerDeletions=132,365
                          Number of cleaner file deletions this session.
                     nCleanerEntriesRead=3,633,554,813
                          Accumulated number of log entries read by the cleaner.
                     nCleanerRuns=3,452,927
                          Number of cleaner runs this session.
                     nClusterLNsProcessed=0
                          Accumulated number of LNs processed because they qualify for clustering.
                     nINsCleaned=437,809,519
                          Accumulated number of INs cleaned.
                     nINsDead=7,204,485
                          Accumulated number of INs that were not found in the tree anymore (deleted).
                     nINsMigrated=430,605,034
                          Accumulated number of INs migrated.
                     nINsObsolete=1,240,767,192
                          Accumulated number of INs obsolete.
                     nLNQueueHits=1,159,659,086
                          Accumulated number of LNs processed without a tree lookup.
                     nLNsCleaned=1,884,878,564
                          Accumulated number of LNs cleaned.
                     nLNsDead=1,173
                          Accumulated number of LNs that were not found in the tree anymore (deleted).
                     nLNsLocked=20
                          Accumulated number of LNs encountered that were locked.
                     nLNsMarked=1,884,878,242
                          Accumulated number of LNs that were marked for migration during cleaning.
                     nLNsMigrated=1,885,183,463
                          Accumulated number of LNs that were marked for migration during cleaning.
                     nLNsObsolete=64,619,747
                          Accumulated number of LNs obsolete.
                     nMarkLNsProcessed=1,884,600,040
                          Accumulated number of LNs processed because they were previously marked for migration.
                     nPendingLNsLocked=41
                          Accumulated number of pending LNs that could not be locked for migration because of a long duration application lock.
                     nPendingLNsProcessed=61
                          Accumulated number of LNs processed because they were previously locked.
                     nRepeatIteratorReads=0
                          Number of attempts to read a log entry larger than the read buffer size during which the log buffer couldn't be grown enough to accommodate the object.
                     nToBeCleanedLNsProcessed=524,426
                          Accumulated number of LNs processed because they are soon to be cleaned.
                     totalLogSize=45,376,780,174
                          Approximation of the total log size in bytes.
                Node Compression: Removal and compression of internal btree nodes.
                     cursorsBins=570
                          Number of BINs encountered by the INCompressor that had cursors referring to them when the compressor ran.
                     dbClosedBins=0
                          Number of BINs encountered by the INCompressor that had their database closed between the time they were put on the compressor queue and when the compressor ran.
                     inCompQueueSize=0
                          Number of entries in the INCompressor queue when the getStats() call was made.
                     nonEmptyBins=0
                          Number of BINs encountered by the INCompressor that were not actually empty when the compressor ran.
                     processedBins=352,627
                          Number of BINs that were successfully processed by the INCompressor.
                     splitBins=5,195
                          Number of BINs encountered by the INCompressor that were split between the time they were put on the compressor queue and when the compressor ran.
                Checkpoints: Frequency and extent of checkpointing activity.
                     lastCheckpointEnd=0xdefa4/0x471f7b
                          Location in the log of the last checkpoint end.
                     lastCheckpointId=217,540
                          Id of the last checkpoint.
                     lastCheckpointStart=0xdef9f/0x4b7e5d
                          Location in the log of the last checkpoint start.
                     nCheckpoints=12,939
                          Total number of checkpoints run so far.
                     nDeltaINFlush=450,406,895
                          Accumulated number of Delta INs flushed to the log.
                     nFullBINFlush=166,108,909
                          Accumulated number of full BINs flushed to the log.
                     nFullINFlush=364,925,825
                          Accumulated number of full INs flushed to the log.
                Environment: General environment wide statistics.
                     btreeRelatchesRequired=6,907,280
                          Returns the number of btree latch upgrades required while operating on this Environment. A measurement of contention.
                Locks: Locks held by data operations, latching contention on lock table.
                     nLatchAcquireNoWaitUnsuccessful=0
                          Number of unsuccessful acquireNoWait() calls.
                     nLatchAcquiresNoWaitSuccessful=0
                          Number of times acquireNoWait() was called when the latch was successfully acquired.
                     nLatchAcquiresNoWaiters=0
                          Number of times the latch was acquired without contention.
                     nLatchAcquiresSelfOwned=0
                          Number of times the latch was acquired when it was already owned by the caller.
                     nLatchAcquiresWithContention=0
                          Number of times the latch was acquired when it was already owned by another thread.
                     nLatchReleases=0
                          Number of latch releases.
                     nOwners=6
                          Number of lock owners in lock table.
                     nReadLocks=6
                          Number of read locks currently held.
                     nRequests=7,689,870,612
                          Number of times a lock request was made.
                     nTotalLocks=6
                          Number of locks currently in lock table.
                     nWaiters=0
                          Number of transactions waiting for a lock.
                     nWaits=36,615
                          Number of times a lock request blocked.
                     nWriteLocks=0
                          Number of write locks currently held.
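
The stat descriptions above note that the Fetch and FetchMiss counters "can be used to gauge cache hit/miss ratios". A minimal sketch of that calculation follows, using the Server 04 figures from this dump. The helper names `parse_stats` and `miss_ratio` are made up for illustration and are not part of JE; the parser only assumes the `name=value` layout shown in the dump. With these numbers the miss ratios come out at roughly 12.5% for BINs, 1.2% for LNs and 9.3% for upper INs.

```python
def parse_stats(text):
    """Parse 'name=value' lines from a stats dump into a dict of ints.

    Description lines (no '=') and non-numeric values such as log
    positions (0xdefa6/0x413b2e) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition("=")
        if not sep:
            continue
        digits = value.replace(",", "")  # counters are printed with thousands separators
        if digits.isdigit():
            stats[name] = int(digits)
    return stats


def miss_ratio(stats, kind):
    """Return <kind>FetchMiss / <kind>Fetch, e.g. kind='nBINs'."""
    fetches = stats[kind + "Fetch"]
    return stats[kind + "FetchMiss"] / fetches if fetches else 0.0


# Values copied from the Server 04 dump above.
SERVER_04 = """
nBINsFetch=6,907,486,593
nBINsFetchMiss=863,085,702
nLNsFetch=6,167,763,923
nLNsFetchMiss=73,377,405
nUpperINsFetch=8,539,296,451
nUpperINsFetchMiss=797,529,112
"""

if __name__ == "__main__":
    stats = parse_stats(SERVER_04)
    for kind in ("nBINs", "nLNs", "nUpperINs"):
        print(f"{kind}: {miss_ratio(stats, kind):.2%} misses")
```

Running the same calculation on each server's dump makes it easier to compare nodes than reading the raw counters side by side. The counters are cumulative, so the ratio reflects the whole session rather than the current moment.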
                • 5. Re: Crazy cleaner.
                  Linda Lee-Oracle
                  Cesar,

                  It certainly seems that your JE HA system has become unbalanced, and that server 01 is lagging behind. There are two pieces of information that jump out:

                  - Server 01 does seem to be running with a much smaller cache. We can also see from the cleaner backlog stat that the cleaner is having trouble keeping up: the system cannot clean underutilized log files as fast as they are being created. An overly small cache setting can lead to inefficient cleaning, so it seems likely that the small cache is causing the cleaning problem.

                  - These messages, "Cleaner has 3 files not deleted because of read-only processes", seem to be saying that a read-only process is inhibiting log file deletion. The process seems to be intermittent, because the file deletion backlog value goes up and down. If you do not expect a read-only process to be using this environment, this is certainly unexpected.

                  All the JE instances have the same max memory setting, but they are using a shared cache. Could the different environments on server 01 have gotten out of sync, so that one environment is being starved for cache?

                  In summary, my guess is that the cleaning problem is due to some kind of cache shortage, and that between inefficient cleaning and cache misses, the server 01 environment is struggling. A starting point is to look for something, such as a read-only process or a shared environment, that is somehow depriving that replication node of memory.

                  Linda

                  • 6. Re: Crazy cleaner.
                    742833
                    Hi Linda

                    After a rolling restart the problem seems to be fixed but I don't understand why :-(
                    Server: 01
                       Total Usage reported by DbSpace: 47%
                       Number of files used at 0%.....: 8
                       Number of files < 15% usage....: 113
                       cacheTotalBytes................: 480,880,925
                       cleanerBackLog.................: 0
                    Server: 02
                       Total Usage reported by DbSpace: 45%
                       Number of files used at 0%.....: 0
                       Number of files < 15% usage....: 320
                       cacheTotalBytes................: 487,609,211
                       cleanerBackLog.................: 0
                    Server: 03
                       Total Usage reported by DbSpace: 46%
                       Number of files used at 0%.....: 7
                       Number of files < 15% usage....: 110
                       cacheTotalBytes................: 487,895,834
                       cleanerBackLog.................: 0
                    Server: 04
                       Total Usage reported by DbSpace: 47%
                       Number of files used at 0%.....: 5
                       Number of files < 15% usage....: 112
                       cacheTotalBytes................: 472,006,811
                       cleanerBackLog.................: 0
                    In fact, reviewing older statistics we have found the same pattern but with Server03 instead of Server01, so I'm reviewing our internal stats to determine if requests are being forwarded equally to all the replicas.

                    Anyway, the "Total Usage" is close to 50% (je.cleaner.minUtilization=50) but:
                    A) Is it right to have files used at 0%?
                    B) Is it right to have files below 15% (je.cleaner.minFileUtilization=15)?
                    C) What is the connection between "cacheTotalBytes" and the cleaner? I mean, if for some reason one replica does not perform any read operations and is only used as a backup mechanism, does that mean the cleaner will not work properly?

                    Finally, regarding the messages "Cleaner has 3 files not deleted because of read-only processes" and the "read only process that is inhibiting log file deletion": note that the message only appears when DbSpace is executed.

                    BR,
                    /César.

                    Edited by: Cesar Alvarez on 29-feb-2012 18:15
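
The per-server comparison above can be automated. The sketch below checks the two signals discussed in this thread, a cache much smaller than on the other nodes and a non-zero cleaner backlog, against the post-restart figures. The function name `suspicious_nodes` and the 50% threshold `min_cache_fraction` are arbitrary choices for illustration, not JE settings or recommendations.

```python
from statistics import median

# Post-restart figures copied from the listing above.
NODES = {
    "01": {"cacheTotalBytes": 480_880_925, "cleanerBackLog": 0},
    "02": {"cacheTotalBytes": 487_609_211, "cleanerBackLog": 0},
    "03": {"cacheTotalBytes": 487_895_834, "cleanerBackLog": 0},
    "04": {"cacheTotalBytes": 472_006_811, "cleanerBackLog": 0},
}


def suspicious_nodes(nodes, min_cache_fraction=0.5):
    """Return {server: [reasons]} for nodes that look out of balance.

    min_cache_fraction is an arbitrary illustrative threshold: a node is
    flagged when its cache is below that fraction of the group median.
    """
    typical = median(n["cacheTotalBytes"] for n in nodes.values())
    flagged = {}
    for name, n in sorted(nodes.items()):
        reasons = []
        if n["cacheTotalBytes"] < min_cache_fraction * typical:
            reasons.append("small cache")
        if n["cleanerBackLog"] > 0:
            reasons.append("cleaner backlog")
        if reasons:
            flagged[name] = reasons
    return flagged


if __name__ == "__main__":
    print(suspicious_nodes(NODES))
```

With the post-restart numbers nothing is flagged, which matches the observation that the nodes are balanced again. Feeding in periodic snapshots would show when one node starts to drift.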
                    • 7. Re: Crazy cleaner.
                      Greybird-Oracle
                      Hi Cesar,

                      >
                      Anyway, the "Total Usage" is close to 50% (je.cleaner.minUtilization=50) but:
                      A) Is it right to have files used at 0%?
                      >

                      Yes. The 0% files are probably just waiting to be deleted. Cleaning is very asynchronous and requires one checkpoint (sometimes more) before deleting a file.

                      >
                      B) Is it right to have files below 15% (je.cleaner.minFileUtilization=15)?
                      >

                      In general there should not be large numbers of files in this category, but I believe this setting works. It may be that these files are in the middle of being cleaned. You may want to watch this over time, but I don't think it is a big cause for concern.

                      >
                      C) What is the connection between "cacheTotalBytes" and the cleaner? I mean, if for some reason one replica does not perform any read operations and is only used as a backup mechanism, does that mean the cleaner will not work properly?
                      >

                      What we know for sure is that when there is not enough cache to hold the BINs, cleaning becomes inefficient and, in the extreme case, even counterproductive. It is very difficult to know (without much debugging when it happens) whether there is an out-of-balance condition among rep nodes that causes something similar to happen. We have not seen this, and I don't believe it would be caused by the lack of read operations.

                      >
                      Finally, regarding the messages "Cleaner has 3 files not deleted because of read-only processes" and the "read only process that is inhibiting log file deletion": note that the message only appears when DbSpace is executed.
                      >

                      Ok, thanks.

                      --mark
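
To make the two thresholds in questions A and B concrete, here is a toy model of how they relate. It is a simplified illustration only and not JE's actual cleaner algorithm: it assumes total utilization is live bytes over total bytes, that files below the per-file threshold qualify for cleaning regardless of the total, and that when the total is under target the emptiest files are considered first. The function names and the sample file lists are hypothetical.

```python
def total_utilization(files):
    """files: list of (size_bytes, utilization_percent), one entry per log file."""
    total = sum(size for size, _ in files)
    live = sum(size * pct / 100 for size, pct in files)
    return 100 * live / total if total else 100.0


def cleaning_candidates(files, min_utilization=50, min_file_utilization=15):
    """Indices of the files this toy model would look at first."""
    # Files under the per-file threshold qualify regardless of the total.
    below = [i for i, (_, pct) in enumerate(files) if pct < min_file_utilization]
    if total_utilization(files) >= min_utilization:
        return below
    # Total is under target: consider every file, emptiest first.
    return sorted(range(len(files)), key=lambda i: files[i][1])


if __name__ == "__main__":
    # Hypothetical log files, 10 MB each, with made-up utilization figures.
    files = [(10_000_000, 0), (10_000_000, 10), (10_000_000, 80), (10_000_000, 98)]
    print(total_utilization(files), cleaning_candidates(files))
```

In this model a file at 0% or below 15% is simply a candidate that has not been processed or deleted yet, which fits the explanation above that cleaning is asynchronous and deletion waits for at least one checkpoint.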
                      • 8. Re: Crazy cleaner.
                        742833
                        Hi Mark,
                        What we know for sure is that when there is not enough cache to hold the BINs, cleaning becomes inefficient and even counterproductive in the extreme case. It is very difficult to know (without much debugging when it happens) whether there is an out-of-balance condition among rep nodes that causes something similar to happen. We have not seen this, and I don't believe it would be caused by the lack of read operations.
                        Does that mean the replication stream does not warm up the cache on a replica?

                        The problematic replicated environment is used to store "audit traces". That means it sees a large number of writes (to the master) and few reads (balanced equally across the replicas), and the queries usually refer to the most recently added entries.

                        So after a restart, if the replication stream does not warm up the cache, would we need to perform some kind of read operations to warm it up?

                        BR,
                        /César.
                        • 9. Re: Crazy cleaner.
                          Greybird-Oracle
                          Does that mean the replication stream does not warm up the cache on a replica?
                          What I tried to say is no, we do not know of a problem of that nature. The replication stream should warm up the cache. In other words, we don't know what caused the problem you experienced.

                          --mark
                          • 10. Re: Crazy cleaner.
                            742833
                            Hi Mark,

                            Now that this point has been clarified, do you have a recipe for which loggers could be enabled to do some debugging?
                            I would like to see more debug information about cache usage, but I don't want to enable debug logging for everything.

                            Thanks,
                            /César.
                            • 11. Re: Crazy cleaner.
                              Greybird-Oracle
                              No, I don't know of any additional logging that could be used to debug this. Capturing the JE stats for each environment, to see what changes during the period that the problem occurs, is what I suggest.
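
One way to act on this suggestion is to sample the stats at a fixed interval and keep the differences between consecutive samples, so that whatever changes during the problem window stands out. The following is only a minimal sketch under stated assumptions, not JE code: the `StatsSample` and `StatsHistory` classes are hypothetical, and the three fields simply mirror the stats discussed in this thread. In a real deployment each sample would presumably be filled from `Environment.getStats(...)` (for example `getCacheTotalBytes()`, `getCleanerBacklog()` and `getNCacheMiss()` on the returned `EnvironmentStats`); that call is left out so the sketch stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical snapshot of a few of the stats discussed in this thread. */
final class StatsSample {
    final long timestampMillis;
    final long cacheTotalBytes;
    final long cleanerBacklog;
    final long nCacheMiss;

    StatsSample(long timestampMillis, long cacheTotalBytes,
                long cleanerBacklog, long nCacheMiss) {
        this.timestampMillis = timestampMillis;
        this.cacheTotalBytes = cacheTotalBytes;
        this.cleanerBacklog = cleanerBacklog;
        this.nCacheMiss = nCacheMiss;
    }
}

/** Hypothetical bounded history that reports changes between samples. */
final class StatsHistory {
    private final int capacity;
    private final List<StatsSample> samples = new ArrayList<>();

    StatsHistory(int capacity) {
        if (capacity < 2) {
            throw new IllegalArgumentException("capacity must be >= 2");
        }
        this.capacity = capacity;
    }

    /** Adds a sample, dropping the oldest one once the history is full. */
    synchronized void add(StatsSample sample) {
        if (samples.size() == capacity) {
            samples.remove(0);
        }
        samples.add(sample);
    }

    synchronized int size() {
        return samples.size();
    }

    /** Returns a CSV header plus one line per pair of consecutive samples. */
    synchronized List<String> deltasAsCsv() {
        List<String> lines = new ArrayList<>();
        lines.add("timestampMillis,deltaCacheTotalBytes,"
                + "deltaCleanerBacklog,deltaNCacheMiss");
        for (int i = 1; i < samples.size(); i++) {
            StatsSample prev = samples.get(i - 1);
            StatsSample cur = samples.get(i);
            lines.add(cur.timestampMillis
                    + "," + (cur.cacheTotalBytes - prev.cacheTotalBytes)
                    + "," + (cur.cleanerBacklog - prev.cleanerBacklog)
                    + "," + (cur.nCacheMiss - prev.nCacheMiss));
        }
        return lines;
    }
}
```

A scheduler (for instance a `ScheduledExecutorService`) could call `add` once per interval for each environment and append the output of `deltasAsCsv()` to a file for later analysis.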

                              Linda may have more input on this, but she won't be back from vacation until tomorrow.

                              --mark
                              • 12. Re: Crazy cleaner.
                                742833
                                OK, we are working on a way to periodically retrieve and store the stats (for further analysis), and to trigger alarms when unexpected values appear.

                                I'll keep you informed.

                                Thanks,
                                /César.
                                • 13. Re: Crazy cleaner.
                                  Linda Lee-Oracle
                                  Cesar,

                                  I took a minute to review the trace logging we have in the JE cache
                                  management layer, but in the end, I do agree with Mark that the
                                  existing debug logging in those classes is not too helpful in this
                                  case. As he suggests, collecting and monitoring the JE environment
                                  stats is probably the best option.

                                  It's hard to know at this time what the initial problem is. Since the
                                  cache total bytes value is the stat that seems to have the strangest value, it might be best to use that as a trigger for an alert. Some of the questions that come to my mind are:

                                  - Why is the cache total bytes number so low for the problem node? What's the history of cache total bytes before the problem starts?

                                  - Does the low cache total bytes value cause the cleanerBackLog value to go up? (That's our current guess.) Or are there any cases where cleanerBackLog goes up without the cache total bytes falling?

                                  - Does the nCacheMiss value have any correlation to the cacheTotalBytes value?

                                  - Is there a change in the number or type of user operations executed by this node that is correlated with this problem?

                                  - There are 6 environments that are sharing this cache. Is one of the environments starting to grab the bulk of the available memory and CPU on the node?

                                  - Is the slow node always a master or a replica when the problem occurs?
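
Two of the questions above lend themselves to a small amount of code: using cache total bytes as an alert trigger, and checking whether nCacheMiss is correlated with cacheTotalBytes. The sketch below assumes the values have already been collected into arrays taken at the same sampling instants. The `CacheStatsAnalysis` class, its method names and the threshold fraction are hypothetical, and Pearson correlation is just one plausible measure; nothing in this thread prescribes it.

```java
/** Hypothetical helpers for analysing sampled cache stats. */
final class CacheStatsAnalysis {
    private CacheStatsAnalysis() {}

    /**
     * Returns true when the current cache size has fallen below the given
     * fraction of a baseline (for example, the value seen on a healthy node).
     */
    static boolean isCacheSuspiciouslyLow(long cacheTotalBytes,
                                          long baselineBytes,
                                          double minFraction) {
        return cacheTotalBytes < baselineBytes * minFraction;
    }

    /**
     * Pearson correlation coefficient of two equally long series.
     * Returns NaN when either series is constant.
     */
    static double pearson(long[] x, long[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException(
                    "series must have the same length, at least 2");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double cov = 0.0;
        double varX = 0.0;
        double varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        if (varX == 0.0 || varY == 0.0) {
            return Double.NaN;
        }
        return cov / Math.sqrt(varX * varY);
    }
}
```

Since nCacheMiss is a counter, feeding `pearson` the per-interval increments rather than the raw totals is likely to be more meaningful. A coefficient close to -1 would suggest that misses rise as the cache shrinks, though it would not by itself show which one causes the other.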


                                  In addition, it may be useful to collect the JE replicated environment
                                  stats (com.sleepycat.je.rep.ReplicatedEnvironmentStats). However, if
                                  you take a look at that class, you'll see that it mainly focuses on
                                  the traffic there is between masters and replicas. If the problem is
                                  independent of replication, those ReplicatedEnvironmentStats values may
                                  start going up just as a secondary effect, because a replica is
                                  slowing down and falling behind.

                                  Linda
                                  • 14. Re: Crazy cleaner.
                                    742833
                                    Hi Linda,
                                    - Does the low cache total bytes value cause the cleanerBackLog value to go up? (That's our current guess.) Or are there any cases where cleanerBackLog goes up without the cache total bytes falling?
                                    We've seen both scenarios. In the latter one ('cache total bytes' high and 'cleanerBackLog' high), the problem points to a deadlock between the external DbSpace execution and the runtime env: the former prevents log file deletion, which at the same time prevents the DbSpace execution from finishing. DbSpace had been added to a cron job as a "quick and dirty" mechanism to detect low total utilization, so it is not the original cause. It has been removed ;-)

                                    But in the earlier scenarios DbSpace was not involved.
                                    - Is there a change in the number or type of user operations executed by this node that is correlated with this problem?
                                    The requests seem to be balanced equally among all the servers, but we need to perform an in-depth analysis.
                                    - There are 6 environments that are sharing this cache. Is one of the environments starting to grab the bulk of the available memory and CPU on the node?
                                    Yes, the cacheTotalBytes/sharedCacheTotalBytes per environment changes throughout the day, because the activity associated with each env varies.
                                    - Is the slow node always a master or a replica when the problem occurs?
                                    We've found both situations.
                                    In addition, it may be useful to collect the JE replicated environment
                                    stats (com.sleepycat.je.rep.ReplicatedEnvironmentStats). However, if
                                    you take a look at that class, you'll see that it mainly focuses on
                                    the traffic there is between masters and replicas. If the problem is
                                    independent of replication, those ReplicatedEnvironmentStats values may
                                    start going up just as a secondary effect, because a replica is
                                    slowing down and falling behind.
                                    Perfect, we'll add it.

                                    The remaining questions cannot be answered until we deploy the "stats monitoring and collection" in production and can perform some in-depth analysis :-(

                                    Thanks,
                                    /César.