0 Replies Latest reply on Aug 15, 2017 2:55 PM by mayo

    Bug fix in BDB JE 6.4.15 doesn't work

    mayo

      Hi all,

           We are using JE 6.4.9 in our production environment. Here is how our application uses JE:

           1.  We use ReplicatedEnvironment with two nodes (HA mode).

           2.  We use the DPL to access and manage data in the DB.

           3.  We use "designatePrimary" when one node is down to ensure the data can still be accessed. (We have two types of DB, dataDB and metaDB. In the "designatePrimary" scenario we do not write to dataDB; we only read from it, while we both read and write metaDB.)

           4.  We set je.env.runCleaner to false and customize log file cleaning by calling replicatedEnvironment.cleanLogFile().

       

           Our application ran well for a few months until the master node hit a network problem: the master could not be reached by other machines and could not connect to them. This lasted about 5 minutes. During that period the replica was made primary via "designatePrimary", performed some writes to metaDB, and read from dataDB. When the network recovered, the old master node could not rejoin the replication group. The log shows:

      com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 6.4.9)logic1-10.254.82.166(-1):/stage/logic1 fetchLN of 0xc1f3/0x28d12f7c parent IN=26902723 IN class=com.sleepycat.je.tree.BIN lastFullLsn=0xc350/0x2d3618b0 lastLoggedLsn=0xc350/0x2d3618b0 parent.getDirty()=false state=0 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.
              at com.sleepycat.je.tree.IN.fetchLN(IN.java:2848)
              at com.sleepycat.je.tree.IN.fetchLN(IN.java:2746)
              at com.sleepycat.je.dbi.CursorImpl.getCurrent(CursorImpl.java:2235)
              at com.sleepycat.je.dbi.CursorImpl.lockAndGetCurrent(CursorImpl.java:2083)
              at com.sleepycat.je.dbi.CursorImpl.traverseDbWithCursor(CursorImpl.java:3826)
              at com.sleepycat.je.cleaner.UtilizationProfile.removePerDbMetadata(UtilizationProfile.java:418)
              at com.sleepycat.je.cleaner.UtilizationProfile.populateCache(UtilizationProfile.java:845)
              at com.sleepycat.je.recovery.RecoveryManager.recover(RecoveryManager.java:441)
              at com.sleepycat.je.dbi.EnvironmentImpl.finishInit(EnvironmentImpl.java:717)
              at com.sleepycat.je.dbi.DbEnvPool.getEnvironment(DbEnvPool.java:254)
              at com.sleepycat.je.Environment.makeEnvironmentImpl(Environment.java:287)
              at com.sleepycat.je.Environment.<init>(Environment.java:268)
              at com.sleepycat.je.Environment.<init>(Environment.java:224)
              at com.sleepycat.je.rep.ReplicatedEnvironment.<init>(ReplicatedEnvironment.java:629)
              at com.sleepycat.je.rep.ReplicatedEnvironment.<init>(ReplicatedEnvironment.java:489)
              at service.impl.DefStoreManagerService.getEnvironment(DefStoreManagerService.java:617)
      Caused by: java.io.FileNotFoundException: /stage/logic1/0000c1f3.jdb (No such file or directory)
              at java.io.RandomAccessFile.open(Native Method)
              at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
              at java.io.RandomAccessFile.<init>(RandomAccessFile.java:122)
              at com.sleepycat.je.log.FileManager$DefaultRandomAccessFile.<init>(FileManager.java:3226)
              at com.sleepycat.je.log.FileManager$6.createFile(FileManager.java:3254)
              at com.sleepycat.je.log.FileManager.openFileHandle(FileManager.java:1333)
              at com.sleepycat.je.log.FileManager.getFileHandle(FileManager.java:1204)
              at com.sleepycat.je.log.LogManager.getLogSource(LogManager.java:1136)
              at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:823)
              at com.sleepycat.je.log.LogManager.getLogEntryAllowInvisibleAtRecovery(LogManager.java:788)
              at com.sleepycat.je.tree.IN.fetchLN(IN.java:2808)
              ... 18 more
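
      The LSN in the message can be mapped to the name of the log file that recovery is trying to open. Below is a minimal sketch, assuming the LSN is printed as 0x<file number>/0x<offset> and that log files are named with the file number as eight hex digits plus ".jdb", which is consistent with the FileNotFoundException above. The class and method names are hypothetical helpers, not part of the JE API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of JE): derives the .jdb file name
// from an LSN as it appears in a JE error message.
class LsnFileName {
    // Matches an LSN printed as 0x<fileNumber>/0x<offset>, e.g. 0xc1f3/0x28d12f7c.
    private static final Pattern LSN =
        Pattern.compile("0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)");

    /** Returns the log file name for the first LSN found in text, or null if none is found. */
    static String toFileName(String text) {
        Matcher m = LSN.matcher(text);
        if (!m.find()) {
            return null;
        }
        long fileNumber = Long.parseLong(m.group(1), 16);
        // Assumption: file names are the file number as 8 zero-padded hex digits.
        return String.format("%08x.jdb", fileNumber);
    }

    public static void main(String[] args) {
        // The LN that could not be fetched.
        System.out.println(toFileName("fetchLN of 0xc1f3/0x28d12f7c"));
        // The parent BIN's last full LSN.
        System.out.println(toFileName("lastFullLsn=0xc350/0x2d3618b0"));
    }
}
```

      For the message above this gives 0000c1f3.jdb for the LN being fetched (the missing file) and 0000c350.jdb for the parent BIN, so the BIN lives in a much newer file than the LN it still points to.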
      

       

      I found this problem mentioned in the change log of JE 6.4.15:

      The bug does not cause a permanent data corruption, if upgrading to JE 6.4.14 is possible. In other words, if the problem occurs in an earlier version, upgrading to 6.4.14 or later will allow the Environment to be opened.

      So I upgraded my JE dependency from 6.4.9 to 6.4.25, but the environment still fails to open with the same error:

      com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 6.4.25)logic1-10.254.82.166(-1):/stage/logic1 fetchLN of 0xc1f3/0x28d12f7c parent IN=26902723 IN class=com.sleepycat.je.tree.BIN lastFullLsn=0xc350/0x2d3618b0 lastLoggedLsn=0xc350/0x2d3618b0 parent.getDirty()=false state=0 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.
              at com.sleepycat.je.tree.IN.fetchLN(IN.java:2848)
              at com.sleepycat.je.tree.IN.fetchLN(IN.java:2746)
              at com.sleepycat.je.dbi.CursorImpl.getCurrent(CursorImpl.java:2235)
              at com.sleepycat.je.dbi.CursorImpl.lockAndGetCurrent(CursorImpl.java:2083)
              at com.sleepycat.je.dbi.CursorImpl.traverseDbWithCursor(CursorImpl.java:3826)
              at com.sleepycat.je.cleaner.UtilizationProfile.removePerDbMetadata(UtilizationProfile.java:418)
              at com.sleepycat.je.cleaner.UtilizationProfile.populateCache(UtilizationProfile.java:845)
              at com.sleepycat.je.recovery.RecoveryManager.recover(RecoveryManager.java:441)
              at com.sleepycat.je.dbi.EnvironmentImpl.finishInit(EnvironmentImpl.java:717)
              at com.sleepycat.je.dbi.DbEnvPool.getEnvironment(DbEnvPool.java:254)
              at com.sleepycat.je.Environment.makeEnvironmentImpl(Environment.java:287)
              at com.sleepycat.je.Environment.<init>(Environment.java:268)
              at com.sleepycat.je.Environment.<init>(Environment.java:224)
              at com.sleepycat.je.rep.ReplicatedEnvironment.<init>(ReplicatedEnvironment.java:629)
              at com.sleepycat.je.rep.ReplicatedEnvironment.<init>(ReplicatedEnvironment.java:489)
              at service.impl.DefStoreManagerService.getEnvironment(DefStoreManagerService.java:617)
      Caused by: java.io.FileNotFoundException: /stage/logic1/0000c1f3.jdb (No such file or directory)
              at java.io.RandomAccessFile.open(Native Method)
              at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
              at java.io.RandomAccessFile.<init>(RandomAccessFile.java:122)
              at com.sleepycat.je.log.FileManager$DefaultRandomAccessFile.<init>(FileManager.java:3226)
              at com.sleepycat.je.log.FileManager$6.createFile(FileManager.java:3254)
              at com.sleepycat.je.log.FileManager.openFileHandle(FileManager.java:1333)
              at com.sleepycat.je.log.FileManager.getFileHandle(FileManager.java:1204)
              at com.sleepycat.je.log.LogManager.getLogSource(LogManager.java:1136)
              at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:823)
              at com.sleepycat.je.log.LogManager.getLogEntryAllowInvisibleAtRecovery(LogManager.java:788)
              at com.sleepycat.je.tree.IN.fetchLN(IN.java:2808)
              ... 18 more
      

      Are there any suggestions about what I should do next?