2 Replies. Latest reply: Jan 9, 2009 10:40 AM by greybird

    3.3.74: NullPointerException in RecoveryManager

    649984
      Hi all,

      The following occurred twice in the past week:

      2009-01-07 09:39:50,224 FATAL [collection.Slave] Failed to initialise the local datastore: Could not open BDB: (JE 3.3.74) last LSN=0x13d6/0xa5b642
      com.arantech.assure.collection.data.DBInitialisationException: Could not open BDB: (JE 3.3.74) last LSN=0x13d6/0xa5b642
      at com.arantech.assure.collection.data.DatabaseBuilder.createBDB(DatabaseBuilder.java:66)
      at com.arantech.assure.collection.data.DatabaseBuilder.<init>(DatabaseBuilder.java:30)
      at com.arantech.assure.collection.data.AwakenDatabaseBuilder.<init>(AwakenDatabaseBuilder.java:57)
      at com.arantech.assure.collection.data.DatabaseBuilder.createBuilder(DatabaseBuilder.java:98)
      at com.arantech.assure.collection.data.DatabaseBuilder.getBuilder(DatabaseBuilder.java:104)
      at com.arantech.assure.collection.Slave$DatabaseStartupThread.run(Slave.java:172)
      Caused by: com.sleepycat.je.DatabaseException: (JE 3.3.74) last LSN=0x13d6/0xa5b642
      at com.sleepycat.je.recovery.RecoveryManager.traceAndThrowException(RecoveryManager.java:2520)
      at com.sleepycat.je.recovery.RecoveryManager.redoLNs(RecoveryManager.java:1263)
      at com.sleepycat.je.recovery.RecoveryManager.buildTree(RecoveryManager.java:523)
      at com.sleepycat.je.recovery.RecoveryManager.recover(RecoveryManager.java:158)
      at com.sleepycat.je.dbi.EnvironmentImpl.<init>(EnvironmentImpl.java:389)
      at com.sleepycat.je.dbi.DbEnvPool.getEnvironment(DbEnvPool.java:147)
      at com.sleepycat.je.Environment.<init>(Environment.java:210)
      at com.sleepycat.je.Environment.<init>(Environment.java:150)
      at com.sleepycat.je.XAEnvironment.<init>(XAEnvironment.java:40)
      at com.arantech.assure.collection.data.DatabaseBuilder.createBDB(DatabaseBuilder.java:62)
      ... 5 more
      Caused by: java.lang.NullPointerException: lsn1=21814149475543 lsn2=-1
      at com.sleepycat.je.utilint.DbLsn.compareTo(DbLsn.java:76)
      at com.sleepycat.je.cleaner.RecoveryUtilizationTracker.isDbUncounted(RecoveryUtilizationTracker.java:151)
      at com.sleepycat.je.cleaner.RecoveryUtilizationTracker.countObsoleteIfUncounted(RecoveryUtilizationTracker.java:107)
      at com.sleepycat.je.recovery.RecoveryManager.redoUtilizationInfo(RecoveryManager.java:2322)
      at com.sleepycat.je.recovery.RecoveryManager.redoLNs(RecoveryManager.java:1247)
      ... 13 more
      2009-01-07 09:39:50,294 WARN [ring.MessageManager] Existing Statistics Service still running.

      As you can see, this is with 3.3.74. A quick diff seems to indicate that this code didn't change in 3.3.75, so I guess upgrading wouldn't fix the issue. Furthermore, we never saw this problem with 3.2.76, and the RecoveryManager class has changed quite a bit since then.

      It looks to me as though, when the if statement at RecoveryManager.redoLNs:1156 goes through, redoUtilizationInfo is called with commitLsn still set to NULL, which should not happen, as per the assert at redoUtilizationInfo:2321.
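
      To make the NPE message in the trace easier to read, here is a minimal sketch of how such a message could arise. It is not JE's source code. It assumes that an LSN is a long with the file number in the upper 32 bits and the file offset in the lower 32 bits, that -1 is the "no LSN" sentinel (matching lsn2=-1 above), and that a comparison rejects the sentinel. The class name LsnSketch and all of its methods are hypothetical.

```java
// Sketch only: reproduces the behaviour visible in the stack trace under
// stated assumptions. LsnSketch and its methods are hypothetical names.
final class LsnSketch {
    // Assumption: -1 marks "no LSN", as suggested by lsn2=-1 in the trace.
    static final long NULL_LSN = -1L;

    // Assumption: file number in the upper 32 bits, offset in the lower 32.
    static long makeLsn(long fileNumber, long fileOffset) {
        return ((fileNumber & 0xFFFFFFFFL) << 32) | (fileOffset & 0xFFFFFFFFL);
    }

    static long fileNumber(long lsn) {
        return (lsn >>> 32) & 0xFFFFFFFFL;
    }

    static long fileOffset(long lsn) {
        return lsn & 0xFFFFFFFFL;
    }

    // Formats an LSN in the same 0x<file>/0x<offset> style as "last LSN=" above.
    static String format(long lsn) {
        return "0x" + Long.toHexString(fileNumber(lsn))
            + "/0x" + Long.toHexString(fileOffset(lsn));
    }

    // Orders by file number, then by offset; refuses to compare the sentinel.
    static int compare(long lsn1, long lsn2) {
        if (lsn1 == NULL_LSN || lsn2 == NULL_LSN) {
            throw new NullPointerException("lsn1=" + lsn1 + " lsn2=" + lsn2);
        }
        int byFile = Long.compare(fileNumber(lsn1), fileNumber(lsn2));
        return byFile != 0
            ? byFile
            : Long.compare(fileOffset(lsn1), fileOffset(lsn2));
    }

    public static void main(String[] args) {
        long lsn1 = 21814149475543L; // value taken from the NPE message
        System.out.println(format(lsn1));
        try {
            compare(lsn1, NULL_LSN);
        } catch (NullPointerException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

      Under those assumptions, lsn1 decodes to 0x13d7/0xa16cd7, and the second operand being -1 is consistent with a null LSN reaching the comparison.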

      I found a few other NPEs during recovery reported on the forums (http://kr.forums.oracle.com/forums/thread.jspa?threadID=621705, http://kr.forums.oracle.com/forums/thread.jspa?messageID=2794571), but those have already been fixed.

      Any idea?

      Cheers,

      Matthieu Bentot
        • 1. Re: 3.3.74: NullPointerException in RecoveryManager
          greybird
          Hi Matthieu,

          I don't have an answer for you yet, but I wanted to let you know that we're looking into this and will get back to you in the next day or two.

          --mark
          • 2. Re: 3.3.74: NullPointerException in RecoveryManager
            greybird
            Matthieu,

            I sent you an email about this but haven't heard back from you. This is a bug that has been fixed in our working ('next') version of JE, and we've back-ported the fix to JE 3.3.77. Please try the jar I sent you and let me know how it goes.

            Thanks,
            --mark