8 Replies Latest reply: Jul 27, 2009 9:37 AM by 577765

    NPE in RecoveryManager

    577765
      Hi,

      On startup of my application, I often get the following exception:

      com.sleepycat.je.DatabaseException: (JE 3.3.82) last LSN=0x0/0x49447:
      java.lang.NullPointerException
      at com.sleepycat.je.recovery.RecoveryManager.replaceOrInsertDuplicateRoot(RecoveryManager.java:1664)
      at com.sleepycat.je.recovery.RecoveryManager.replaceOrInsert(RecoveryManager.java:1384)
      at com.sleepycat.je.recovery.RecoveryManager.replayOneIN(RecoveryManager.java:934)

      The application first hits an out-of-memory error, so I can't shut it down cleanly and have to kill it. After I restart the application, it can't recover until I restore the database from backup.

      Thanks,
      Slava
        • 1. Re: NPE in RecoveryManager
          Greybird-Oracle
          Slava,

          Apparently your log was corrupted when the process was killed. Normally JE recovery will back up to a point in the log prior to the corruption, but it seems that in this case it was not able to do so.

          Restoring from backup is the safest thing to do.

          Another possibility is to move aside the last (highest numbered) log file and then try opening the environment. You will lose the data in the last file, but that may be better than restoring from a backup -- it depends on whether you want to restore to a known point (the backup) or recover as much data as possible. This may or may not work; it depends on whether the corruption is limited to the last log file.
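
A minimal sketch of the "move aside the last log file" step, for illustration only. It assumes JE log files are named as eight hex digits followed by `.jdb` (for example `0000001a.jdb`), and that the application is stopped while it runs. The class `LastLogFileMover`, its method names, and the `asideDir` parameter are hypothetical helpers, not part of JE.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.Stream;

final class LastLogFileMover {

    /** Parses the hex file number from a name such as "0000001a.jdb" (assumed naming scheme). */
    static long fileNumber(Path logFile) {
        String name = logFile.getFileName().toString();
        return Long.parseLong(name.substring(0, name.length() - ".jdb".length()), 16);
    }

    /**
     * Moves the highest-numbered *.jdb file from envHome into asideDir.
     * Returns the new location, or empty if envHome contains no log files.
     * The file is moved, not deleted, so it can be put back or sent for analysis.
     */
    static Optional<Path> moveAsideLastLogFile(Path envHome, Path asideDir) throws IOException {
        Optional<Path> last;
        try (Stream<Path> files = Files.list(envHome)) {
            last = files
                .filter(p -> p.getFileName().toString().matches("[0-9a-fA-F]{8}\\.jdb"))
                .max(Comparator.comparingLong(LastLogFileMover::fileNumber));
        }
        if (!last.isPresent()) {
            return Optional.empty();
        }
        Files.createDirectories(asideDir);
        Path target = asideDir.resolve(last.get().getFileName());
        // Fails if the target already exists, so an earlier saved copy is never overwritten.
        Files.move(last.get(), target);
        return Optional.of(target);
    }
}
```

After a call such as `LastLogFileMover.moveAsideLastLogFile(envHome, backupDir)`, the environment can be opened as usual to see whether recovery now succeeds. Taking a full copy of the environment directory first is a sensible precaution.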

          Yet another possibility is to run DbDump with the -r or -R option. This will output your database records in dump format, and they can be reloaded with DbLoad. This may allow you to recover more data, possibly including some information in the last log file. For details see:
          Re: BufferUnderflowException using DbDump

          We have not seen this problem before. Please save your log files and send me email -- mark.hayes @ o.com. If possible, I would like to examine your log files and try to change JE to handle the type of corruption that occurred.

          Please also post the original exception that occurred. Do you know why you ran out of memory?

          --mark

          Corrected my email address above -- it's mark.hayes @ o.com.
          • 2. Re: NPE in RecoveryManager
            Greybird-Oracle
            My email address was incorrect above -- it's mark.hayes @ o.com.
            • 3. Re: NPE in RecoveryManager
              Linda Lee-Oracle
              Slava,

              Can you also tell us if you have set

              je.compressor.purgeRoot = true

              And in general, would you share your JE configuration settings?

              Thanks,

              Linda
              • 4. Re: NPE in RecoveryManager
                577765
                Linda,

                Here are some custom settings:

                je.log.checksumRead = false
                je.lock.nLockTables = 23
                je.txn.dumpLocks = true
                je.evictor.lruOnly = true
                je.compressor.purgeRoot = true
                je.cleaner.minAge = 4
                je.cleaner.clusterAll = true
                je.log.numBuffers = 4
                je.log.totalBufferBytes = 2097152
                je.cleaner.lookAheadCacheSize = 262144
                je.log.iteratorReadSize = 262144

                Edited by: penemue on Jul 24, 2009 8:27 AM
                • 5. Re: NPE in RecoveryManager
                  577765
                  Mark, I sent email to you.
                  • 6. Re: NPE in RecoveryManager
                    Greybird-Oracle
                    Slava,

                    This is very likely due to a bug that is brought out by:

                    je.compressor.purgeRoot = true

                    Please remove this option. I apologize for the bug. We may decide to disable this option completely, since the benefit is small and it is (obviously) not as well tested as it should be. I'll follow up and post the resolution here.
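
A small sketch of removing the option, assuming the JE settings are held in a `java.util.Properties` object (for instance, loaded from a properties file) before the environment is configured. The class `PurgeRootRemover` and its methods are hypothetical names used only for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

final class PurgeRootRemover {

    static final String PURGE_ROOT = "je.compressor.purgeRoot";

    /** Parses settings written in the usual "key = value" properties syntax. */
    static Properties parse(String text) throws IOException {
        Properties settings = new Properties();
        settings.load(new StringReader(text));
        return settings;
    }

    /** Returns a copy of the settings with je.compressor.purgeRoot removed; the input is not modified. */
    static Properties withoutPurgeRoot(Properties settings) {
        Properties copy = new Properties();
        copy.putAll(settings);
        copy.remove(PURGE_ROOT);
        return copy;
    }
}
```

The cleaned `Properties` object would then be handed to JE in whatever way the application already does. If the option is instead set in a properties file or directly in code, deleting that line has the same effect.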

                    Thanks,
                    --mark
                    • 7. Re: NPE in RecoveryManager
                      Linda Lee-Oracle
                      To be very precise, the situation that Mark is talking about is when:
                      - all the records in a database are deleted
                      - the database itself is not deleted
                      - the je.compressor.purgeRoot flag is set to true
                      - on top of that, a timing-dependent set of operations has to occur

                      Our apologies,

                      Linda
                      • 8. Re: NPE in RecoveryManager
                        577765
                        Linda and Mark,

                        Thank you. I removed the je.compressor.purgeRoot setting.

                        Best regards,
                        Slava