2 Replies Latest reply on May 21, 2012 3:52 PM by 938638

    Upgraded to 5.0.34, starting getting weird exceptions

      Hello guys,

      We have updated our server from 4.1.10 to 5.0.34 and we started getting:

      com.sleepycat.je.EnvironmentFailureException: (JE 5.0.34) Cannot log LNs in read-only env. UNEXPECTED_STATE: Unexpected internal state, may have side effects.
      at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:376)
      at com.sleepycat.je.tree.LN.logInternal(LN.java:450)
      at com.sleepycat.je.tree.LN.optionalLog(LN.java:350)
      at com.sleepycat.je.dbi.CursorImpl.putCurrentAlreadyLatchedAndLocked(CursorImpl.java:1236)
      at com.sleepycat.je.dbi.CursorImpl.putCurrent(CursorImpl.java:1132)
      at com.sleepycat.je.dbi.DbTree$RewriteMapLN.doWork(DbTree.java:744)
      at com.sleepycat.je.tree.Tree.withRootLatchedExclusive(Tree.java:386)
      at com.sleepycat.je.dbi.DbTree.modifyDbRoot(DbTree.java:713)
      at com.sleepycat.je.dbi.DbTree.modifyDbRoot(DbTree.java:646)
      at com.sleepycat.je.dbi.DbTree.optionalModifyDbRoot(DbTree.java:636)
      at com.sleepycat.je.tree.Tree.searchSplitsAllowed(Tree.java:1217)
      at com.sleepycat.je.tree.Tree.getParentBINForChildLN(Tree.java:817)
      at com.sleepycat.je.recovery.RecoveryManager.redo(RecoveryManager.java:1911)
      at com.sleepycat.je.recovery.RecoveryManager.redoOneLN(RecoveryManager.java:1502)
      at com.sleepycat.je.recovery.RecoveryManager.redoLNs(RecoveryManager.java:1302)
      at com.sleepycat.je.recovery.RecoveryManager.buildTree(RecoveryManager.java:586)
      at com.sleepycat.je.recovery.RecoveryManager.recover(RecoveryManager.java:188)
      at com.sleepycat.je.dbi.EnvironmentImpl.finishInit(EnvironmentImpl.java:604)
      at com.sleepycat.je.dbi.DbEnvPool.getEnvironment(DbEnvPool.java:210)
      at com.sleepycat.je.Environment.makeEnvironmentImpl(Environment.java:246)
      at com.sleepycat.je.Environment.<init>(Environment.java:227)
      at com.sleepycat.je.Environment.<init>(Environment.java:170)
      at com.fredhopper.itemstore.BDBItemStoreManager.getNewEnvironment(BDBItemStoreManager.java:325)
      at com.fredhopper.itemstore.BDBItemStoreManager.getNewItemDatabase(BDBItemStoreManager.java:217)
      at com.fredhopper.itemstore.BDBItemStoreManager.openDB(BDBItemStoreManager.java:182)
      at com.fredhopper.itemstore.ItemStoreManager.reopenNoLockNoChecks(ItemStoreManager.java:472)
      at com.fredhopper.itemstore.ItemStoreManager.reopen(ItemStoreManager.java:438)
      at com.fredhopper.index.primitive.reader.FASItemsIndexReader.refresh(FASItemsIndexReader.java:334)
      at com.fredhopper.index.primitive.reader.descriptors.ItemsIndexBatchDescriptorImpl.refresh(ItemsIndexBatchDescriptorImpl.java:60)
      at com.fredhopper.index.primitive.writer.FASSearchIndexWriter.generateAndStoreContributionForIndex(FASSearchIndexWriter.java:488)
      at com.fredhopper.index.composite.FASCompositeIndexImpl.contributeIndexUpdate(FASCompositeIndexImpl.java:421)
      at com.fredhopper.application.SearchIndexApp.processSearchIndex(SearchIndexApp.java:348)
      at com.fredhopper.application.SearchIndexApp.main(SearchIndexApp.java:156)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:601)
      at com.fredhopper.management.simple.RunHelper$2.run(RunHelper.java:341)
      at java.lang.Thread.run(Thread.java:722)

      We do checkpointing explicitly, once in a while. As far as I understand from the source code and the stack trace,
      BDB loaded successfully, then went on to rebuild the internal tree, and since we had not built and saved a checkpoint in the
      last update, it started rebuilding the missing pieces. While doing so, on a read-only environment, it hit logInternal() at code :
      if (envImpl.isReadOnly()) {
      /* Returning a NULL_LSN will not allow locking. */
      throw EnvironmentFailureException.unexpectedState
      ("Cannot log LNs in read-only env.");

      , which to me looks like an 'assertion' style error that should never happen. Is this a bug or API change, for which we have to adapt when upgrading?

      Some statitics on the observations:
      - we get this in 1% of all the test we do, various data.
      - we get it mostly reproducible in 1 of the data sets we process into BDB.
      - it used to work for 4-5 years with previous versions before this particular upgrade.

      We may have to already patch our product to work, so we though:
      - a) downgrading back to 4
      - b) fixing our indexer so that it produces the database updates to always do checkpointing or
      leave checkpointing to the internal threads again (we disabled for control when
      to use CPU, as we know better when we have time to do checkpointing).

      Any comments how to proceed?

        • 1. Re: Upgraded to 5.0.34, starting getting weird exceptions
          Hello Nik,
          which to me looks like an 'assertion' style error that should never happen. Is this a bug or API change, for which we have to adapt when upgrading?
          Yes, it is an assertion of sorts, and should never fire. It's a problem with a read-only environment and is a side effect of other changes (i.e., a bug) in JE 5. Fortunately, the fix is likely to be very simple, so I propose that we work together on that, rather than the other alternatives you listed.

          Could you please email me about this? mark.hayes at o.com (o == oracle).

          • 2. Re: Upgraded to 5.0.34, starting getting weird exceptions
            Hi Mark,

            Thanks, I have sent you a direct email.
            We can report here back again when we have sorted out the issue, for reference.