4 Replies Latest reply on May 4, 2012 4:39 PM by Greybird-Oracle

    SecondaryIntegrityException Causes?

    932687
      I have written an application which uses Berkeley DB JE 5.0.34 for persistence. Under seemingly random circumstances, database corruption occurs such that a SecondaryIntegrityException is reported on subsequent opens. The code in question is complex and not amenable to producing an example, so I am left speculating as to causes. I think I am following all of the rules specified in the documentation for the SecondaryDatabase and SecondaryIntegrityException classes. The databases in question are transactional and I am using explicit transactions. I believe the secondary key creators are valid. However, I should point out that the primary data from which the keys are created can change, so existing secondary database entries may be removed and new ones created.

      The generating condition seems to be related somehow to abnormal shutdown of the application--i.e. process failure. Under these circumstances it is conceivable that an in-progress transaction might be left open with the application having no opportunity to explicitly abort it. In addition to the scenarios described in the documentation, are there any other known ways of producing corrupt secondaries? Particularly, what happens if an explicit transaction is left open and not aborted because of a failure of some sort? Also, any suggestions as to methods for tracking down the origin of the problem--for catching it in the act?

      Thx.

      -Randy Clegg
        • 1. Re: SecondaryIntegrityException Causes?
          Greybird-Oracle
          > In addition to the scenarios described in the documentation, are there any other known ways of producing corrupt secondaries? Particularly, what happens if an explicit transaction is left open and not aborted because of a failure of some sort?
          Just to double check, your secondary DBs are transactional also, correct?

          When a transaction is left open, it is the same as aborting it. It will be effectively aborted by recovery, when you next open the environment.

          The only thing I can think of is that your secondary key creator has a bug of some kind, perhaps related to concurrency. The key creator must be thread-safe. You could post your key creator code to see if anyone can spot an error.
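
For illustration, here is a minimal sketch of key-derivation logic that is thread-safe because it keeps no shared mutable state: it works only on its argument and on local variables. The record layout ("id|email|name" in UTF-8) and the class name EmailKeyExtractor are hypothetical and not taken from this thread. In JE, logic like this would normally sit inside SecondaryKeyCreator.createSecondaryKey, which fills in the result entry and returns true, or returns false when no secondary key applies.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical example: derives a secondary key (the email field) from a
// primary record laid out as "id|email|name". Not JE API; plain Java only.
final class EmailKeyExtractor {
    private EmailKeyExtractor() {}

    /**
     * Returns the secondary key bytes, or null when the record has no email,
     * meaning no secondary entry should exist for it.
     * No fields, caches, or reused buffers are touched, so concurrent calls
     * cannot interfere with each other.
     */
    static byte[] secondaryKey(byte[] primaryData) {
        String record = new String(primaryData, StandardCharsets.UTF_8);
        String[] fields = record.split("\\|", -1);
        if (fields.length < 2) {
            return null;
        }
        // Normalize with a fixed locale so the result does not depend on
        // the default locale of the JVM that happens to run the code.
        String email = fields[1].trim().toLowerCase(Locale.ROOT);
        if (email.isEmpty()) {
            return null;
        }
        return email.getBytes(StandardCharsets.UTF_8);
    }
}
```

The usual ways to lose thread safety are reusing a shared buffer, formatter, or parser across calls, so a function that allocates everything locally is the simplest safe form.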
          > Also, any suggestions as to methods for tracking down the origin of the problem--for catching it in the act?
          If you save your .jdb files when the corruption is first detected, it would be possible to analyze the log (using DbPrintLog to get human-readable output) and see which transaction is responsible for the corruption. To ensure that the relevant .jdb files have not been cleaned and deleted, you would need to catch the problem early, perhaps by scanning by secondary key after a crash. Of course, that may not be possible in a production app, but maybe you can take a snapshot of the logs after a crash and do the scan offline. Or you could set EnvironmentConfig.CLEANER_EXPUNGE to false so that no log files are deleted, but this is also problematic in a production app (you may run out of disk space). Also, the analysis requires knowledge of the internal log format. If you want to try this, I'll give you some pointers. However, I don't recommend trying this unless you're familiar with the basic idea of write-ahead logging for transactions. This is probably best only as a last resort.
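
The idea of scanning by secondary key after a crash can be sketched as a consistency check between the two indexes. The sketch below does not use the JE API: it models the primary database as a map from primary key to data and a unique-key secondary as a map from secondary key to primary key, and the names SecondaryScan and findInconsistencies are made up for this example. A real scan would presumably walk the secondary database with a cursor instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical in-memory model of a primary/secondary consistency scan.
final class SecondaryScan {
    private SecondaryScan() {}

    /**
     * primary:    primary key -> record data
     * secondary:  secondary key -> primary key (unique secondary keys assumed)
     * keyCreator: derives the secondary key from record data, or null for none
     * Returns one message per inconsistency found; an empty list means consistent.
     */
    static List<String> findInconsistencies(Map<String, String> primary,
                                            Map<String, String> secondary,
                                            Function<String, String> keyCreator) {
        List<String> problems = new ArrayList<>();
        // Every secondary entry must point at an existing primary record
        // whose recomputed key equals the stored secondary key.
        for (Map.Entry<String, String> e : secondary.entrySet()) {
            String data = primary.get(e.getValue());
            if (data == null) {
                problems.add("secondary key " + e.getKey()
                        + " refers to missing primary key " + e.getValue());
            } else if (!e.getKey().equals(keyCreator.apply(data))) {
                problems.add("secondary key " + e.getKey()
                        + " does not match data of primary key " + e.getValue());
            }
        }
        // Every primary record that yields a key must be indexed under it.
        for (Map.Entry<String, String> e : primary.entrySet()) {
            String expected = keyCreator.apply(e.getValue());
            if (expected != null && !e.getKey().equals(secondary.get(expected))) {
                problems.add("primary key " + e.getKey()
                        + " has no secondary entry for " + expected);
            }
        }
        return problems;
    }
}
```

Running a check of this kind against a snapshot taken right after a crash narrows down when the inconsistency first appeared, which is what makes the later log analysis tractable.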

          --mark
          • 2. Re: SecondaryIntegrityException Causes?
            932687
            Thanks for the quick response. Yes, the secondaries are also transactional. I will scrutinize the key creator some more to make sure it's really as thread-safe as I think it is. If I can rule that out, I will post that also.

            -Randy
            • 3. Re: SecondaryIntegrityException Causes?
              932687
              Thanks for the suggestions. After studying this for a while, I've concluded that the issue was, indeed, related to the key creator. I believe the problem was that it did not consistently generate the same values under all circumstances, rather than a concurrency issue.
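
For anyone who lands here with the same symptom, a simple way to test for this class of bug is to call the key-derivation logic repeatedly on the same primary data and compare the results. This is only a sketch: DeterminismCheck and isDeterministic are hypothetical names, the key creator is modeled as a plain function over byte arrays rather than a JE SecondaryKeyCreator, and the thread does not say what made the original key creator inconsistent.

```java
import java.util.Arrays;
import java.util.function.Function;

// Hypothetical helper: checks that a key-derivation function returns the
// same bytes every time it is given the same primary data.
final class DeterminismCheck {
    private DeterminismCheck() {}

    static boolean isDeterministic(Function<byte[], byte[]> keyCreator,
                                   byte[] primaryData,
                                   int repetitions) {
        // Pass a fresh copy each time so a creator that mutates its input
        // cannot influence later calls.
        byte[] first = keyCreator.apply(primaryData.clone());
        for (int i = 1; i < repetitions; i++) {
            byte[] next = keyCreator.apply(primaryData.clone());
            // Arrays.equals treats two nulls as equal and null vs non-null as different.
            if (!Arrays.equals(first, next)) {
                return false;
            }
        }
        return true;
    }
}
```

A check like this only catches instability that shows up within one process. Keys that depend on things such as the default locale, time zone, or configuration can still differ between runs, so it may be worth running it under more than one environment.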
              • 4. Re: SecondaryIntegrityException Causes?
                Greybird-Oracle
                Glad it's resolved. I appreciate you letting us know, so we're not wondering about a possible bug.
                --mark