12 Replies Latest reply: Jun 14, 2013 6:59 AM by 1005812 RSS

    BDB  Exceptions when saving large amounts of Data using je 5.0.58.jar

    1005812
      hello,
      we are seeeing the following in our prodn environment when BDB uses a large amount of data
      we have already set the LOG_FILE_MAX to 1073 as per the DOC

           EnvironmentConfig envConfig = new EnvironmentConfig();
           
           envConfig.setConfigParam(EnvironmentConfig.LOG_FILE_MAX, "1073741824"); //1073M
           
           envConfig.setTransactional(true);
           envConfig.setAllowCreate(true);
           Environment env = new Environment(new File(cacheFileName), envConfig);


      Caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 5.0.58)
      /opt/fxall/Aggie-1.10.103/TradingGateway_1.10.103/bin/../../../logs/TG_REGION fetchTarget of 0x20a/0x51c582e parent IN=5
      IN class=com.sleepycat.je.tree.BIN lastFullVersion=0x20a/0x5675c2 lastLoggedVersion=0x20b/0x569285 parent.getDirty()=tr
      ue state=2 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.
      at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1429)
      at com.sleepycat.je.tree.BIN.fetchTarget(BIN.java:1251)
      at com.sleepycat.je.dbi.CursorImpl.fetchCurrent(CursorImpl.java:2229)
      at com.sleepycat.je.dbi.CursorImpl.getCurrentAlreadyLatched(CursorImpl.java:1434)
      at com.sleepycat.je.dbi.CursorImpl.getNext(CursorImpl.java:1563)
      at com.sleepycat.je.cleaner.FileProcessor.processFile(FileProcessor.java:445)
      at com.sleepycat.je.cleaner.FileProcessor.onWakeup(FileProcessor.java:148)
      at com.sleepycat.je.utilint.DaemonThread.run(DaemonThread.java:162)
      at java.lang.Thread.run(Thread.java:662)
      Caused by: java.io.FileNotFoundException: /opt/fxall/Aggie-1.10.103/TradingGateway_1.10.103/bin/../../../logs/TG_REGION/
      0000020a.jdb (No such file or directory)
      at java.io.RandomAccessFile.open(Native Method)
      at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
      at java.io.RandomAccessFile.<init>(RandomAccessFile.java:101)
      at com.sleepycat.je.log.FileManager$6.<init>(FileManager.java:1282)
      at com.sleepycat.je.log.FileManager.openFileHandle(FileManager.java:1281)
      at com.sleepycat.je.log.FileManager.getFileHandle(FileManager.java:1147)
      at com.sleepycat.je.log.LogManager.getLogSource(LogManager.java:1102)
      at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:808)
      at com.sleepycat.je.log.LogManager.getLogEntryAllowInvisibleAtRecovery(LogManager.java:772)
      at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1412)
      at com.sleepycat.je.tree.BIN.fetchTarget(BIN.java:1251)
      at com.sleepycat.je.dbi.CursorImpl.fetchCurrent(CursorImpl.java:2229)



      This is the snapshot of the Logs directory (HOME OF BERKLEY ENV) where the data is written to

      -rw-rw-r-- 1 fxall fxall 0 Apr 24 11:47 je.info.0.lck
      -rw-rw-r-- 1 fxall fxall 0 Apr 24 11:47 je.lck
      drwxr-xr-x 2 fxall fxall 4096 Apr 24 22:12 .
      -rw-rw-r-- 1 fxall fxall 25528851 Apr 24 22:12 0000020b.jdb
      -rw-rw-r-- 1 fxall fxall 5964732 Apr 24 22:12 je.info.0
      drwxr-xr-x 35 fxall fxall 258048 Apr 26 10:31 ..


      After some time it starts throwing exceptions . WHat can be done to fix this ? Thanks in advance ...
        • 1. Re: BDB  Exceptions when saving large amounts of Data using je 5.0.58.jar
          Bogdan Coman-Oracle
          Hi,

          Do you have a commercial JE license? If yes, I suggest you to create a Service Request with My Oracle Support for a faster resolution.
          1002809 wrote:
               envConfig.setConfigParam(EnvironmentConfig.LOG_FILE_MAX, "1073741824"); //1073M
          Any particular reason on why you decided to set it to the maximum size, of 1GB? While it shouldn't cause the error, I would like to make sure that you are running with a balanced configuration. What other JE parameters are you configuring and with what values? Do you configure the cache size or log file cache size?
          After some time it starts throwing exceptions . WHat can be done to fix this ?
          Can we see all those exceptions starting with the first one?
          -rw-rw-r-- 1 fxall fxall 25528851 Apr 24 22:12 0000020b.jdb
          Did you override the default value for EnvironmentConfig.CLEANER_MIN_AGE from 2 to 1? If not, you shouldn't end up with only one log file and most likely something else (not the JE cleaner) deleted a log file. Is it possible that this could happen?

          Is there any change in particular you made in the recent past that caused this error to occur?

          Thanks,
          Bogdan
          • 2. Re: BDB  Exceptions when saving large amounts of Data using je 5.0.58.jar
            1005812
            hello ,

            1) we set the following Configs

                      envConfig.setConfigParam(EnvironmentConfig.LOG_FILE_MAX, "1073741824"); //1073M
                 envConfig.setTransactional(true);
            envConfig.setAllowCreate(true);



            2) We dont set the cache size or log file cache size

            3) Below is the complete Stack Trace

            com.sleepycat.je.EnvironmentFailureException: (JE 5.0.58) Environment must be closed, caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 5.0.58) /opt/fxall/Aggie-1.10.103/TradingGateway_1.10.103/bin/../../../logs/TG_REGION fetchTarget of 0x20a/0x51c582e parent IN=5 IN class=com.sleepycat.je.tree.BIN lastFullVersion=0x20a/0x5675c2 lastLoggedVersion=0x20b/0x569285 parent.getDirty()=true state=2 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.
            at com.sleepycat.je.EnvironmentFailureException.wrapSelf(EnvironmentFailureException.java:196)
            at com.sleepycat.je.dbi.EnvironmentImpl.checkIfInvalid(EnvironmentImpl.java:1510)
            at com.sleepycat.je.Database.checkEnv(Database.java:1906)
            at com.sleepycat.je.Database.openCursor(Database.java:676)
            at com.sleepycat.collections.CurrentTransaction.openCursor(CurrentTransaction.java:415)
            at com.sleepycat.collections.MyRangeCursor.openCursor(MyRangeCursor.java:53)
            at com.sleepycat.collections.MyRangeCursor.<init>(MyRangeCursor.java:29)
            at com.sleepycat.collections.DataCursor.init(DataCursor.java:170)
            at com.sleepycat.collections.DataCursor.<init>(DataCursor.java:58)
            at com.sleepycat.collections.StoredContainer.getValue(StoredContainer.java:300)
            at com.sleepycat.collections.StoredMap.get(StoredMap.java:240)
            at com.fxall.ems.berkleyDB.BerkleyDBInterface.getExistingObjectFromCache(BerkleyDBInterface.java:371)
            at com.fxall.ems.persistance.BerkleyDBPersist.getIds(BerkleyDBPersist.java:125)
            at com.fxall.ems.persistance.BerkleyDBPersist.getExecMsgIds(BerkleyDBPersist.java:105)
            at com.fxall.ems.sor.SORGateway.isDoNotSendToGUI(SORGateway.java:9658)
            at com.fxall.ems.sor.SORGateway.send(SORGateway.java:9543)
            at com.fxall.ems.sor.meclient.MEClient.handleReplacedOrderReport(MEClient.java:455)
            at com.fxall.ems.sor.meclient.MEOutHandler.onOrderReplaced(MEOutHandler.java:198)
            at com.fxall.meclient.MEOutBoundWorker.onMessage(MEOutBoundWorker.java:84)
            at com.fxall.meclient.MEOutBoundWorker.onMessage(MEOutBoundWorker.java:30)
            at com.fxall.meclient.OutBoundWorker$1.receivedMessage(OutBoundWorker.java:30)
            at com.fxall.meclient.OutBoundWorker$1.receivedMessage(OutBoundWorker.java:23)
            at com.fxall.marketplace.middleware.socket.ABaseQueueMessageListener$Poller.execute(ABaseQueueMessageListener.java:148)
            at com.fxall.marketplace.middleware.socket.AControlledTask.run(AControlledTask.java:172)
            at java.lang.Thread.run(Thread.java:662)
            Caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 5.0.58) /opt/fxall/Aggie-1.10.103/TradingGateway_1.10.103/bin/../../../logs/TG_REGION fetchTarget of 0x20a/0x51c582e parent IN=5 IN class=com.sleepycat.je.tree.BIN lastFullVersion=0x20a/0x5675c2 lastLoggedVersion=0x20b/0x569285 parent.getDirty()=true state=2 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.
            at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1429)
            at com.sleepycat.je.tree.BIN.fetchTarget(BIN.java:1251)
            at com.sleepycat.je.dbi.CursorImpl.fetchCurrent(CursorImpl.java:2229)
            at com.sleepycat.je.dbi.CursorImpl.getCurrentAlreadyLatched(CursorImpl.java:1434)
            at com.sleepycat.je.dbi.CursorImpl.getNext(CursorImpl.java:1563)
            at com.sleepycat.je.cleaner.FileProcessor.processFile(FileProcessor.java:445)
            at com.sleepycat.je.cleaner.FileProcessor.onWakeup(FileProcessor.java:148)
            at com.sleepycat.je.utilint.DaemonThread.run(DaemonThread.java:162)
            ... 1 more
            Caused by: java.io.FileNotFoundException: /opt/fxall/Aggie-1.10.103/TradingGateway_1.10.103/bin/../../../logs/TG_REGION/
            0000020a.jdb (No such file or directory)
            at java.io.RandomAccessFile.open(Native Method)
            at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
            at java.io.RandomAccessFile.<init>(RandomAccessFile.java:101)
            at com.sleepycat.je.log.FileManager$6.<init>(FileManager.java:1282)
            at com.sleepycat.je.log.FileManager.openFileHandle(FileManager.java:1281)
            at com.sleepycat.je.log.FileManager.getFileHandle(FileManager.java:1147)
            at com.sleepycat.je.log.LogManager.getLogSource(LogManager.java:1102)
            at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:808)
            at com.sleepycat.je.log.LogManager.getLogEntryAllowInvisibleAtRecovery(LogManager.java:772)
            at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1412)
            at com.sleepycat.je.tree.BIN.fetchTarget(BIN.java:1251)
            at com.sleepycat.je.dbi.CursorImpl.fetchCurrent(CursorImpl.java:2229)
            at com.sleepycat.je.dbi.CursorImpl.getCurrentAlreadyLatched(CursorImpl.java:1434)
            at com.sleepycat.je.dbi.CursorImpl.getNext(CursorImpl.java:1561)
            at com.sleepycat.je.cleaner.UtilizationProfile.getObsoleteDetail(UtilizationProfile.java:632)
            at com.sleepycat.je.cleaner.FileProcessor.processFile(FileProcessor.java:445)
            at com.sleepycat.je.cleaner.FileProcessor.doClean(FileProcessor.java:295)
            ... 3 more


            4 ) We dont set EnvironmentConfig.CLEANER_MIN_AGE anywhere in the application Code

            Edited by: 1002809 on Apr 26, 2013 12:41 PM
            • 3. Re: BDB  Exceptions when saving large amounts of Data using je 5.0.58.jar
              Bogdan Coman-Oracle
              Could it be that 0000020a.jdb got accidentally deleted (maybe during a backup/restore procedure)?
              Is this the first time you see the error?
              You didn't answer if there were any recent changes that lead to the LOG_FILE_NOT_FOUND error. Did you upgrade recently?

              Thanks,
              Bogdan
              • 4. Re: BDB  Exceptions when saving large amounts of Data using je 5.0.58.jar
                1005812
                hello, we just deployed this build using BDB je 5.0.58 in prodn for the first time..
                we didn't see this problem in UAT but the data in UAT was much smaller

                2) THere are no processes that would delete 0000020a.jdb accidentally.. Is it possible there is a bug in je 5.0.58 for handling larger data ?
                • 5. Re: BDB  Exceptions when saving large amounts of Data using je 5.0.58.jar
                  1005812
                  hello, we just deployed this build using BDB je 5.0.58 in prodn for the first time..
                  we didn't see this problem in UAT but the data in UAT was much smaller

                  2) THere are no processes that would delete 0000020a.jdb accidentally.. Is it possible there is a bug in je 5.0.58 for handling larger data ?
                  • 6. Re: BDB  Exceptions when saving large amounts of Data using je 5.0.58.jar
                    Bogdan Coman-Oracle
                    1002809 wrote:
                    2) THere are no processes that would delete 0000020a.jdb accidentally.. Is it possible there is a bug in je 5.0.58 for handling larger data ?
                    Anything is possible, I wouldn't rule it out, however, I think it's unlikely. We don't know of any similar issues. I'm researching, I'll let you know if I find something.
                    In the meantime do you think you can run in production with lower log files?
                    Also, do you think you can help me reproduce this? Maybe with a test case, maybe provide more details about the platform (OS, filesystem) you're using.

                    Thanks,
                    Bogdan
                    • 7. Re: BDB  Exceptions when saving large amounts of Data using je 5.0.58.jar
                      Bogdan Coman-Oracle
                      A similar issue which proved to be caused by the custom Btree comparator (don't know if you're using one) was reported in the past on the Forum for JE 5.0.55: LOG_FILE_NOT_FOUND bug possible in current BDB JE?
                      • 8. Re: BDB  Exceptions when saving large amounts of Data using je 5.0.58.jar
                        1005812
                        Hello Bogdan,

                        we are running this on this Machine

                        1)
                        Linux nyatint03 2.6.32-358.2.1.el6.x86_64 #1 SMP Wed Feb 20 12:17:37 EST 2013 x86_64 x86_64 x86_64 GNU/Linux

                        It is running the following Linux Kernel..
                        Red Hat Enterprise Linux Server release 6.4 (Santiago)

                        2) we dont beleive we are using any custom comparator etc ...

                        Thanks in advance
                        • 9. Re: BDB  Exceptions when saving large amounts of Data using je 5.0.58.jar
                          Bogdan Coman-Oracle
                          Hi,

                          Thanks for answering my questions. I have additional questions for you:

                          1) We would like to know whether you upgraded the env from JE 4.1 or earlier (i.e., you did not start from scratch with 5.0). And if so, whether you ran the 4.1 upgrade utility first.

                          2) Any luck with reproducing this outside production?

                          3) Would you be willing to send us the 1GB .jdb file privately so we can investigate?

                          Thanks,
                          Bogdan
                          • 10. Re: BDB  Exceptions when saving large amounts of Data using je 5.0.58.jar
                            1005812
                            Hello Bogdan,

                            1) we did start from scratch ie used 5.0.58 je Jar file the very first time in Production.

                            2) We are able to reproduce this in INT (UAT environment as well)

                            3) we will be able to send you .jdb file (Not an issue) . we could ftp it to you //
                            • 11. Re: BDB  Exceptions when saving large amounts of Data using je 5.0.58.jar
                              Bogdan Coman-Oracle
                              1002809 wrote:
                              2) We are able to reproduce this in INT (UAT environment as well)
                              Great! Can you please try to reproduce it with the latest release as well? It has the following fix:
                              +Fixed a bug that caused data corruption (for example, resulting in an EnvironmentFailureException with a LOG_FILE_NOT_FOUND message) under certain circumstances, when many Databases are created and not all Databases are kept open. In fact two bugs were fixed that contributed to the problem. The first bug was a memory size calculation error and caused heavy/costly cache eviction, even when the cache would normally be large enough to avoid such eviction. This calculation error occurred during log cleaning, and only when closed databases had been previously evicted from the cache. The second bug caused Btree corruption when cache eviction was heavy and many closed databases were evicted. [#21686] (JE 5.0.61)+
                              http://download.oracle.com/otndocs/products/berkeleydb/html/je/je-5.0.73_changelog.html

                              >
                              3) we will be able to send you .jdb file (Not an issue) . we could ftp it to you //
                              Yes, sure. If it is reproducible with the latest release (5.0.73), please email me.

                              Thanks,
                              Bogdan

                              Edited by: Bogdan Coman on May 8, 2013 1:35 PM
                              • 12. Re: BDB  Exceptions when saving large amounts of Data using je 5.0.58.jar
                                1005812

                                Bogdan,

                                 

                                After trying to run the same test in INT, we see a different exception ...  This is running with je 5.0.73

                                 

                                 

                                Caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (

                                JE 5.0.73) /opt/fxall/Aggie-1.10.103/TradingGateway_1.10.103/bin/../../../logs/TG_REGION fetchTarget of 0x0/0

                                x3f327521 parent IN=289 IN class="com".sleepycat.je.tree.BIN lastFullVersion=0x1/0x317f1c61 lastLoggedVersion=0

                                x1/0x317f1c61 parent.getDirty()=false state=0 LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. En

                                vironment is invalid and must be closed.

                                        at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1429)

                                        at com.sleepycat.je.tree.BIN.fetchTarget(BIN.java:1251)

                                        at com.sleepycat.je.dbi.CursorImpl.fetchCurrent(CursorImpl.java:2261)

                                        at com.sleepycat.je.dbi.CursorImpl.getCurrentAlreadyLatched(CursorImpl.java:1466)

                                        at com.sleepycat.je.dbi.CursorImpl.getNext(CursorImpl.java:1595)

                                        at com.sleepycat.je.cleaner.UtilizationProfile.getObsoleteDetail(UtilizationProfile.java:632)

                                        at com.sleepycat.je.cleaner.FileProcessor.processFile(FileProcessor.java:439)

                                        at com.sleepycat.je.cleaner.FileProcessor.doClean(FileProcessor.java:289)

                                        at com.sleepycat.je.cleaner.FileProcessor.onWakeup(FileProcessor.java:151)

                                        at com.sleepycat.je.utilint.DaemonThread.run(DaemonThread.java:163)

                                        ... 1 more

                                Caused by: java.io.FileNotFoundException: /opt/fxall/Aggie-1.10.103/TradingGateway_1.10.103/bin/../../../logs

                                /TG_REGION/00000000.jdb (No such file or directory)

                                        at java.io.RandomAccessFile.open(Native Method)

                                        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)

                                        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:101)

                                        at com.sleepycat.je.log.FileManager$DefaultRandomAccessFile.<init>(FileManager.java:3209)

                                        at com.sleepycat.je.utilint.DaemonThread.run(DaemonThread.java:163)

                                        ... 1 more

                                Caused by: java.io.FileNotFoundException: /opt/fxall/Aggie-1.10.103/TradingGateway_1.10.103/bin/../../../logs

                                /TG_REGION/00000000.jdb (No such file or directory)

                                        at java.io.RandomAccessFile.open(Native Method)

                                        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)

                                        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:101)

                                        at com.sleepycat.je.log.FileManager$DefaultRandomAccessFile.<init>(FileManager.java:3209)

                                        at com.sleepycat.je.log.FileManager$6.createFile(FileManager.java:3237)

                                        at com.sleepycat.je.log.FileManager.openFileHandle(FileManager.java:1280)

                                        at com.sleepycat.je.log.FileManager.getFileHandle(FileManager.java:1152)

                                        at com.sleepycat.je.log.LogManager.getLogSource(LogManager.java:1141)

                                        at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:845)

                                        at com.sleepycat.je.log.LogManager.getLogEntryAllowInvisibleAtRecovery(LogManager.java:809)

                                        at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1412)

                                        at com.sleepycat.je.tree.BIN.fetchTarget(BIN.java:1251)

                                        at com.sleepycat.je.dbi.CursorImpl.fetchCurrent(CursorImpl.java:2261)

                                        at com.sleepycat.je.dbi.CursorImpl.getCurrentAlreadyLatched(CursorImpl.java:1466)

                                        at com.sleepycat.je.dbi.CursorImpl.getNext(CursorImpl.java:1593)

                                        at com.sleepycat.je.cleaner.UtilizationProfile.getObsoleteDetail(UtilizationProfile.java:632)

                                        at com.sleepycat.je.cleaner.FileProcessor.processFile(FileProcessor.java:439)

                                        at com.sleepycat.je.cleaner.FileProcessor.doClean(FileProcessor.java:289)

                                        at com.sleepycat.je.cleaner.FileProcessor.onWakeup(FileProcessor.java:148)

                                        ... 2 more

                                 

                                It seems that Berkley DB is looking for a 00000000.jdb and that file doesnt seem to exist.. in the HOME ENV Directory

                                ---------------------------------------------------------------------------------------------------------------------------------------------------------------------

                                 

                                [09:56:21][fxall@nyatint03 ~/logs/TG_REGION] ls -ltra

                                total 1045892

                                drwxr-xr-x 85 fxall fxall     892928 Jun 13 18:01 ..

                                -rw-rw-r--  1 fxall fxall 1069999349 Jun 13 19:11 00000001.jdb

                                drwxr-xr-x  2 fxall fxall       4096 Jun 13 19:11 .

                                -rw-rw-r--  1 fxall fxall      80633 Jun 13 19:11 00000002.jdb

                                 

                                 

                                 

                                Please let us know how to proceed further..