32 Replies Latest reply: Jan 2, 2008 7:08 AM by Greybird-Oracle
      • 15. Re: (JE 3.2.44) Couldn't open file
        536712
        Hello!

        Have you had a chance to look at this problem and resolve it? Do you perhaps have a patch we could use? This issue is critical for us. Our current workaround is to re-create the database and restart the jobs, but this isn't a good solution, since the jobs can take several days to complete.
        • 16. Re: (JE 3.2.44) Couldn't open file
          Greybird-Oracle
          Eugene,

          We are actively working on this and giving it a high priority. We have reproduced the problem, but we don't have a fix yet. We will let you know as soon as we have a fix.

          --mark                                                                                                                                                                                                                                                                                                                                                                                                   
          • 17. Re: (JE 3.2.44) Couldn't open file
            614162
            Hi,

            Has there been any progress on this? I'm running into what seems to be the same situation. Our bdb is updated over time and it seems like a file is cleaned up but a reference is still being resolved to the missing file.

            I'm working on getting more output that'll be helpful but I'm hoping that a fix has been found (I'm currently using je-3.2.44 as well).

            --Tyler                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
            • 18. Re: (JE 3.2.44) Couldn't open file
              Greybird-Oracle
              Hi,

              We do have a fix for the original exception at the beginning of this thread:

              com.sleepycat.je.DatabaseException: (JE 3.2.44) Database relations_database14269 id=26 rootLsn=0xffffffff/0xffffffff IN type=DBIN/2 id=29976291 not expected on INList

              We assume, but have not proved, that this subsequently causes a LogFileNotFoundException. The best way for us to know whether the fix applies to your problem is for you to post the stack trace you're getting. Be sure to post the first exception you see, since other exceptions could be side effects. If you're getting something similar to the above exception, then it is very likely that you're seeing the same problem.

              The release with this fix will be available shortly.

              --mark                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
              • 19. Re: (JE 3.2.44) Couldn't open file
                Greybird-Oracle
                I forgot to mention that the fix I referred to is for a problem that only occurs if you're calling Environment.removeDatabase or truncateDatabase. In any case, please post the stack trace you're seeing.

                --mark                                                                                                                                                                                                                                                                                                                                                                                                                                           
                • 20. Re: (JE 3.2.44) Couldn't open file
                  614162
                  Hmmm, the error I was seeing (see below) was closer to the second one. This might be a different issue. I'm never calling Environment.removeDatabase or truncateDatabase.

                  11:44:33,795 ERROR ConditionalProbabilityModelManager:69 - Error getting entry
                  com.sleepycat.je.DatabaseException: (JE 3.2.44) fetchTarget of 0x4e/0x8d2c97 parent IN=35342049 lastFullVersion=0x5e/0x5151e9 parent.getDirty()=false state=0 com.sleepycat.je.log.LogFileNotFoundException: (JE 3.2.44) 0x4e/0x8d2c97 (JE 3.2.44) Couldn't open file /data/env/runtime/1198004400022/0000004e.jdb: /data/env/runtime/1198004400022/0000004e.jdb (No such file or directory)
                       at com.sleepycat.je.tree.IN.fetchTarget(IN.java:963)
                       at com.sleepycat.je.dbi.CursorImpl.searchAndPosition(CursorImpl.java:1965)
                       at com.sleepycat.je.Cursor.searchInternal(Cursor.java:1188)
                       at com.sleepycat.je.Cursor.searchAllowPhantoms(Cursor.java:1158)
                       at com.sleepycat.je.Cursor.search(Cursor.java:1024)
                       at com.sleepycat.je.Database.get(Database.java:548)
                       at com.rr.bdb.BDBUtils.getEntry(BDBUtils.java:96)
                       at com.rr.bdb.BaseModelManager.getEntry(BaseModelManager.java:104)
                       at
                  • 21. Re: (JE 3.2.44) Couldn't open file
                    Greybird-Oracle
                    Hmmm, the error I was seeing (see below) was closer
                    to the second one. This might be a different issue.
                    I'm never calling Environment.removeDatabase or
                    truncateDatabase
                    Since you're not using removeDatabase or truncateDatabase, it's a different problem and not one we are aware of. The exception is a side effect of the real problem, which would have occurred much earlier, making it difficult to diagnose.

                    A couple initial questions:

                    1) Are you using DatabaseConfig.setDeferredWrite(true)?
                    2) Are you using DatabaseConfig.setSortedDuplicates(true)?
                    3) Was your entire log written using JE 3.2.44, or were older log files written using earlier versions of JE? If the latter, which versions of JE?

                    To debug this, could you please do the following:

                    A) Set je.cleaner.expunge to false using your je.properties file or using EnvironmentConfig.setConfigParam.
                    B) Reproduce the problem.
                    C) Save the entire set of log files.
                    D) Send us email (mark.hayes) with the exceptions you get, and we'll ask you to upload specific log files or all log files if there aren't too many of them.
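For step A, here is a minimal sketch of the je.properties route. It assumes that JE reads je.properties from the environment home directory; the class and method names (ExpungeOff, withExpungeDisabled, writeTo) are hypothetical helpers and not part of JE. The only thing taken from this thread is the parameter itself, je.cleaner.expunge=false.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper, not part of JE: turns off expunging so cleaned log files are kept.
final class ExpungeOff {

    // Returns a copy of the given settings with je.cleaner.expunge forced to "false".
    static Properties withExpungeDisabled(Properties existing) {
        Properties copy = new Properties();
        copy.putAll(existing);
        copy.setProperty("je.cleaner.expunge", "false");
        return copy;
    }

    // Reads envHome/je.properties if present, applies the setting, and writes it back.
    // Assumption: JE picks up je.properties from the environment home directory.
    // Note that Properties.store rewrites the file, so hand-written comments are lost.
    static Path writeTo(Path envHome) throws IOException {
        Path file = envHome.resolve("je.properties");
        Properties current = new Properties();
        if (Files.exists(file)) {
            try (Reader in = Files.newBufferedReader(file, StandardCharsets.ISO_8859_1)) {
                current.load(in);
            }
        }
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.ISO_8859_1)) {
            withExpungeDisabled(current).store(out, "debug: keep cleaned log files");
        }
        return file;
    }
}
```

This would need to run before the environment is opened and, for the log files to be useful, before the environment directory is first written. The programmatic alternative is the EnvironmentConfig.setConfigParam call mentioned in step A.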

                    --mark
                    • 22. Re: (JE 3.2.44) Couldn't open file
                      614162
                      Thanks for the quick response!

                      I'll answer what I can now and try and get the rest of the info as soon as I can.

                      1. We are using DatabaseConfig.setDeferredWrite(true)
                      2. We are not using DatabaseConfig.setSortedDuplicates(true)
                      3. JE 3.2.44 was used for the entire log/bdb

                      Our process is essentially broken into two parts:
                      a. One server reads from a DB and creates or updates a bdb (it creates a new one from scratch every night and then updates it every hour). Once the bdb has been created or updated, it is published to another server.
                      b. The second server opens the bdb as read-only and uses it as a datasource.

                      I've seen this error on both servers (server a reads and writes) after a few updates have occurred. I'm going to set up a test env to reproduce this and will upload the files as soon as I can produce 'em.

                      Thanks,
                      Tyler
                      • 23. Re: (JE 3.2.44) Couldn't open file
                        536712
                        2) Are you using DatabaseConfig.setSortedDuplicates(true)?
                        We are using this for secondary databases, if that matters.
                        • 24. Re: (JE 3.2.44) Couldn't open file
                          Greybird-Oracle
                          Our process is essentially broken into two parts:
                          a. one server reads from a DB and creates or updates
                          a bdb. (it creates a new one from scratch every night
                          and then updates every hour). Once the bdb has been
                          created/updated, it is published to another server.
                          b. the second server opens the bdb as read only and
                          uses it as a datasource
                          When you say "published to another server" do you mean that the Environment directory is copied, or is it shared via a network file system?

                          If you're copying the files and JE operations may be active during the copy, be sure to follow the rules described for the DbBackup utility:

                          http://www.oracle.com/technology/documentation/berkeley-db/je/java/com/sleepycat/je/util/DbBackup.html

                          If you're sharing the directory over the network, you'll have to make sure that log cleaning does not occur while the reader process is active. See:

                          http://www.oracle.com/technology/products/berkeley-db/faq/je_faq.html#1

                          Both of the above are possible reasons you could get LogFileNotFound in the reader process. They wouldn't explain how you could get LogFileNotFound in the writer process.

                          --mark
                          • 25. Re: (JE 3.2.44) Couldn't open file
                            614162
                            Good question.

                            We are copying over the entire directory. There are no JE operations during this process. Before publishing we are syncing and closing all databases and the environment first.

                            So I don't think that's the issue, especially since we see it on the writing side as well.
                            • 26. Re: (JE 3.2.44) Couldn't open file
                              Greybird-Oracle
                              OK, thanks. Then we'll have to look at log files.
                              --mark                                                                                                                                                                                                       
                              • 27. Re: (JE 3.2.44) Couldn't open file
                                614162
                                So I'm able to recreate this locally by looking up a specific value. I set:

                                envConfig.setConfigParam("je.cleaner.expunge", "false");

                                The database within this environment is named "cp_cp_vp_2"

                                The key was a long binding of 80301

                                the stack trace looks the same:

                                0:20:27,593 ERROR ConditionalProbabilityModelManager:69 - Error getting entry for site 2 and product 80301
                                com.sleepycat.je.DatabaseException: (JE 3.2.44) fetchTarget of 0xc2/0x83905b parent IN=42160007 lastFullVersion=0xdf/0x499f79 parent.getDirty()=false state=0 com.sleepycat.je.log.LogFileNotFoundException: (JE 3.2.44) 0xc2/0x83905b (JE 3.2.44) Couldn't open file \data\rrEnv\runtime\1198141200096\000000c2.jdb: \data\rrEnv\runtime\1198141200096\000000c2.jdb (The system cannot find the file specified)
                                     at com.sleepycat.je.tree.IN.fetchTarget(IN.java:963)
                                     at com.sleepycat.je.dbi.CursorImpl.searchAndPosition(CursorImpl.java:1965)
                                     at com.sleepycat.je.Cursor.searchInternal(Cursor.java:1188)
                                     at com.sleepycat.je.Cursor.searchAllowPhantoms(Cursor.java:1158)
                                     at com.sleepycat.je.Cursor.search(Cursor.java:1024)
                                     at com.sleepycat.je.Database.get(Database.java:548)
                                     at com.rr.bdb.BDBUtils.getEntry(BDBUtils.java:97)
                                     at com.rr.bdb.BaseModelManager.getEntry(BaseModelManager.java:104)

                                The bdb in its entirety is over 70MB zipped. The directory listing is:

                                [candiru@bertha 1198141200096]$ ls -l
                                total 234856
                                -rw-rw-r-- 1 candiru candiru 9999629 Dec 20 01:00 0000000a.jdb
                                -rw-rw-r-- 1 candiru candiru 10000000 Dec 20 01:00 000000c3.jdb
                                -rw-rw-r-- 1 candiru candiru 9999881 Dec 20 01:00 000000c4.jdb
                                -rw-rw-r-- 1 candiru candiru 9999977 Dec 20 01:00 000000c5.jdb
                                -rw-rw-r-- 1 candiru candiru 9999926 Dec 20 01:00 000000c6.jdb
                                -rw-rw-r-- 1 candiru candiru 9999847 Dec 20 01:00 000000c7.jdb
                                -rw-rw-r-- 1 candiru candiru 9999888 Dec 20 01:00 000000c8.jdb
                                -rw-rw-r-- 1 candiru candiru 9999975 Dec 20 01:00 000000c9.jdb
                                -rw-rw-r-- 1 candiru candiru 9998994 Dec 20 01:00 000000ca.jdb
                                -rw-rw-r-- 1 candiru candiru 9999957 Dec 20 01:00 000000cb.jdb
                                -rw-rw-r-- 1 candiru candiru 9999129 Dec 20 01:00 000000cc.jdb
                                -rw-rw-r-- 1 candiru candiru 9999840 Dec 20 01:00 000000cd.jdb
                                -rw-rw-r-- 1 candiru candiru 9999959 Dec 20 01:00 000000ce.jdb
                                -rw-rw-r-- 1 candiru candiru 9998690 Dec 20 01:00 000000cf.jdb
                                -rw-rw-r-- 1 candiru candiru 9999996 Dec 20 01:00 000000d6.jdb
                                -rw-rw-r-- 1 candiru candiru 9998399 Dec 20 01:06 000000d7.jdb
                                -rw-rw-r-- 1 candiru candiru 9999991 Dec 20 01:13 000000d8.jdb
                                -rw-rw-r-- 1 candiru candiru 9999559 Dec 20 01:13 000000d9.jdb
                                -rw-rw-r-- 1 candiru candiru 9999307 Dec 20 01:13 000000da.jdb
                                -rw-rw-r-- 1 candiru candiru 9999650 Dec 20 01:13 000000db.jdb
                                -rw-rw-r-- 1 candiru candiru 9999516 Dec 20 01:13 000000dc.jdb
                                -rw-rw-r-- 1 candiru candiru 9999823 Dec 20 01:13 000000dd.jdb
                                -rw-rw-r-- 1 candiru candiru 9999679 Dec 20 01:13 000000de.jdb
                                -rw-rw-r-- 1 candiru candiru 9998161 Dec 20 01:13 000000df.jdb
                                -rw-rw-r-- 1 candiru candiru 41465 Dec 20 01:13 000000e0.jdb
                                -rw-rw-r-- 1 candiru candiru 0 Dec 20 01:00 je.lck
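To relate the exception to the listing above: judging by the two traces in this thread, the part of an LSN before the slash is the log file number in hex, and the file name is that number padded to eight hex digits plus .jdb (0xc2/0x83905b goes with 000000c2.jdb). A small sketch of that mapping follows; LsnFiles and its methods are hypothetical names, not JE API.

```java
import java.util.List;

// Hypothetical helper, not part of JE: maps an LSN string to its log file name.
final class LsnFiles {

    // "0xc2/0x83905b" -> "000000c2.jdb"; the hex value before '/' is the file number.
    static String logFileNameFor(String lsn) {
        int slash = lsn.indexOf('/');
        if (slash < 0 || !lsn.startsWith("0x")) {
            throw new IllegalArgumentException("expected 0x<file>/0x<offset>, got: " + lsn);
        }
        long fileNumber = Long.parseLong(lsn.substring(2, slash), 16);
        return String.format("%08x.jdb", fileNumber);
    }

    // True if the log file for this LSN appears in a list of directory entry names.
    static boolean isPresent(String lsn, List<String> directoryListing) {
        return directoryListing.contains(logFileNameFor(lsn));
    }
}
```

Applied to the values above, the fetchTarget LSN 0xc2/0x83905b maps to 000000c2.jdb, which is not in the listing, while lastFullVersion=0xdf/0x499f79 maps to 000000df.jdb, which is.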

                                -Tyler
                                • 28. Re: (JE 3.2.44) Couldn't open file
                                  Greybird-Oracle
                                  Hi Tyler,

                                  Setting expunge to false will cause log files to be renamed to .del rather than deleted by the log cleaner. In your directory listing you don't have any .del files and many files are deleted. So I assume you did not set expunge to false when this environment directory was originally created and written -- is that correct?

                                  I could take a look at what you have, but without the .del files it is likely I'll hit a dead end. Is it practical for you to recreate the problem from scratch with an empty environment directory, with expunge set to false?
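A quick way to check a directory before sending it, sketched under the assumption that log files are numbered consecutively in hex and that a cleaned file keeps its number when renamed to .del (LogListingCheck is a hypothetical name, not JE API). It reports file numbers that fall between the lowest and highest .jdb file but have neither a .jdb nor a .del entry:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper, not part of JE: finds gaps in a list of log file names.
final class LogListingCheck {

    // Returns the 8-digit hex numbers between the lowest and highest .jdb file
    // that have neither a .jdb nor a .del entry. Numbers below the lowest .jdb
    // file are not considered.
    static List<String> missingWithoutDel(List<String> names) {
        TreeSet<Long> jdb = new TreeSet<>();
        TreeSet<Long> del = new TreeSet<>();
        for (String name : names) {
            if (name.matches("[0-9a-fA-F]{8}\\.jdb")) {
                jdb.add(Long.parseLong(name.substring(0, 8), 16));
            } else if (name.matches("[0-9a-fA-F]{8}\\.del")) {
                del.add(Long.parseLong(name.substring(0, 8), 16));
            }
        }
        List<String> missing = new ArrayList<>();
        if (jdb.isEmpty()) {
            return missing;
        }
        for (long n = jdb.first(); n <= jdb.last(); n++) {
            if (!jdb.contains(n) && !del.contains(n)) {
                missing.add(String.format("%08x", n));
            }
        }
        return missing;
    }
}
```

If expunge was false for the whole life of the environment, this list would be expected to come back empty; a non-empty result suggests files were deleted rather than renamed.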

                                  In any case, please email me, mark.hayes (the obvious .com), with further info so we can exchange log files, etc.

                                  Thanks,
                                  --mark                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
                                  • 29. Re: (JE 3.2.44) Couldn't open file
                                    524788
                                    2) Are you using
                                    DatabaseConfig.setSortedDuplicates(true)?

                                    We are using this for secondary databases, if that
                                    matters.
                                    I'm seeing similar problems and am using sorted duplicates but not deferred writes. I see this problem on 3.2.23; however, the database I see the issue with does not use sorted duplicates, and the exception traces feature Database.delete and Database.get.

                                    I'm also making heavy use of transactions set up as follows:

                                    myConfig.setNoSync(true);
                                    myConfig.setNoWait(true);