Forum Stats

  • 3,769,970 Users
  • 2,253,039 Discussions
  • 7,875,254 Comments

Discussions

Issue with DiskOrderedCursor and WriteDeferredDB 5.0.97

2689842
2689842 Member Posts: 2
edited Jun 13, 2014 8:27AM in Berkeley DB Java Edition

Hi,

I am using the following environment configuration:

final EnvironmentConfig envConfig = new EnvironmentConfig();
envConfig.setAllowCreate(true);
envConfig.setSharedCache(true);
envConfig.setCachePercent(10);

envConfig.setConfigParam(EnvironmentConfig.LOG_FILE_MAX, "100000000")

with the following DB config:

final DatabaseConfig databaseConfig = new DatabaseConfig();
databaseConfig.setAllowCreate(true);
databaseConfig.setDeferredWrite(true);

to create the db calling:

environment.openDatabase(NoTransaction, databaseId, databaseConfig);

Then i am simply using put() to place records in the db and using no transaction.

I want to read the records back in any order as quick as possible so am using:

DiskOrderedCursorConfig cursorConfig = new DiskOrderedCursorConfig();

try (DiskOrderedCursor cursor = db.openCursor(cursorConfig)) {

DatabaseEntry nextKey = new DatabaseEntry();
DatabaseEntry nextData = new DatabaseEntry();
OperationStatus status = cursor.getNext(nextKey, nextData, null);

while (SUCCESS == status) {

//.do some stuff with entries

status = cursor.getNext(nextKey, nextData, null);}}

The problem i am having is that i can read the first record everytime and then i get the following:

com.sleepycat.je.EnvironmentFailureException: (JE 5.0.97) Environment must be closed, caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 5.0.97) F:\calculationservice\workspace\workflow-uatcrttorqt2wf3\replication java.io.FileNotFoundException: F:\calculationservice\workspace\workflow-uatcrttorqt2wf3\replication\ffffffff.jdb (The system cannot find the file specified) LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.

    at com.sleepycat.je.EnvironmentFailureException.wrapSelf(EnvironmentFailureException.java:210)

    at com.sleepycat.je.dbi.EnvironmentImpl.checkIfInvalid(EnvironmentImpl.java:1594)

    at com.sleepycat.je.dbi.DiskOrderedCursorImpl.checkEnv(DiskOrderedCursorImpl.java:234)

    at com.sleepycat.je.DiskOrderedCursor.checkState(DiskOrderedCursor.java:367)

    at com.sleepycat.je.DiskOrderedCursor.getNext(DiskOrderedCursor.java:324)

    at rbccm.workflow.dependency.resiliency.replication.BerkleyReplicatedThreadStackStore.allThreadStacksFor(BerkleyReplicatedThreadStackStore.java:133)

    at rbccm.workflow.dependency.resiliency.replication.ReplicatedThreadStackRepository.restore(ReplicatedThreadStackRepository.java:180)

    at rbccm.workflow.dependency.resiliency.replication.ReplicatedThreadStackRepository.restore(ReplicatedThreadStackRepository.java:169)

    at rbccm.workflow.dependency.resiliency.replication.ReplicatedThreadStackRestarter.tryRestore(ReplicatedThreadStackRestarter.java:67)

    at rbccm.workflow.dependency.resiliency.replication.ReplicatedThreadStackRestarter.restoreAndRestart(ReplicatedThreadStackRestarter.java:60)

    at rbccm.workflow.dependency.resiliency.replication.ThreadStackReplicationControllerAndMonitor.restoreAndPauseReplication(ThreadStackReplicationControllerAndMonitor.java:47)

    at rbccm.workflow.dependency.resiliency.ThreadStackResiliencyMonitorAndController.restoreAndPauseReplication(ThreadStackResiliencyMonitorAndController.java:90)

    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

    at java.lang.reflect.Method.invoke(Method.java:601)

    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:111)

    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:45)

    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:235)

    at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)

    at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)

    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)

    at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:792)

    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1486)

    at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:96)

    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1327)

    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1419)

    at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:847)

    at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source)

    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

    at java.lang.reflect.Method.invoke(Method.java:601)

    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)

    at sun.rmi.transport.Transport$1.run(Transport.java:177)

    at sun.rmi.transport.Transport$1.run(Transport.java:174)

    at java.security.AccessController.doPrivileged(Native Method)

    at sun.rmi.transport.Transport.serviceCall(Transport.java:173)

    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)

    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)

    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)

    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

    at java.lang.Thread.run(Thread.java:722)

    Suppressed: com.sleepycat.je.EnvironmentFailureException: (JE 5.0.97) Environment must be closed, caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 5.0.97) F:\calculationservice\workspace\workflow-uatcrttorqt2wf3\replication java.io.FileNotFoundException: F:\calculationservice\workspace\workflow-uatcrttorqt2wf3\replication\ffffffff.jdb (The system cannot find the file specified) LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.

        at com.sleepycat.je.EnvironmentFailureException.wrapSelf(EnvironmentFailureException.java:210)

        at com.sleepycat.je.dbi.EnvironmentImpl.checkIfInvalid(EnvironmentImpl.java:1594)

        at com.sleepycat.je.dbi.DiskOrderedCursorImpl.checkEnv(DiskOrderedCursorImpl.java:234)

        at com.sleepycat.je.DiskOrderedCursor.close(DiskOrderedCursor.java:214)

        at rbccm.workflow.dependency.resiliency.replication.BerkleyReplicatedThreadStackStore.allThreadStacksFor(BerkleyReplicatedThreadStackStore.java:139)

        ... 36 more

    Caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 5.0.97) F:\calculationservice\workspace\workflow-uatcrttorqt2wf3\replication java.io.FileNotFoundException: F:\calculationservice\workspace\workflow-uatcrttorqt2wf3\replication\ffffffff.jdb (The system cannot find the file specified) LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.

        at com.sleepycat.je.log.LogManager.getLogEntryHandleFileNotFound(LogManager.java:920)

        at com.sleepycat.je.dbi.DiskOrderedScanner.fetchEntry(DiskOrderedScanner.java:575)

        at com.sleepycat.je.dbi.DiskOrderedScanner.fetchAndProcessLNs(DiskOrderedScanner.java:511)

        at com.sleepycat.je.dbi.DiskOrderedScanner.scan(DiskOrderedScanner.java:261)

        at com.sleepycat.je.dbi.DiskOrderedCursorImpl$1.run(DiskOrderedCursorImpl.java:129)

    Caused by: java.io.FileNotFoundException: F:\calculationservice\workspace\workflow-uatcrttorqt2wf3\replication\ffffffff.jdb (The system cannot find the file specified)

        at java.io.RandomAccessFile.open(Native Method)

        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)

        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:118)

        at com.sleepycat.je.log.FileManager$DefaultRandomAccessFile.<init>(FileManager.java:3179)

        at com.sleepycat.je.log.FileManager$6.createFile(FileManager.java:3207)

        at com.sleepycat.je.log.FileManager.openFileHandle(FileManager.java:1276)

        at com.sleepycat.je.log.FileManager.getFileHandle(FileManager.java:1148)

        at com.sleepycat.je.log.LogManager.getLogSource(LogManager.java:1199)

        at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:903)

        at com.sleepycat.je.log.LogManager.getLogEntry(LogManager.java:857)

        at com.sleepycat.je.log.LogManager.getLogEntryHandleFileNotFound(LogManager.java:918)

        ... 4 more

    [CIRCULAR REFERENCE:com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 5.0.97) F:\calculationservice\workspace\workflow-uatcrttorqt2wf3\replication java.io.FileNotFoundException: F:\calculationservice\workspace\workflow-uatcrttorqt2wf3\replication\ffffffff.jdb (The system cannot find the file specified) LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed.]

This only seems to happen i have put a large number of records into the DB as sometimes it does work succesfully. When performing the cursor scan nothing else is running against the db.

When I use a normal cursor with null tx and null config everything works fine regardless of the number of records..

Its also important to note that if i do a db.sync before the cursor scan I do not get an exception.

Please  could you advise?

Many thanks in advance

Dan

Tagged:

Answers

  • Hi,

    Thanks for reporting this issue. Yes, this is indeed a bug, specific to deferred-write databases and DiscOrderedCursor. We think we know what the problem is. We will work on a fix and let you know.

    Yes, db.sync does indeed "fix" the problem, but it is just a rather expensive workaround.

    Another thing that may help is to increase the size of the queue used by the DiscOrderedCursor (the queue size can be set via DiscOrderedCursorConfig; the default is 1000). However, this will, in general, consume more memory, and since the bug occurs when the queue gets full, you may still run into it.

    Best,

    Markos.

    User733634-Oracle
  • 2689842
    2689842 Member Posts: 2

    I have seen it work when the database has only a small number of entries is this because the queue is not getting full? if i set the queue to be the same as the number of entries would this help?

  • Hi again,

    Sorry for the late reply. We are very close to our next release deadline and things are rather busy these days.

    Yes, making the queue size the same as the number of records should fix the problem. But you may be consuming too much memory, depending on the number of records, their size, and also on the relative "speed" of the producer and consumer threads. Here is a rough explanation of what is going on:

    The producer thread is one created internally by the DiskOrderedCursor. It scans the DB and puts the records in the queue. The consumer thread is your application thread; the one that calls the getNext() method. It removes records from the queue. The producer will "immediately" put in the queue records that are already cached in memory, as long as the queue is not full. If the queue is full, it will remember the physical (on-disk) address of the record, and will read the record from disk at a later time. The problem with a deferred-write (DW) DB is that some records may have not been written at all to disk by the time the DiskOrderedCursor starts, and as a result, they have invalid disk addresses. Nevertheless , the current implementation will remember these invalid addresses (if the queue is full) and will crash when it tries to use them later.

    Our current simple fix is to just skip records that have never been written to disk. The fix will be in the next JE release (6.1), which should be available in about a month. Can you wait until then to receive the fix? In the mean time, you could use db.sync, or play with the queue size, or use a non-transactional, non-DW DB. Because DW can be used with any non-transactional DB, after you upgrade to JE-6.1, you can open and use this non-transactional DB in DW mode, without the need to convert any data.

    Notice that the above fix is compatible with the current API contract of DiskOrderedCursor, which states that the result of a DOS may be missing some records (please see the javadoc for DiskOrderedCursor). We have ideas about greatly reducing the DOS inconsistencies described in the javadoc, but I cannot say right now when those will be implemented (certainly not in 6.1).

    I hope this helps,

    Markos.

This discussion has been closed.