1 Reply. Latest reply: Dec 2, 2013 8:41 AM by greybird

    LockTimeoutException trying to promote a lock when using a Sequence?

    23bb75ba-864b-4afe-9d91-793ea98d2f3a

      Hi, I'm using JE 5.0.73

       

      I have 4 threads writing to a database, and each thread attempts to get a value from the same sequence. Intermittently (as is the nature of such problems), I get a LockTimeoutException when the threads are calling com.sleepycat.je.Sequence.get(Sequence.java:341).

       

      Here is the exception output for 3 of the threads (the fourth doesn't throw an exception and completes fine):

       

      Thread "TradeWriter-1":

      com.sleepycat.je.LockTimeoutException: (JE 5.0.73) Lock expired. Locker 1163009300 3095544_TradeWriter-1_Txn: waited for lock on database=sequences LockAddr:130120503 LSN=0x67/0x565e32 type=WRITE grant=WAIT_PROMOTION timeoutMillis=10000 startTime=1385946530364 endTime=1385946540364

      Owners: [<LockInfo locker="1958359210 3095546_TradeWriter-3_Txn" type="READ"/>, <LockInfo locker="1163009300 3095544_TradeWriter-1_Txn" type="READ"/>, <LockInfo locker="301150106 3095545_TradeWriter-2_Txn" type="READ"/>]

      Waiters: []

      Transaction 1163009300 3095544_TradeWriter-1_Txn owns LockAddr:130120503 <LockInfo locker="1163009300 3095544_TradeWriter-1_Txn" type="READ"/>

      Transaction 1163009300 3095544_TradeWriter-1_Txn waits for LockAddr:130120503

       

      Thread "TradeWriter-0":

      com.sleepycat.je.LockTimeoutException: (JE 5.0.73) Lock expired. Locker 1087485625 3095549_TradeWriter-0_Txn: waited for lock on database=sequences LockAddr:130120503 LSN=0x67/0x565e32 type=WRITE grant=WAIT_NEW timeoutMillis=10000 startTime=1385946540366 endTime=1385946550367

      Owners: [<LockInfo locker="1958359210 3095546_TradeWriter-3_Txn" type="READ"/>, <LockInfo locker="301150106 3095545_TradeWriter-2_Txn" type="READ"/>]

      Waiters: []

       

      Thread "TradeWriter-3":

      com.sleepycat.je.LockTimeoutException: (JE 5.0.73) Lock expired. Locker 1958359210 3095546_TradeWriter-3_Txn: waited for lock on database=sequences LockAddr:130120503 LSN=0x67/0x565e32 type=WRITE grant=WAIT_PROMOTION timeoutMillis=10000 startTime=1385946550368 endTime=1385946560368

      Owners: [<LockInfo locker="1958359210 3095546_TradeWriter-3_Txn" type="READ"/>, <LockInfo locker="301150106 3095545_TradeWriter-2_Txn" type="READ"/>]

      Waiters: []

      Transaction 1958359210 3095546_TradeWriter-3_Txn owns LockAddr:130120503 <LockInfo locker="1958359210 3095546_TradeWriter-3_Txn" type="READ"/>

      Transaction 1958359210 3095546_TradeWriter-3_Txn waits for LockAddr:130120503

       

      If my interpretation is correct, it looks like two of the threads are trying to promote a READ lock to a WRITE lock, which they can't do because other threads still hold READ locks on the same record, and this is causing a deadlock. Is this correct? If so, how can this be avoided? The exception stack trace for all three threads is exactly the same. Here's the top of it:

       

        at com.sleepycat.je.txn.LockManager.newLockTimeoutException(LockManager.java:664) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.txn.LockManager.makeTimeoutMsgInternal(LockManager.java:623) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.txn.SyncedLockManager.makeTimeoutMsg(SyncedLockManager.java:97) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.txn.LockManager.lockInternal(LockManager.java:390) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.txn.LockManager.lock(LockManager.java:276) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.txn.Txn.lockInternal(Txn.java:498) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.txn.Locker.lock(Locker.java:443) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.dbi.CursorImpl.lockLN(CursorImpl.java:2621) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.dbi.CursorImpl.lockLN(CursorImpl.java:2422) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.dbi.CursorImpl.searchAndPosition(CursorImpl.java:2150) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.Cursor.searchInternal(Cursor.java:2666) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.Cursor.searchAllowPhantoms(Cursor.java:2576) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.Cursor.searchNoDups(Cursor.java:2430) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.Cursor.search(Cursor.java:2397) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.Cursor.getSearchKey(Cursor.java:1668) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.Sequence.readData(Sequence.java:542) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.Sequence.readDataRequired(Sequence.java:527) ~[je-5.0.73.jar:5.0.73]

        at com.sleepycat.je.Sequence.get(Sequence.java:341) ~[je-5.0.73.jar:5.0.73]

        • 1. Re: LockTimeoutException trying to promote a lock when using a Sequence?
          greybird

          Internally a Sequence uses LockMode.RMW when reading to avoid lock promotion problems, so that's not it.  It looks like you're passing a transaction to Sequence.get, which holds the lock until you commit the transaction.  I suspect you're using that transaction for other operations, and this may be deadlock prone.
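
          For readers unfamiliar with the term, here is a rough sketch of what a lock promotion problem looks like and why taking the write lock up front (the idea behind a read-modify-write lock mode) sidesteps it. ToyLock and the locker names T1 and T2 are hypothetical, made up purely for illustration; this is a single-threaded toy model, not JE's lock manager, and it only reports whether a request would be granted instead of actually blocking.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of a single record lock. NOT JE's lock manager. */
public class LockPromotionSketch {

    /** Held either by any number of readers or by exactly one writer. */
    static final class ToyLock {
        private final Set<String> readers = new HashSet<>();
        private String writer; // null when nobody holds the write lock

        /** Grants a shared read lock unless another locker holds the write lock. */
        boolean tryRead(String locker) {
            if (writer != null && !writer.equals(locker)) {
                return false;
            }
            readers.add(locker);
            return true;
        }

        /**
         * Grants the write lock only if no OTHER locker holds the lock in any mode.
         * If the caller already holds a read lock, this request is a promotion.
         */
        boolean tryWrite(String locker) {
            if (writer != null && !writer.equals(locker)) {
                return false;
            }
            for (String reader : readers) {
                if (!reader.equals(locker)) {
                    return false;
                }
            }
            writer = locker;
            return true;
        }

        /** Drops every lock the given locker holds. */
        void release(String locker) {
            readers.remove(locker);
            if (locker.equals(writer)) {
                writer = null;
            }
        }
    }

    /** Read first, then promote: each locker is blocked by the other's read lock. */
    static boolean readThenPromoteDeadlocks() {
        ToyLock lock = new ToyLock();
        lock.tryRead("T1");
        lock.tryRead("T2");
        boolean t1Promoted = lock.tryWrite("T1"); // refused: T2 still reads
        boolean t2Promoted = lock.tryWrite("T2"); // refused: T1 still reads
        return !t1Promoted && !t2Promoted;        // neither can ever proceed
    }

    /** Write lock requested up front: lockers are serialized, nobody deadlocks. */
    static boolean writeFirstCompletes() {
        ToyLock lock = new ToyLock();
        boolean t1Granted = lock.tryWrite("T1");  // granted immediately
        boolean t2Waits = !lock.tryWrite("T2");   // T2 must wait, but holds nothing
        lock.release("T1");                       // T1 finishes its update
        boolean t2Granted = lock.tryWrite("T2");  // T2 is granted next
        return t1Granted && t2Waits && t2Granted;
    }

    public static void main(String[] args) {
        System.out.println("read-then-promote deadlocks: " + readThenPromoteDeadlocks());
        System.out.println("write-first completes: " + writeFirstCompletes());
    }
}
```

          In the first scenario each locker waits on a read lock that the other will never release, so only a timeout can break the cycle. In the second, the waiting locker holds nothing, so the first one can always finish.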

           

          Even though Sequence.get allows passing a transaction for ultimate flexibility, normally this is not a good idea and has little benefit.  Instead, pass null to use auto-commit for updating the sequence, or use a separate transaction for updating the sequence.  If your user transaction aborts, the sequence won't be rolled back, but this is normally not a drawback -- it only means that in this rare case, some sequence values will be unused.

           

          If the reason you're passing a transaction is to control durability or to use a specific TransactionConfig, then create a transaction just for that purpose and pass it to Sequence.get, and commit it immediately afterwards.  Or, if you simply want NO_SYNC durability, open the Sequence database as a non-transactional database and pass null for the transaction param.  The latter approach is what I recommend, assuming that you're not using HA to replicate the Sequence database (in which case it would have to be transactional).
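
          A minimal sketch of the first two options, assuming a transactional environment and database. The class name, the temporary directory, and the "tradeId" sequence key are assumptions for illustration only (the "sequences" database name is taken from the exception output above), and the code needs the JE jar on the classpath.

```java
import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.Durability;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.je.Sequence;
import com.sleepycat.je.SequenceConfig;
import com.sleepycat.je.Transaction;
import com.sleepycat.je.TransactionConfig;

import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class SequenceTxnSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical environment home; a real application has its own directory.
        File home = Files.createTempDirectory("je-seq-sketch").toFile();

        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true);
        envConfig.setTransactional(true);
        Environment env = new Environment(home, envConfig);

        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setAllowCreate(true);
        dbConfig.setTransactional(true);
        Database seqDb = env.openDatabase(null, "sequences", dbConfig);

        SequenceConfig seqConfig = new SequenceConfig();
        seqConfig.setAllowCreate(true);
        // "tradeId" is an assumed key name for the sequence record.
        DatabaseEntry key = new DatabaseEntry("tradeId".getBytes(StandardCharsets.UTF_8));
        Sequence seq = seqDb.openSequence(null, key, seqConfig);

        // Option 1: pass null so the sequence update is auto-committed and its
        // lock is not held for the lifetime of the user transaction.
        long id1 = seq.get(null, 1);

        // Option 2: a short transaction used only for the sequence, here with
        // NO_SYNC durability, committed immediately after the get.
        TransactionConfig seqTxnConfig = new TransactionConfig();
        seqTxnConfig.setDurability(Durability.COMMIT_NO_SYNC);
        Transaction seqTxn = env.beginTransaction(null, seqTxnConfig);
        long id2;
        try {
            id2 = seq.get(seqTxn, 1);
            seqTxn.commit();
        } catch (RuntimeException e) {
            seqTxn.abort();
            throw e;
        }

        // The user transaction does its own writes and never touches the
        // sequence record. If it aborts, id1 and id2 are simply left unused.
        Transaction userTxn = env.beginTransaction(null, null);
        // ... application writes that use id1 / id2 would go here ...
        userTxn.commit();

        System.out.println("allocated ids: " + id1 + ", " + id2);

        seq.close();
        seqDb.close();
        env.close();
    }
}
```

          For the non-transactional variant, the same sketch would presumably change only in calling dbConfig.setTransactional(false) for the sequence database and always passing null to Sequence.get.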

           

          --mark