I guess "com.sleepycat.je.LockTimeoutException: (JE 4.0.92) Lock expired. Locker 1723973692 124031744_voldemort-niosocket-server14_Txn: waited for lock on database=store1 LockAddr:924552162 node=87042505 type=WRITE grant=WAIT_NEW timeoutMillis=500 startTime=1349656528721 endTime=1349656529221" is the exception.toString() ?Nope. The toString gives you the stack trace of the owner and the waiters, due to the debug params you set.
Also, can you please explain the relationship to duplicates? Is this a known issue? I recall that you mentioned the locking of the dupCountLN before. But given that is only 1% of these exceptions, I am curious to know how duplicates impact the transaction locking.It is not a known issue, and I can only give you a little more information about what's happening based on the second exception you posted, the one that shows an owner and one waiter. For both the owner and the waiter, a lookup is being performed: BdbStorageEngine.put is calling Cursor.getSearchKey.
The app does have a lot of concurrent writes on the same key i.e multiple simultaeneous transactiions (where we do a searchKey, delete, put). So, is the issue because its waiting for the delete to be compressed?This is because the first record (first duplicate for the key) found was deleted but not yet compressed away (deleted records are compressed out of the Btree asynchronously).
The app does have a lot of concurrent writes on the same key i.e multiple simultaeneous transactiions (where we do a searchKey, delete, put). So, is the issue because its waiting for the delete to be compressed?No. The uncompressed/deleted record will simply be locked, examined, and skipped over by cursors. As I mentioned, in both stack traces the deleted record was already skipped over when the timeout occurred. I mention this because it tells you something about what's happening in the app: that the app deleted a record, and other threads are trying to access records with the same key, and this is all happening in a short time period.
If so, when does the compression happen or who does it?Compression of deleted records is performed in a JE background thread, but any delay in compressing a deleted record will not cause blocking.
I did not make any code changes for this. We just catch a DatabaseException and log it using log4j. https://github.com/voldemort/voldemort/blob/master/src/java/voldemort/store/bdb/BdbStorageEngine.java#L381You're logging just the toString of the exception here, since you're calling this log4j method:
Let me try to confirm the missing information, so I can try adding it to the code and roll it out eventually. So, the exception that is logged was thrown from one of the threads calling trying to do the transaction. Currently, JE just dumps out every stack trace except the erroring thread? So, doing a e.printStackTrace() on the exception thrown would complete the picture?Yes.