I suspect it's creating the duplicate tree for every LN. The numbers of LNs, DINs, and DBINs roughly match up, which leads me to believe this.
DIN,32665083,4671368,260458817830,775,269045,7973,30.20941307672278
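As a sanity check on the DIN line quoted above, here is a minimal sketch that parses it and derives the average entry size and the implied total log size. It assumes the fields follow the summary layout of type, count, provisional count, total bytes, min bytes, max bytes, avg bytes, and percent of log. That column order is an assumption, not something stated in this thread, and the class and method names are hypothetical.

```java
// Sketch only: the column order below is an assumption, not taken from the thread.
final class DinStats {

    // Assumed columns: type, count, provisional count, total bytes,
    // min bytes, max bytes, avg bytes, percent of log.
    static long averageBytes(String csvLine) {
        String[] f = csvLine.split(",");
        long count = Long.parseLong(f[1]);
        long totalBytes = Long.parseLong(f[3]);
        return totalBytes / count; // integer division, truncates
    }

    // Total log size implied by this entry type's byte count and its share of the log.
    static double impliedLogBytes(String csvLine) {
        String[] f = csvLine.split(",");
        long totalBytes = Long.parseLong(f[3]);
        double percentOfLog = Double.parseDouble(f[7]);
        return totalBytes / (percentOfLog / 100.0);
    }

    public static void main(String[] args) {
        String line = "DIN,32665083,4671368,260458817830,775,269045,7973,30.20941307672278";
        System.out.println("average bytes per DIN: " + averageBytes(line));
        System.out.println("implied total log bytes: " + impliedLogBytes(line));
    }
}
```

Under that reading, the truncated average comes out to 7973, which matches the seventh field and is consistent with the assumed layout. DINs alone would then account for roughly 30% of the log.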
The version will never be the same for a delete followed by an insert. So, I don't think this will work.
I was afraid of that.
I am pretty sure the problem is related to the transaction-related fix you mentioned in the JE forums. I tried placing the deletes and inserts in two separate transactions, and the bloat no longer seems to happen. Is it possible to have a fix for this rolled out? If yes, would it also clean up all the extra DBINs/DINs currently in the log? Please let us know.
No, it is not possible to change this behavior, because it is not a bug. It is the correct behavior with duplicates in JE 4.1. If we changed this behavior, it would cause other problems, such as the problem with aborts described in the original bug fix. The incorrect behavior may have resulted in better performance in the particular case where you are doing the delete and insert in a single txn, but it could result in complete data loss for the environment (LOG_FILE_NOT_FOUND, as in the case of the abort bug). We cannot risk data loss.
I am afraid we cannot. The vector clock maintenance is the critical piece of Voldemort's consistency model. So, this is not really an option.
Can you use two separate txns for the delete and insert?
I am sorry to disagree. Your explanation described why a DIN and a DBIN must be created even if K does not have duplicates. But I still don't understand why DBINs and DINs must be created for every such update and never cleaned up. In other words, N transactions like these produce N DINs/DBINs. This will just fill up the disk.
No, it is not possible to change this behavior, because it is not a bug.