Are you asking about JE 4.1, 5.0, or both? They behave quite differently.
Oh, you're using JE 4.1, of course, since you've mentioned DBINs.
You probably have no DBINs in your log because you don't have any true duplicates. As I mentioned earlier, in JE 4.1, if there is a single record per key, the LN appears in a BIN. Does that explain why you see BINs in the log?
The higher level INs are written by checkpoints because dirtiness propagates up the tree. If a BIN is logged, its parent IN is dirtied and written during the next checkpoint, and so on up the tree. Does that explain why you see INs in the log?
Don't you see BINDeltas in the log? When an LN is written, there should normally be a delta logged instead of a BIN. Deltas are logged by each checkpoint until the delta threshold is exceeded (see related EnvironmentConfig properties), and then a full BIN is logged.
This is what I see from DbPrintLog:
type,total count,provisional count,total bytes,min bytes,max bytes,avg bytes,entries as % of log
Total bytes in portion of log read: 62,901,748
Total number of entries: 11,212
Per checkpoint interval info:
lnTxn,ln,mapLNTxn,mapLN,end to end,end to start,start to end,maxLNReplay,ckptEnd,
I do see DBINs since, as I said, my writes are guaranteed to be duplicates. But I see that INs and BINs dominate the size of the log. That confuses me, since the BIN should not be dirtied at all in my case: I only change the duplicate tree rooted at the DIN. Does a change to a DBIN also propagate up to the BIN and the IN?
> I do see DBINs since, as I said, my writes are guaranteed to be duplicates. But I see that INs and BINs dominate the size of the log. That confuses me, since the BIN should not be dirtied at all in my case: I only change the duplicate tree rooted at the DIN.
When you said "My workload always does put() for an existing key" I thought you meant you were doing an update. Do you mean you're inserting a new record for a key that already exists?
In JE 4.1, inserts and deletions of a duplicate record will:
1) Write the LN
2) Dirty the DBIN, causing checkpoints to later write a series of deltas, then a full DBIN, then a full DIN, etc.
3) Write the DupCountLN, of which there is one per unique key, and which stores the count of records for that key.
4) Because the DupCountLN is a child of the BIN, the BIN is dirtied, causing checkpoints to later write a series of deltas, then a full BIN, then a full IN, etc.
(All of this is completely different and much less expensive in JE 5.)
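The four steps above can be sketched as a toy model. This is only an illustration of the propagation described in this thread, not JE code: the `Node` and `DupInsertModel` classes are hypothetical, deltas are ignored, and the model assumes that logging a node dirties its parent, which is then written by the next checkpoint.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes are hypothetical and are not part of the JE API.
final class Node {
    final String name;
    final Node parent; // null for the top of the modelled tree
    boolean dirty;

    Node(String name, Node parent) {
        this.name = name;
        this.parent = parent;
    }
}

final class DupInsertModel {
    // Entries written to the log immediately by the insert (steps 1 and 3).
    final List<String> written = new ArrayList<>();

    // Assumed shape for one key with duplicates in JE 4.1: IN -> BIN -> DIN -> DBIN.
    final Node in = new Node("IN", null);
    final Node bin = new Node("BIN", in);
    final Node din = new Node("DIN", bin);
    final Node dbin = new Node("DBIN", din);

    void insertDuplicate() {
        written.add("LN");         // step 1: write the LN
        dbin.dirty = true;         // step 2: the DBIN is dirtied
        written.add("DupCountLN"); // step 3: write the DupCountLN
        bin.dirty = true;          // step 4: the DupCountLN is a child of the BIN
    }

    // One checkpoint: log every node that is dirty now; each logged node
    // dirties its parent, which is picked up by the following checkpoint.
    List<String> checkpoint() {
        List<String> logged = new ArrayList<>();
        List<Node> parentsToDirty = new ArrayList<>();
        for (Node n : new Node[] {dbin, din, bin, in}) {
            if (n.dirty) {
                logged.add(n.name);
                n.dirty = false;
                if (n.parent != null) {
                    parentsToDirty.add(n.parent);
                }
            }
        }
        for (Node p : parentsToDirty) {
            p.dirty = true;
        }
        return logged;
    }

    public static void main(String[] args) {
        DupInsertModel m = new DupInsertModel();
        m.insertDuplicate();
        System.out.println("insert wrote: " + m.written);
        System.out.println("checkpoint 1: " + m.checkpoint());
        System.out.println("checkpoint 2: " + m.checkpoint());
    }
}
```

In this model a single duplicate insert leads to DBIN and BIN being logged by the first checkpoint and DIN and IN by the second, which is the effect behind (2) and (4).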
> Does a change to a DBIN also propagate up to the BIN and the IN??
Yes, it does, due to both (2) and (4) above.
Cool. That helps.
The fact that you're not seeing any deltas is suspicious. Checkpoints normally write deltas, unless you are doing a sync() rather than a checkpoint, or specifying CheckpointConfig.setMinimizeRecoveryTime(true). You should allow the checkpoint to write deltas, to reduce the size occupied by BINs and DBINs.
Also, how often are you doing checkpoints?
Also, when doing this type of test, make sure that no eviction is occurring by checking the JE stats. Eviction of INs will throw off your numbers.
We don't make any explicit JE calls to sync, checkpoint, or clean from our code. Here are our settings:
checkpoint.wakeup.interval = 30s
checkpointer.high.priority=true (This turns off the lazy migration for us)
The IN eviction is less than 2-3%. I see that it's checkpointing roughly 1.2 times per second. I have ~100 puts/sec, and the Btree has 178M entries. The cleaner runs roughly 2 times per second.
The checkpointer logs the dirtied entries, whereas the cleaner logs the "live" entries from a stale file. I just want to confirm that migration by the cleaner will indeed be factored in by the checkpointer when it decides when to checkpoint.
More migration means more frequent checkpoints and more checkpoints mean more frequent cleaner runs (not necessarily migration), right?
> I see that it's checkpointing roughly 1.2 times per sec.
That's an awful lot of checkpointing, but I still don't understand why you're not seeing any BIN deltas at all. Could you post your env config?
> The checkpointer logs the dirtied entries, whereas the cleaner logs the "live" entries from a stale file. I just want to confirm that migration by the cleaner will indeed be factored in by the checkpointer when it decides when to checkpoint.
Correct.
Another tidbit is that BIN deltas are not logged by the checkpointer for a BIN/DBIN that was dirtied by the cleaner. This could reduce the frequency of deltas, but it still shouldn't be zero.
> More migration means more frequent checkpoints and more checkpoints mean more frequent cleaner runs (not necessarily migration), right?
Correct. You may want to bump the checkpoint interval way up and turn off high priority checkpoints -- obviously you don't need this setting to avoid long checkpoints. The byte interval supersedes the wakeup interval.
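As a rough sketch of that suggestion, the settings could be written out as a je.properties fragment (JE reads je.properties from the environment home directory). The parameter names below are the JE ones as I understand them; the 200,000,000 byte value and the `CheckpointSettings` helper are purely illustrative assumptions, not a recommendation from this thread.

```java
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper: builds an illustrative je.properties fragment.
final class CheckpointSettings {

    // bytesInterval is an assumed, illustrative value chosen by the caller.
    static Properties relaxedCheckpointing(long bytesInterval) {
        Properties p = new Properties();
        // Checkpoint after this many bytes of log have been written.
        p.setProperty("je.checkpointer.bytesInterval", Long.toString(bytesInterval));
        // Turn off high priority checkpoints.
        p.setProperty("je.checkpointer.highPriority", "false");
        return p;
    }

    // Render as key=value lines, sorted by key so the output is stable.
    static String toFileText(Properties p) {
        StringBuilder sb = new StringBuilder();
        for (String key : new TreeSet<>(p.stringPropertyNames())) {
            sb.append(key).append('=').append(p.getProperty(key)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(toFileText(relaxedCheckpointing(200_000_000L)));
    }
}
```

The wakeup interval is left out on purpose, since the byte interval supersedes it.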
BDB cache size = 104857600
BDB je.cleaner.threads = 3
BDB je.cleaner.minUtilization = 50
BDB je.cleaner.minFileUtilization = 20
BDB je.log.fileMax = 62914560
BDB checkpoint interval 20MB
and checkpointer_high_priority is actually false, my bad.
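For a sense of scale, the numbers posted so far can be combined in a back-of-the-envelope estimate. This is only a sketch under two assumptions of mine: that "20MB" means 20 * 1024 * 1024 bytes, and that every checkpoint is triggered by the byte interval rather than by the wakeup interval. The `LogRateEstimate` class is hypothetical.

```java
// Hypothetical helper for a rough estimate; not part of JE.
final class LogRateEstimate {

    // Bytes of log written per second if each checkpoint fires after bytesInterval bytes.
    static double bytesPerSecond(double checkpointsPerSec, long bytesInterval) {
        return checkpointsPerSec * bytesInterval;
    }

    // Average log bytes attributable to each put at the given write rate.
    static double bytesPerPut(double bytesPerSec, double putsPerSec) {
        return bytesPerSec / putsPerSec;
    }

    // Seconds needed to fill one log file at the given write rate.
    static double secondsPerLogFile(long fileMaxBytes, double bytesPerSec) {
        return fileMaxBytes / bytesPerSec;
    }

    public static void main(String[] args) {
        long interval = 20L * 1024 * 1024;             // assumed meaning of "20MB"
        double rate = bytesPerSecond(1.2, interval);   // 1.2 checkpoints/sec from the post
        System.out.printf("log bytes/sec: %.0f%n", rate);
        System.out.printf("log bytes/put: %.0f%n", bytesPerPut(rate, 100.0));
        System.out.printf("seconds per 60 MiB file: %.2f%n", secondsPerLogFile(62914560L, rate));
    }
}
```

Under those assumptions the log would grow by roughly 25 MB per second, or about 250 KB per put, which would be consistent with full INs and BINs dominating the log rather than LNs.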
Basically, my test makes a copy of an environment, brings up Voldemort on it, and runs a workload. When I do the print log on the entire set of JDB files, I do see BINDeltas. It's only in the later files (which correspond to the writes I was talking about) that I don't see them.
> When I do the print log on the entire set of JDB files, I do see BINDeltas. It's only in the later files (which correspond to the writes I was talking about) that I don't see them.
Oh, so you do see deltas. Perhaps it's just that in the last section the writes are dirtying more than 25% of the entries per BIN (see EnvironmentConfig.TREE_BIN_DELTA).
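A simplified sketch of the percentage rule mentioned above: a delta is logged only while the share of dirty entries in the BIN stays within the configured percentage, otherwise a full BIN is logged. The `BinDeltaRule` class is hypothetical, the exact boundary handling is an assumption, and JE applies other limits on deltas that are not modelled here.

```java
// Hypothetical, simplified model of the BIN delta percentage check; not JE code.
final class BinDeltaRule {

    // Returns true if a delta would be logged, false if a full BIN would be.
    // binDeltaPercent plays the role of EnvironmentConfig.TREE_BIN_DELTA (25 in this thread).
    // Assumption: a share exactly equal to the threshold still yields a delta.
    static boolean logsDelta(int dirtyEntries, int totalEntries, int binDeltaPercent) {
        return (long) dirtyEntries * 100 <= (long) totalEntries * binDeltaPercent;
    }

    public static void main(String[] args) {
        // 128 entries per BIN is an illustrative figure, not taken from the thread.
        System.out.println("10 of 128 dirty -> delta? " + logsDelta(10, 128, 25));
        System.out.println("40 of 128 dirty -> delta? " + logsDelta(40, 128, 25));
    }
}
```

With a 25% threshold, a workload that touches more than a quarter of the entries in a BIN between checkpoints would produce full BINs and no deltas in this model.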
Yep. Thanks for the pointer, Mark.