4 Replies. Latest reply: Nov 27, 2012 12:03 PM by greybird

CleanerBacklog with lazy_migration=false

vinothchandar Newbie
Hi,

We recently rewrote the BDB storage layer for Voldemort (as advised by the JE team) and are seeing good improvements. The idea is to move as much data as possible off the heap, and hence we attempted to set lazy_migration to off.
However, we noticed that for some databases with fairly low write rates (~100 puts/sec) and repetitive write patterns (a small set of keys written over and over again for different periods of time), the cleaner (1 thread) does not seem to keep up.
We have plenty of IOPS though.

I have tried a few things so far that seem to help:

1. If I turn lazy_migration on, however, the backlog disappears right away. So I am wondering if there is a known cleaning issue in 4.1.x.

2. Setting CLEANER_BACKGROUND_PROACTIVE_MIGRATION on with lazy_migration=off also helps. Some explanation of what exactly happens when this is turned on (especially its impact on foreground operations) would help.

3. BDB5 does not seem to have this issue. Again, it would be nice to know what improvements one can expect in BDB5 in terms of log cleaning, to make sense of this.

Some advice/clarification appreciated.

Thanks
Vinoth
  • 1. Re: CleanerBacklog with lazy_migration=false
    greybird Expert
    Hi Vinoth,

    Welcome back from vacation.
    > We recently rewrote the BDB storage layer for Voldemort (as advised by the JE team) and are seeing good improvements. The idea is to move as much data as possible off the heap, and hence we attempted to set lazy_migration to off.
    As you know, I believe lazy migration should always be off. It has no advantage and several disadvantages.
    > 1. If I turn lazy_migration on, however, the backlog disappears right away. So I am wondering if there is a known cleaning issue in 4.1.x.
    I suspect it is not a cleaning issue. It is simply that JE 4.1 is much less efficient -- there is much more metadata writing -- and therefore more cleaner threads are needed to keep up.

    With lazy migration on, you've moved much of the cleaner's work to the checkpointer, and also to the evictor and app threads if there is eviction. A single cleaner thread may be better able to keep up with lazy migration on, but that's because other threads are assisting. It is likely that overall throughput is lower and/or checkpoints are taking a long time.
    > 2. Setting CLEANER_BACKGROUND_PROACTIVE_MIGRATION on with lazy_migration=off also helps. Some explanation of what exactly happens when this is turned on (especially its impact on foreground operations) would help.
    Same thing. With proactive migration, the evictor threads and app threads are assisting with cleaning during eviction. It seems very likely that you have a lot of eviction.
    > 3. BDB5 does not seem to have this issue. Again, it would be nice to know what improvements one can expect in BDB5 in terms of log cleaning, to make sense of this.
    JE 5 writes much less metadata (Btree internal nodes), so less cleaning is necessary and fewer cleaner threads are needed to keep up.
    > Some advice/clarification appreciated.
    Perhaps with a low write rate, performance is not a big consideration and you'd prefer to use lazy migration in order for the cleaner to keep up. But in general it is much more efficient to increase the number of cleaner threads and turn off lazy migration.

    --mark
  • 2. Re: CleanerBacklog with lazy_migration=false
    vinothchandar Newbie
    Hi Mark,

    Yes, good to be back :)

    What you say makes sense. Let me try lazy_migration=off and increase the number of cleaner threads to see if it helps. On the side, I am going to scope out a way to get us to BDB5. Will keep you posted on this.

    Below are our cleaner/checkpointer configs. Please let me know if you see anything odd.

    je.maxMemory=10737418240
    je.log.faultReadSize=2048
    je.env.isTransactional=true
    je.cleaner.maxBatchFiles=0
    je.cleaner.minUtilization=50
    je.cleaner.lookAheadCacheSize=8192
    je.lock.nLockTables=1
    je.cleaner.threads=1
    je.log.iteratorReadSize=8192
    je.lock.timeout=500 MILLISECONDS
    je.checkpointer.wakeupInterval=30000000
    je.checkpointer.highPriority=false
    je.checkpointer.bytesInterval=20971520
    je.cleaner.lazyMigration=true
    je.cleaner.minFileUtilization=0
    je.txn.durability=NO_SYNC,NO_SYNC,SIMPLE_MAJORITY
    je.sharedCache=true
    je.env.fairLatches=false
    je.log.fileMax=62914560
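
For reference, the change discussed in this thread (lazy migration off, more cleaner threads) might be applied to a configuration like the one above roughly as follows. This is only a sketch using plain java.util.Properties: the class name CleanerOverrides, the method withCleanerOverrides, and the thread count of 3 are illustrative assumptions, not values recommended anywhere in this thread. Only the two parameter names come from the listing above.

```java
import java.util.Properties;
import java.util.TreeSet;

public class CleanerOverrides {

    // Returns a copy of the base settings with lazy migration turned off
    // and the cleaner thread count raised. The base object is not modified.
    static Properties withCleanerOverrides(Properties base, int cleanerThreads) {
        if (cleanerThreads < 1) {
            throw new IllegalArgumentException("cleanerThreads must be >= 1");
        }
        Properties tuned = new Properties();
        tuned.putAll(base);
        tuned.setProperty("je.cleaner.lazyMigration", "false");
        tuned.setProperty("je.cleaner.threads", Integer.toString(cleanerThreads));
        return tuned;
    }

    public static void main(String[] args) {
        // A small subset of the settings listed above, for illustration.
        Properties base = new Properties();
        base.setProperty("je.cleaner.threads", "1");
        base.setProperty("je.cleaner.lazyMigration", "true");
        base.setProperty("je.cleaner.minUtilization", "50");

        // 3 is an arbitrary placeholder; the right number has to be found by testing.
        Properties tuned = withCleanerOverrides(base, 3);

        // Print the keys in sorted order so the result is easy to compare.
        for (String key : new TreeSet<>(tuned.stringPropertyNames())) {
            System.out.println(key + "=" + tuned.getProperty(key));
        }
    }
}
```

The resulting key/value pairs could then be supplied to JE the usual way, for example through a je.properties file in the environment home directory. Whether a given thread count actually clears the backlog would still need to be confirmed against the cleaner stats.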

    Thanks
    Vinoth
  • 3. Re: CleanerBacklog with lazy_migration=false
    vinothchandar Newbie
    Hi Mark,

    I am testing BDB5 cleaning now and I see a very good improvement in the number of cleaner entries read.
    For one of our stores, I see it dropping 20x.

    Thanks
    Vinoth
  • 4. Re: CleanerBacklog with lazy_migration=false
    greybird Expert
    Awesome! Thanks for letting us know.
    --mark
