This discussion is archived
30 Replies. Latest reply: Aug 23, 2012 3:15 AM by userBDBDMS

Problems with increasing/decreasing cache size when live

897965 Newbie
Hello,

I have configured multiple environments that I compact sequentially. To do this, I allocate a bigger cache to the env currently being compacted, as follows:

Initialization:

DB_ENV->set_cachesize(gbytes, bytes, 1); // Initial cache size.
DB_ENV->set_cache_max(gbytes, bytes); // Maximum size.

While live, the application decreases the cache of the current env when it has finished, and then increases the cache of the next env, using:

DB_ENV->set_cachesize(gbytes, obytes, 0); // Decrease cache size of current env to initial size
DB_ENV->set_cachesize(gbytes, obytes, 0); // Increase cache size of next env to max size.
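A minimal sketch of the sizing rule described above, assuming each environment is either at its initial size or at its maximum. The helper names `target_cache_bytes` and `total_cache_budget` are hypothetical and not part of Berkeley DB; the byte counts they return would still have to be split into the gbytes/bytes pair that DB_ENV->set_cachesize expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Berkeley DB call): the cache size, in bytes,
 * that environment `i` should have while environment `current` is being
 * compacted. Only the current environment gets the large cache. */
uint64_t target_cache_bytes(size_t i, size_t current,
                            uint64_t initial_bytes, uint64_t max_bytes)
{
    assert(initial_bytes <= max_bytes);
    return (i == current) ? max_bytes : initial_bytes;
}

/* Cache memory needed across `n_envs` environments under this scheme,
 * assuming the old environment is shrunk before the next one is grown:
 * one environment at max_bytes, all the others at initial_bytes. */
uint64_t total_cache_budget(size_t n_envs,
                            uint64_t initial_bytes, uint64_t max_bytes)
{
    assert(n_envs > 0 && initial_bytes <= max_bytes);
    return (uint64_t)(n_envs - 1) * initial_bytes + max_bytes;
}
```

The budget function only makes the intent explicit: if shrinking an environment does not actually release memory, the real footprint drifts above this figure, which is the symptom reported here.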

When I print memory pool statistics using DB_ENV->memp_stat, I can see that everything looks normal:

memp_stat: env1 ncache= 8 cache_size=20973592 // env1 is current env
memp_stat: env2 ncache= 1 cache_size=20973592

and then after changing current env:

memp_stat: env1 ncache= 1 cache_size=20973592
memp_stat: env2 ncache= 8 cache_size=20973592 // env2 is now current env

But the problem is that memory leaks over time (as if the extra memory of each env were never freed), and I'm totally sure that the problem comes from this code.
I'm running Berkeley DB 4.7.25 on FreeBSD.

Maybe a leak was fixed in a newer version and you could point me to a patch? Or am I using the API incorrectly?
Thanks!

Edited by: 894962 on Jan 23, 2012 6:40 AM
  • 1. Re: Problems with increasing/decreasing cache size when live
    897965 Newbie
    Hi;
    I'm also wondering whether any leaks have been fixed in the DB->compact code.
    I tried to diff the code against a newer version, but the changes are too big...
    Thanks!
  • 2. Re: Problems with increasing/decreasing cache size when live
    897965 Newbie
    Hi again;
    I upgraded Berkeley DB to 5.0, upgraded the logs and restarted my application.
    Now I get a SIGSEGV very shortly after my compaction process switches to the next env (and thus decreases/increases the cache sizes as described in the first post). I can reproduce this behavior every time...

    Here is my trace:

    Program terminated with signal 11, Segmentation fault.
    #0 __memp_fput (dbmfp=0x11a62b780, ip=0x0, pgaddr=0x214155058,
    priority=DB_PRIORITY_UNCHANGED)
    at /place/home/bdb5.0/mp/mp_fput.c:123
    in /place/home/bdb5.0/mp/mp_fput.c
    #0 __memp_fput (dbmfp=0x11a62b780, ip=0x0, pgaddr=0x214155058,
    priority=DB_PRIORITY_UNCHANGED)
    at /place/home/bdb5.0/mp/mp_fput.c:123
    #1 0x00000000007d11cd in __bam_search (dbc=0x20ab0e400,
    root_pgno=<value optimized out>, key=<value optimized out>,
    flags=<value optimized out>, slevel=<value optimized out>,
    recnop=<value optimized out>, exactp=0x7ffffeff88d4)
    at /place/home/bdb5.0/btree/bt_search.c:796
    #2 0x00000000007be326 in __bamc_search (dbc=0x20ab0e400,
    root_pgno=<value optimized out>, key=<value optimized out>, flags=14,
    exactp=<value optimized out>)
    at /place/home/bdb5.0/btree/bt_cursor.c:2787
    #3 0x00000000007beeb0 in __bamc_put (dbc=0x20ab0e400,
    key=<value optimized out>, data=<value optimized out>, flags=14,
    pgnop=<value optimized out>)
    at /place/home/bdb5.0/btree/bt_cursor.c:2132
    #4 0x000000000075cfab in __dbc_iput (dbc=0xde86b800, key=0x7ffffeff8dd0,
    data=0x7ffffeff8da0, flags=14)
    at /place/home/bdb5.0/db/db_cam.c:2115
    #5 0x000000000075f5ad in __dbc_put (dbc=0x20ab0e400,
    key=<value optimized out>, data=<value optimized out>,
    flags=<value optimized out>)
    at /place/home/bdb5.0/db/db_cam.c:2028
    #6 0x0000000000759d3e in __db_put (dbp=0x12f7b0800, ip=<value optimized out>,
    txn=<value optimized out>, key=0x7ffffeff8dd0, data=0x7ffffeff8da0,
    flags=65536)
    at /place/home/bdb5.0/db/db_am.c:498
    #7 0x0000000000763a8b in __db_put_pp (dbp=0x12f7b0800, txn=0x224b4f070,
    key=0x7ffffeff8dd0, data=0x7ffffeff8da0, flags=0)
    at /place/home/bdb5.0/db/db_iface.c:1597
    #8 0x000000000074118b in Db::put (this=0x1174b46b0, txnid=0x0,
    key=0x7ffffeff8dd0, value=0x7ffffeff8da0, flags=336941144)
    at /place/home/bdb5.0/cxx/cxx_db.cpp:347

    Maybe it helps to understand what's wrong in my code?
    Thanks.
  • 3. Re: Problems with increasing/decreasing cache size when live
    897965 Newbie
    Hi again;
    I upgraded to Berkeley DB 5.3 and now I get a SEGV while trying to decrease the cache size.

    Program terminated with signal 11, Segmentation fault.
    #0 0x00000000007ac38e in __memp_remove_region (dbmp=<optimized out>) at /place/home/bdb5.3/src/mp/mp_resize.c:453
    453 hp = R_ADDR(infop, ((MPOOL*)infop->primary)->htab);
    (gdb) bt
    #0 0x00000000007ac38e in __memp_remove_region (dbmp=<optimized out>) at /place/home/bdb5.3/src/mp/mp_resize.c:453
    #1 __memp_resize (dbmp=0x33526a00, gbytes=<optimized out>, bytes=<optimized out>) at /place/home/bdb5.3/src/mp/mp_resize.c:547
    #2 0x00000000007a839f in __memp_set_cachesize (dbenv=<optimized out>, gbytes=0, bytes=104512, arg_ncache=48)
    at /place/home/bdb5.3/src/mp/mp_method.c:176

    Can anyone here help?
    This is the latest version...
    Thanks.
  • 4. Re: Problems with increasing/decreasing cache size when live
    526060 Explorer
    Hi,

    Sorry you are having problems. Upgrading to the latest version is definitely the first step I'd recommend.

    One of the most common issues encountered when upgrading to a newer version of Berkeley DB is that an old version of the db.h header file is used when building the application. That can lead to unexpected segfaults due to incorrect flag values and function pointers. Could you please confirm that you are building with the correct db.h version?
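One way to rule out a header mismatch is to compare the version compiled in from db.h with the version reported by the linked library at startup. The sketch below keeps the comparison in a plain function so it stands alone; `bdb_versions_match` is a hypothetical helper name, while `db_version()` and the `DB_VERSION_MAJOR`, `DB_VERSION_MINOR` and `DB_VERSION_PATCH` macros mentioned in the comment come from db.h.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 when the header version and the library
 * version are identical, and 0 (with a message on stderr) otherwise. */
int bdb_versions_match(int hdr_major, int hdr_minor, int hdr_patch,
                       int lib_major, int lib_minor, int lib_patch)
{
    assert(hdr_major >= 0 && lib_major >= 0);
    if (hdr_major != lib_major || hdr_minor != lib_minor ||
        hdr_patch != lib_patch) {
        fprintf(stderr,
                "db.h is %d.%d.%d but the linked library is %d.%d.%d\n",
                hdr_major, hdr_minor, hdr_patch,
                lib_major, lib_minor, lib_patch);
        return 0;
    }
    return 1;
}

/* In an application that includes db.h, the call would look like this:
 *
 *     int major, minor, patch;
 *     (void)db_version(&major, &minor, &patch);
 *     if (!bdb_versions_match(DB_VERSION_MAJOR, DB_VERSION_MINOR,
 *                             DB_VERSION_PATCH, major, minor, patch))
 *         return 1;
 */
```

Requiring the patch level to match as well is a deliberately strict choice for debugging; an application might reasonably compare only the major and minor numbers.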

    If you still see the SEGV, could you please post a full stack trace, as well as a more detailed description of how to reproduce the crash (or, ideally, some source code that can be used to reproduce the issue)?

    Regards,
    Alex Gorrod
    Oracle Berkeley DB
  • 5. Re: Problems with increasing/decreasing cache size when live
    897965 Newbie
    Hi,

    I upgraded to the latest version (5.3) and I'm building with the right headers. My application actually runs normally if I disable this code, which leaked before (4.7) and now crashes.
    I still get the SEGV...

    My setup is quite simple: I have a few envs, and each of them has a few databases. I want to compact the biggest database of each env, but I don't have enough RAM to allocate enough cache for DB->compact to run smoothly. So while my application is live, the code dynamically increases the cache size of the env currently being compacted.

    The problem is that although I can increase the cache size (up to cache_max), I cannot decrease it afterwards (see the previous stack trace).
    Here are my questions:
    1. Is it true that, as the documentation states, I can dynamically resize the cache (including decreasing it)?
    2. Should I perform any operation on the env prior to resizing, such as flushing or locking?

    Here are my env (open) flags: DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_MPOOL | DB_INIT_TXN | DB_RECOVER | DB_PRIVATE | DB_THREAD;
    Here are my env flags: 0 | DB_AUTO_COMMIT | DB_TXN_NOSYNC;

    One strange thing that I observed is the following:

    // Before open
    DBEnvironment->set_cache_max(1*1024*1024*1024, 0);
    DBEnvironment->get_cache_max(&gbytes, &obytes); // gives gbytes=1 and obytes=0

    // After open
    DBEnvironment->get_cache_max(&gbytes, &obytes); // gives gbytes=0 and obytes=8355840

    But the weirdest part is that if I call set_cachesize with the value that get_cache_max gives me after opening (i.e. 8355840), then the cache is actually increased to 1GB (1376649216), as computed from the memp_stat results: sp->st_ncache * (sp->st_gbytes * GIGA + sp->st_bytes).
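Because set_cachesize, set_cache_max and their getters all express sizes as a (gbytes, bytes) pair, it can help to convert to and from a single 64-bit byte count when comparing the numbers in this thread. A minimal sketch, assuming GIGA in the formula above means 2^30; the helper names are hypothetical and not part of Berkeley DB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one "gigabyte" in the (gbytes, bytes) pair is 2^30 bytes. */
#define GIGA (UINT64_C(1) << 30)

/* Hypothetical helper: combine a (gbytes, bytes) pair into one byte count. */
uint64_t cache_total_bytes(uint32_t gbytes, uint32_t bytes)
{
    return (uint64_t)gbytes * GIGA + bytes;
}

/* Hypothetical helper: split a byte count into the (gbytes, bytes) pair
 * expected by calls such as set_cachesize and set_cache_max. */
void cache_split_bytes(uint64_t total, uint32_t *gbytes, uint32_t *bytes)
{
    assert(gbytes != NULL && bytes != NULL);
    assert(total / GIGA <= UINT32_MAX);   /* gbytes must fit in 32 bits */
    *gbytes = (uint32_t)(total / GIGA);
    *bytes = (uint32_t)(total % GIGA);
}
```

With this split, a 1GB maximum corresponds to the pair (1, 0): the first argument counts whole gigabytes, not bytes.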

    Looks like some kind of bug?

    Here is the full stack trace:
    #0 0x00000000007ac3be in __memp_resize (dbmp=0x33526a00, gbytes=Variable "gbytes" is not available.
    ) at /place/home/bdb5.3/src/mp/mp_resize.c:453
    453 hp = R_ADDR(infop, ((MPOOL*)infop->primary)->htab);
    (gdb) bt
    #0 0x00000000007ac3be in __memp_resize (dbmp=0x33526a00, gbytes=Variable "gbytes" is not available.
    ) at /place/home/bdb5.3/src/mp/mp_resize.c:453
    #1 0x00000000007a83cf in __memp_set_cachesize (dbenv=Variable "dbenv" is not available.
    ) at /place/home/bdb5.3/src/mp/mp_method.c:176
    #2 0x000000000074ba24 in DbEnv::set_cachesize (this=0x4e2d67a0, gbytes=0, bytes=4177920, ncache=0) at /place/home/bdb5.3/lang/cxx/cxx_env.cpp:914

    Here I tried to decrease the cache size to cache_max (=8355840) / 2 after increasing it (the increase went okay).
    Thanks!

    Edited by: 894962 on Jan 27, 2012 1:33 AM
  • 6. Re: Problems with increasing/decreasing cache size when live
    526060 Explorer
    Hi,

    Thanks for the additional information. I will investigate further and provide more information.

    One question: Can you successfully decrease the cache size if you have not used all of the cache?

    Regards,
    Alex Gorrod
    Oracle Berkeley DB
  • 7. Re: Problems with increasing/decreasing cache size when live
    897965 Newbie
    Hi,

    I believe the answer to your question is NO, as even with a big cache (2GB) and with no compaction, I reproduced the problem.
    Thanks.
  • 8. Re: Problems with increasing/decreasing cache size when live
    897965 Newbie
    Hi,

    Interestingly I managed to reproduce the crash even with a single thread. Looks like decreasing the cache size does not work at all after opening the environment...
  • 9. Re: Problems with increasing/decreasing cache size when live
    897965 Newbie
    Hi,

    Could you reproduce the problem?
    Thanks!
  • 10. Re: Problems with increasing/decreasing cache size when live
    526060 Explorer
    Hi,

    Yes - we can see a problem with the cache resizing. We are working to understand the issue and will report back to you when we have further information.

    Regards,
    Alex Gorrod
    Oracle Berkeley DB
  • 11. Re: Problems with increasing/decreasing cache size when live
    897965 Newbie
    Hi,
    Any news about this problem?
    Thanks!
  • 12. Re: Problems with increasing/decreasing cache size when live
    Oracle,CindyZeng Newbie
    Hi,

    Thanks for providing the information.

    I am investigating this issue. Could you give me more details about the case you described?
    > // Before open
    > DBEnvironment->set_cache_max(1*1024*1024*1024, 0);
    > DBEnvironment->get_cache_max(&gbytes, &obytes); // gives gbytes=1 and obytes=0
    [Q] What do you set in set_cachesize() before open, including cache size and the number of caches?
    > // After open
    > DBEnvironment->get_cache_max(&gbytes, &obytes); // gives gbytes=0 and obytes=8355840
    >
    > But the weirdest is that if I set_cachesize to the last value get_cache_max gives me after opening (i.e. 8355840), then cache is actually increased to 1GB (1376649216) as printed by memp_stat function: sp->st_ncache * (sp->st_gbytes * GIGA + sp->st_bytes).
    [Q] After open, what is the number of caches along with the cache size (i.e. 8355840) in resizing the cache, before you get the cache size (1376649216) in memp_stat?

    And for the case listed in the beginning of the post
    > While live, application decreases cache of current env when finished and then increases cache of next env using:
    > DB_ENV->set_cachesize(gbytes, obytes, 0); // Decrease cache size of current env to initial size
    > DB_ENV->set_cachesize(gbytes, obytes, 0); // Increase cache size of next env to max size.
    > When I print statistics about the memory pool using DB_ENV->memp_stat I can see that everything is going normally:
    > memp_stat: env1 ncache= 8 cache_size=20973592 // env1 is current env
    > memp_stat: env2 ncache= 1 cache_size=20973592
    > and then after changing current env:
    > memp_stat: env1 ncache= 1 cache_size=20973592
    > memp_stat: env2 ncache= 8 cache_size=20973592 // env2 is now current env
    When env1 is finishing soon, what numbers do you set in set_cachesize to decrease the cache, including the number of caches and cache size?

    Thanks!
  • 13. Re: Problems with increasing/decreasing cache size when live
    897965 Newbie
    Hi,
    Thanks for your answer.
    Unfortunately, I don't remember the exact test case I was doing, so I did a new one with 32 envs.
    I set the following for each env:
    - Initial cache=512MB/32
    - Max=1GB
    Oracle, Cindy Zeng wrote:
    > [Q] What do you set in set_cachesize() before open, including cache size and the number of caches?
    Before open, I do:

    DBEnvironment->set_cachesize((u_int32_t)0, (u_int32_t)512*1024*1024/32, 1);

    DBEnvironment->set_cache_max(1*1024*1024*1024, 0);
    DBEnvironment->get_cache_max(&gbytes, &obytes); // gives gbytes=1 and obytes=0

    > [Q] After open, what is the number of caches along with the cache size (i.e. 8355840) in resizing the cache, before you get the cache size (1376649216) in memp_stat?
    After open, I have the following:

    DBEnvironment->get_cache_max(&gbytes, &obytes); // gives gbytes=0 and obytes=9502720
    memp_stat: cache_size=18644992 cache_ncache=1

    So here, the values returned by memp_stat are normal but get_cache_max is strange. Then after increasing the cache to the strange value returned by get_cache_max (gbytes=0, obytes=9502720), I have the following:

    DBEnvironment->get_cache_max(&gbytes, &obytes); // gives gbytes=0 and obytes=9502720
    memp_stat: outlinks cache_size=27328512 cache_ncache=54

    with cache_size being: ((ui64)sp->st_gbytes * GIGA + sp->st_bytes).

    So the cache is actually increased...
    > And for the case listed in the beginning of the post
    > While live, application decreases cache of current env when finished and then increases cache of next env using:
    > DB_ENV->set_cachesize(gbytes, obytes, 0); // Decrease cache size of current env to initial size
    > DB_ENV->set_cachesize(gbytes, obytes, 0); // Increase cache size of next env to max size.
    > When I print statistics about the memory pool using DB_ENV->memp_stat I can see that everything is going normally:
    > memp_stat: env1 ncache= 8 cache_size=20973592 // env1 is current env
    > memp_stat: env2 ncache= 1 cache_size=20973592
    > and then after changing current env:
    > memp_stat: env1 ncache= 1 cache_size=20973592
    > memp_stat: env2 ncache= 8 cache_size=20973592 // env2 is now current env
    > When env1 is finishing soon, what numbers do you set in set_cachesize to decrease the cache, including the number of caches and cache size?
    When decreasing the cache, I do:

    env->GetDbEnv()->set_cachesize((u_int32_t)0, (u_int32_t)20973592, 0);

    I mean, in all cases I simply set the cache size to its original value (obtained after open through get_cachesize) when decreasing, and to its max value when increasing (obtained through get_cache_max; plus I do something like cacheMaxSize * 0.75 if < 500MB).

    Hope that helps.
    We can continue by email if it's more convenient.
    Thanks!
  • 14. Re: Problems with increasing/decreasing cache size when live
    Oracle,CindyZeng Newbie
    Hi,

    Thanks for providing the information.
    > Unfortunately, I don't remember the exact test case I was doing, so I did a new one with 32 envs.
    > I set the following for each env:
    > - Initial cache=512MB/32
    > - Max=1GB
    >
    > Before open, I do:
    > DBEnvironment->set_cachesize((u_int32_t)0, (u_int32_t)512*1024*1024/32, 1);
    > DBEnvironment->set_cache_max(1*1024*1024*1024, 0);
    > DBEnvironment->get_cache_max(&gbytes, &obytes); // gives gbytes=1 and obytes=0
    >
    > After open, I have the following:
    > DBEnvironment->get_cache_max(&gbytes, &obytes); // gives gbytes=0 and obytes=9502720
    > memp_stat: cache_size=18644992 cache_ncache=1
    >
    > So here, the values returned by memp_stat are normal but get_cache_max is strange. Then after increasing the cache to the strange value returned by get_cache_max (gbytes=0, obytes=9502720), I have the following:
    > DBEnvironment->get_cache_max(&gbytes, &obytes); // gives gbytes=0 and obytes=9502720
    > memp_stat: outlinks cache_size=27328512 cache_ncache=54
    >
    > with cache_size being: ((ui64)sp->st_gbytes * GIGA + sp->st_bytes).
    > So the cache is actually increased...

    I tried to reproduce this case by opening 1 env as follows.

    // Before open
    DbEnv->set_cachesize(): 512MB, 1 cache
    DbEnv->set_cache_max(): 1GB

    // After open
    DbEnv->get_cachesize(): 512MB, 1 cache
    DbEnv->get_cache_max(): 1GB
    memp_stat: cache: 512MB, ncache: 1, cache_max: 1GB

    // Decrease the cache size
    DbEnv->set_cachesize(): 9MB (9502720 B), 1 cache
    DbEnv->get_cachesize(): 512MB, 1 cache
    DbEnv->get_cache_max(): 1GB
    memp_stat: cache: 512MB, ncache: 1, cache_max: 1GB

    All of these results are expected. When the cache is resized after the DbEnv is open, the new size is rounded to the nearest multiple of the region size, where the region size is the size of each region as specified initially. Please refer to the BDB documentation: http://docs.oracle.com/cd/E17076_02/html/api_reference/C/envset_cachesize.html. Here the region size is 512MB / 1 cache = 512MB, and I don't think you can resize the cache to less than 1 region.
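The rounding described in this reply can be illustrated with a few lines of arithmetic. This is only a sketch of the rule as stated here, not Berkeley DB's actual implementation: `round_to_regions` is a hypothetical name, and the floor of one region reflects the reply's "I don't think you can resize the cache smaller than 1 region".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration (not Berkeley DB source): round a requested
 * cache size to the nearest multiple of the region size, never going
 * below one region. Does not guard against overflow for requests close
 * to UINT64_MAX. */
uint64_t round_to_regions(uint64_t requested, uint64_t region_size)
{
    uint64_t regions;

    assert(region_size > 0);
    regions = (requested + region_size / 2) / region_size; /* nearest multiple */
    if (regions == 0)
        regions = 1;    /* assumption from the reply: at least one region */
    return regions * region_size;
}
```

With a 512MB region (536870912 bytes), a request for 9502720 bytes comes back as 536870912, which matches the unchanged 512MB reported by get_cachesize in the test above.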

    Since you are opening 32 envs at the same time, each with a 512MB cache and a 1GB maximum, whether each env can actually allocate the cache size it was configured with when it is opened depends on the system. I am guessing that the number 9502720 returned by get_cache_max after opening the env is the cache size the env could actually obtain, based on the system and the application's request.
    >> And for the case listed in the beginning of the post
    >> While live, application decreases cache of current env when finished and then increases cache of next env using:
    >> DB_ENV->set_cachesize(gbytes, obytes, 0); // Decrease cache size of current env to initial size
    >> DB_ENV->set_cachesize(gbytes, obytes, 0); // Increase cache size of next env to max size.
    >> When I print statistics about the memory pool using DB_ENV->memp_stat I can see that everything is going normally:
    >> memp_stat: env1 ncache= 8 cache_size=20973592 // env1 is current env
    >> memp_stat: env2 ncache= 1 cache_size=20973592
    >> and then after changing current env:
    >> memp_stat: env1 ncache= 1 cache_size=20973592
    >> memp_stat: env2 ncache= 8 cache_size=20973592 // env2 is now current env
    >> When env1 is finishing soon, what numbers do you set in set_cachesize to decrease the cache, including the number of caches and cache size?
    > When decreasing the cache, I do:
    >
    > env->GetDbEnv()->set_cachesize((u_int32_t)0, (u_int32_t)20973592, 0);
    >
    > I mean, in all cases I simply set the cache size to its original value (obtained after open through get_cachesize) when decreasing, and to its max value when increasing (obtained through get_cache_max; plus I do something like cacheMaxSize * 0.75 if < 500MB).

    I can reproduce this case, and I think the result is expected. When DbEnv->set_cachesize() is used to resize the cache after the env is opened, the ncache parameter is ignored. Please refer to the BDB documentation: http://docs.oracle.com/cd/E17076_02/html/api_reference/C/envset_cachesize.html. Hence I don't think you can decrease the cache size by setting the number of caches to 0.

    Hope it helps.