When will Berkeley DB rebuild the B-tree index after the application restarts?

J4rvin Member Posts: 3
edited Jul 6, 2018 8:28AM in Berkeley DB Java Edition

Hi, I'm using Berkeley DB JE HA as the storage engine in our application. About 200,000,000 records are stored in the DB.

Our CacheMode is EVICT_LN. Without the cache, our application would incur a lot of random disk I/O, which would degrade performance significantly.

When we update our service, we need to restart the JVM. When does Berkeley DB rebuild the in-memory cache after a restart? Thank you.



  • Aditya Tripathi-Oracle
    Aditya Tripathi-Oracle Member Posts: 4
    edited Jul 5, 2018 8:37AM

    In the context of your question, the cache (B+ tree) is built almost at the very start of the process restart. Until the cache is built, the JE Environment is not ready for any operation, so you can say that the cache is built as part of initializing a JE HA environment. The B+ tree is built from the .jdb files in the environment root.

    Also, to estimate the size of this cache (which lives in the JVM heap), there is a utility called DbCacheSize that estimates how much memory the cache requires. Given that the EVICT_LN cache mode is used, it is recommended that at least all the internal nodes of the B+ tree fit in the cache. The JVM heap size can then be set to account for the other heap requirements as well: the application's own heap usage, internal JVM usage such as GC overhead, and so on.

  • J4rvin
    J4rvin Member Posts: 3
    edited Jul 6, 2018 8:28AM

    Thank you for your reply!

    Running the DbCacheSize utility gives a cache size of about 6 to 7 GB for 200,000,000 records. However, when I restarted the JE application, the Java heap occupied only 1 GB of memory. Only after I traversed all the records with a simple cursor (not a DiskOrderedCursor) did the heap gradually reach the estimated size, which looks like a cache warm-up process. In je.stat.csv, only several hundred BINs were cached at the beginning; when the traversal ended, the number of cached BINs had reached about 2,800,000, which corresponds to the record count of 200,000,000 and a nodeMaxEntries of 128.

    Maybe the B+ tree isn't built in memory until the records are actually accessed.
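
The BIN count quoted in the last reply can be sanity-checked with a little arithmetic. The sketch below is not JE code and calls no JE API. It uses only the three figures from the post (200,000,000 records, 128 entries per node, about 2,800,000 cached BINs), and the class and method names are made up for illustration. It assumes one BIN slot per record and ignores upper internal nodes.

```java
// Hypothetical helper, not part of Berkeley DB JE: plain arithmetic on the
// figures reported in the thread.
public class BinFillEstimate {

    /** Fewest bottom internal nodes (BINs) needed if every BIN were completely full. */
    public static long minBins(long records, int nodeMaxEntries) {
        return (records + nodeMaxEntries - 1) / nodeMaxEntries; // ceiling division
    }

    /** Average number of entries per BIN implied by an observed BIN count. */
    public static double avgEntriesPerBin(long records, long observedBins) {
        return (double) records / observedBins;
    }

    /** Fraction of each BIN's capacity that is in use on average. */
    public static double fillFactor(long records, long observedBins, int nodeMaxEntries) {
        return avgEntriesPerBin(records, observedBins) / nodeMaxEntries;
    }

    public static void main(String[] args) {
        long records = 200_000_000L;     // record count from the post
        int nodeMaxEntries = 128;        // node capacity from the post
        long observedBins = 2_800_000L;  // approximate cached BINs after the full scan

        System.out.printf("BINs if completely full: %d%n", minBins(records, nodeMaxEntries));
        System.out.printf("Average entries per BIN: %.1f%n", avgEntriesPerBin(records, observedBins));
        System.out.printf("Implied fill factor:     %.2f%n",
                fillFactor(records, observedBins, nodeMaxEntries));
    }
}
```

With these inputs, completely full BINs would number 1,562,500, so the observed 2,800,000 implies roughly 71 entries per BIN, or a little over half of each node's capacity. That is a plausible figure for a B+ tree, and it supports the reading that the full set of BINs was resident only after the traversal.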
