20 Replies. Latest reply: Sep 11, 2012 1:59 PM by greybird

    Maximum DB Size?

      Mark, hello;

      can you please provide some kind of formula to estimate BDB JE limits?

      maximum database size?
      maximum number of records?
      maximum record size?
      maximum key size?

      thank you!

        • 1. Re: Maximum DB Size?
          Linda Lee

          There are no limits on key size, value size, number of records or database size. However, these sizes, along with your access pattern, do affect what resources your application will need.

          - Key and data size affect how much of the database can fit into cache. We have seen applications with a full range of key and record sizes, from a few bytes to thousands of bytes to megabytes. The larger the key, the smaller the portion of the internal btree that can fit into cache.

          - How much of the database is "hot" matters more than the size of the database itself. We have seen applications that use databases that range from quite small to terabytes.

          You may want to look at com.sleepycat.je.util.DbCacheSize to get some guidelines on cache sizing, and to do some estimates of your own. Please also read the getting started guide, in particular this section: http://www.oracle.com/technology/documentation/berkeley-db/je/GettingStartedGuide/logfilesrevealed.html


          • 2. Re: Maximum DB Size?
            Presumably there is some internal id for a database that isn't its name, and that id has a maximum (or is a variable length encoding?). While I will only have a few thousand databases at any one time, I'll be regularly creating and removing them. May I assume that over the lifetime of an environment I should be able to create and destroy more than 2^32 databases?
            • 3. Re: Maximum DB Size?
              Charles Lamb
              Actually, there is a limit of 2^31 non-replicated databases and 2^31 replicated databases. You can work around that by doing a dump and reload to a fresh environment.

              Charles Lamb
              • 4. Re: Maximum DB Size?
                Just to be clear, that 2^31 limit is over the lifetime of an environment, not a limit for the number that exist at any one time?

                Knowing this, I can still leverage the benefits of truncateDatabase by recycling databases after emptying them. Good to know.
                • 5. Re: Maximum DB Size?
                  Charles Lamb
                  Yes, that is correct. See com.sleepycat.je.dbi.DbTree.lastAllocated*DbId.

                  Charles Lamb
                  • 6. Re: Maximum DB Size?
                    Charles Lamb
                    Oops. Scratch that. truncate also allocates a new DbId.
                    • 7. Re: Maximum DB Size?
                      Assuming it could be optimized, a ranged delete operation would be a great addition. I want to delete 100 to 10k consecutive records at a time, on a continual basis. For now I'll proceed with a single database and individual record deletes. Thanks.
                      • 8. Re: Maximum DB Size?

                        I'm trying to read between the lines and have concluded that you're using key ranges instead of databases, because you can't create enough databases in total over the lifetime of your app. Correct?

                        Would support for 2^63 databases solve your problem? Not promising anything, just curious.

                        An optimized range deletion is a nice thing, and we should probably do it in the future. But because of JE's architecture I don't think it will ever be nearly as fast as a Database removal or truncation, which is already optimized.

                        • 9. Re: Maximum DB Size?
                          Also, what is the average size -- number of records, key/data sizes -- of each data set (what you'd like to store in each Database)? If the average size of a database is extremely small, the per-Database overhead may be a big factor.

                          • 10. Re: Maximum DB Size?
                            Yes, 2^63 databases would work.

                            Given a 3-year hardware lifecycle and a not unreasonable expected write rate (and therefore expected database turnover), a limit of 2^31 would make using truncateDatabase too risky: we would come within a 2x safety factor of the limit. Alternatively, if the max database id were exposed in stats or something, we could know when it was getting close and initiate an automated re-provisioning process (where the host is wiped and data re-replicated back in).
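For a rough sense of scale, the arithmetic behind that concern can be sketched as follows. The 2^31 limit, the 3-year lifetime and the 2x safety factor come from the thread; the class and method names are hypothetical, and the monitoring check simply assumes the last allocated database id is available as a plain number from somewhere.

```java
final class DbIdBudget {
    // Limit stated earlier in the thread: 2^31 database ids over the
    // lifetime of an environment (counted separately for replicated
    // and non-replicated databases).
    static final long ID_LIMIT = 1L << 31;

    /** Sustained id allocations per second that exhaust the limit in the given number of years. */
    static double exhaustingRatePerSecond(double years) {
        double seconds = years * 365 * 24 * 3600;
        return ID_LIMIT / seconds;
    }

    /**
     * Hypothetical monitoring check: true once the consumed fraction of the
     * id space reaches the threshold (0.5 corresponds to a 2x safety factor).
     */
    static boolean shouldReprovision(long lastAllocatedId, double threshold) {
        return lastAllocatedId >= (long) (ID_LIMIT * threshold);
    }

    public static void main(String[] args) {
        double rate = exhaustingRatePerSecond(3);
        System.out.println("ids/sec that exhaust 2^31 in 3 years: " + rate);
        System.out.println("ids/sec allowed with a 2x safety factor: " + rate / 2);
        System.out.println("re-provision at 1.2e9 ids used? " + shouldReprovision(1_200_000_000L, 0.5));
    }
}
```

The exhausting rate works out to roughly 22.7 allocations per second, so every create, remove-and-recreate, or truncate counts against a budget of about 11 per second once the 2x margin is applied.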

                            For the purposes of discussion, our records are keyed by "writer id" + "per writer sequence number". After accumulating so much data per writer, it gets moved elsewhere (out of BDB) and deleted, while additional writes happen at the tail. Deleting that as efficiently as possible is preferred. Given the current APIs, that translates to using, optimally, 2 databases per "writer id" (the one we just overflowed and will soon delete, and the one we are now writing to).
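The single-database alternative (one key space, delete by range) depends on each writer's records being contiguous in key order. Below is a minimal sketch of such a key layout. It assumes a 24-byte writer id followed by an 8-byte big-endian sequence number (an invented split of the 32-byte key, not something stated in the thread) and keys ordered by unsigned byte-by-byte comparison. A `TreeMap` stands in for the store here; it is not JE API, and in JE the same range would be walked with a cursor and deleted record by record. `Arrays.compareUnsigned` requires Java 9 or later.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

final class WriterKeys {
    // Hypothetical layout: 24-byte writer id + 8-byte sequence number = 32 bytes.
    static final int WRITER_ID_BYTES = 24;

    /** Builds a key whose unsigned byte order matches (writer id, sequence) order. */
    static byte[] key(byte[] writerId, long seq) {
        if (writerId.length != WRITER_ID_BYTES || seq < 0) {
            throw new IllegalArgumentException("need a 24-byte writer id and a non-negative sequence");
        }
        // ByteBuffer is big-endian by default, so larger sequences sort later.
        return ByteBuffer.allocate(WRITER_ID_BYTES + Long.BYTES).put(writerId).putLong(seq).array();
    }

    /** In-memory stand-in for a store sorted by unsigned byte order. */
    static NavigableMap<byte[], byte[]> newStore() {
        Comparator<byte[]> unsignedOrder = Arrays::compareUnsigned;
        return new TreeMap<>(unsignedOrder);
    }

    /** Deletes sequences [fromSeq, toSeq) for one writer and returns how many were removed. */
    static int deleteRange(NavigableMap<byte[], byte[]> store, byte[] writerId, long fromSeq, long toSeq) {
        NavigableMap<byte[], byte[]> range =
            store.subMap(key(writerId, fromSeq), true, key(writerId, toSeq), false);
        int removed = range.size();
        range.clear(); // clearing the view removes the entries from the backing map
        return removed;
    }

    public static void main(String[] args) {
        NavigableMap<byte[], byte[]> store = newStore();
        byte[] writerA = new byte[WRITER_ID_BYTES];
        writerA[0] = 1;
        for (long seq = 0; seq < 10; seq++) {
            store.put(key(writerA, seq), new byte[0]);
        }
        System.out.println("removed: " + deleteRange(store, writerA, 0, 5));
        System.out.println("remaining: " + store.size());
    }
}
```

Sequence numbers are restricted to non-negative values because a negative `long` in big-endian form would sort after all positive ones under unsigned comparison.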

                            However, another concern was the FAQ entry about checkpoint overhead (more than an order of magnitude worse) when using multiple databases. Our use case would have multiple writes to each db, so wouldn't be as pathological. I was going to write some test code for that scenario to see what it looks like.
                            • 11. Re: Maximum DB Size?
                              Key size: 32 bytes
                              Avg data size: ~150 bytes
                              Avg records per database: ~5000
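These figures allow a quick estimate of how database turnover relates to write rate. The sketch below uses the sizes given above; the 50,000 records per second write rate is purely a hypothetical input, and it assumes one database id is consumed per ~5000 records written (one database filled, then removed or truncated). Per-record and per-database storage overhead is ignored.

```java
final class TurnoverEstimate {
    static final int KEY_BYTES = 32;
    static final int AVG_DATA_BYTES = 150;
    static final int AVG_RECORDS_PER_DB = 5000;
    static final double ID_LIMIT = 1L << 31; // limit stated earlier in the thread

    /** Raw key + data bytes held by one average database, excluding any overhead. */
    static long payloadBytesPerDatabase() {
        return (long) AVG_RECORDS_PER_DB * (KEY_BYTES + AVG_DATA_BYTES);
    }

    /** Database ids consumed per second, assuming one id per filled database. */
    static double idsPerSecond(double recordsPerSecond) {
        return recordsPerSecond / AVG_RECORDS_PER_DB;
    }

    /** Years until the id limit is reached at a sustained record write rate. */
    static double yearsUntilExhausted(double recordsPerSecond) {
        double secondsPerYear = 365.0 * 24 * 3600;
        return ID_LIMIT / idsPerSecond(recordsPerSecond) / secondsPerYear;
    }

    public static void main(String[] args) {
        double hypotheticalRate = 50_000; // records/sec, an assumed figure
        System.out.println("payload bytes per database: " + payloadBytesPerDatabase());
        System.out.println("ids per second: " + idsPerSecond(hypotheticalRate));
        System.out.println("years until 2^31 ids are used: " + yearsUntilExhausted(hypotheticalRate));
    }
}
```

At about 910,000 payload bytes per database, each database is small, which is why the per-database overhead raised earlier in the thread is worth measuring.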
                              • 12. Re: Maximum DB Size?
                                Charles Lamb
                                We could easily expose the max database id in the stats. I will also give you a quick and dirty hack to obtain it from the Environment, although we would not guarantee that it would be a supported API in future releases.

                                import com.sleepycat.je.DbInternal;

                                Environment env = ...;

                                should do the job for you.

                                Charles Lamb
                                • 13. Re: Maximum DB Size?
                                  I'm not sure whether the small size of your databases, and the per-database overhead including checkpointing, will outweigh the advantages of database removal over record removal. You're wise to test this for your particular parameters.

                                  • 14. Re: Maximum DB Size?
                                    Nice. That will let us monitor it, and engineer a workaround if need be. I'm passing this info around internally, as I know at least one other team was considering doing something similar.

                                    Is the checkpoint overhead correlated with the number of databases that ever existed, that exist now, or that had activity since the last checkpoint?