2 Replies Latest reply on Aug 7, 2012 5:15 PM by 951211

    Large database bulk inserts

      Hi, I'm having some issues inserting 100MM records into a JE database. Using the analysis utility, I estimated my necessary cache size to be 8 GB for internal nodes. However, with an 8 GB cache I see very fast inserts until the total database size on disk reaches 8 GB; after that point, inserts become extremely slow despite consistently high CPU usage. If I set the cache size to 20 GB (enough for all leaf nodes, according to the utility), inserts stay very fast through the full 100MM records with no slowdown.

      I assume my memory cache should not need to be equal to the full size of my database on disk? Perhaps my keys are not hashing efficiently?

        • 1. Re: Large database bulk inserts

          Since your cache is sized to fit internal nodes, it is most efficient to discard LNs immediately by configuring CacheMode.EVICT_LN. Without this, LNs are kept in cache until the cache is full, and then evicted. The eviction and resulting Java GC is very expensive and may be what is using CPU time and slowing down the inserts.
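          A minimal sketch of the suggested configuration (paths, names, and the cache size are placeholders; this assumes the JE jar on the classpath):

          ```java
          import java.io.File;

          import com.sleepycat.je.CacheMode;
          import com.sleepycat.je.Database;
          import com.sleepycat.je.DatabaseConfig;
          import com.sleepycat.je.Environment;
          import com.sleepycat.je.EnvironmentConfig;

          public class BulkLoadExample {
              public static void main(String[] args) {
                  EnvironmentConfig envConfig = new EnvironmentConfig();
                  envConfig.setAllowCreate(true);
                  // Cache sized for internal nodes only (8 GB in this thread).
                  envConfig.setCacheSize(8L * 1024 * 1024 * 1024);

                  Environment env = new Environment(new File("/path/to/env"), envConfig);

                  DatabaseConfig dbConfig = new DatabaseConfig();
                  dbConfig.setAllowCreate(true);
                  // Evict each leaf node (LN) from cache as soon as the operation
                  // completes, so the cache holds only the Btree internal nodes.
                  dbConfig.setCacheMode(CacheMode.EVICT_LN);

                  Database db = env.openDatabase(null, "bulkLoad", dbConfig);
                  // ... perform the bulk inserts ...
                  db.close();
                  env.close();
              }
          }
          ```

          CacheMode can also be set per-Environment or per-Cursor if only some operations should bypass the LN cache.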

          To find out what is really happening with JE (for example, at what point eviction begins) you can look at the EnvironmentStats. You can also log GC information to see what's happening with that.
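          For example, a periodic stats dump along these lines shows when cache misses and eviction passes start climbing (assumes `env` is an open Environment; the specific counters printed are just a sample):

          ```java
          import com.sleepycat.je.Environment;
          import com.sleepycat.je.EnvironmentStats;
          import com.sleepycat.je.StatsConfig;

          // Inside some monitoring loop:
          StatsConfig statsConfig = new StatsConfig();
          statsConfig.setClear(true); // reset counters so each sample covers one interval

          EnvironmentStats stats = env.getStats(statsConfig);
          System.out.println("cache misses:  " + stats.getNCacheMiss());
          System.out.println("evict passes:  " + stats.getNEvictPasses());
          System.out.println("cache bytes:   " + stats.getCacheTotalBytes());
          ```

          For the GC side, running the JVM with `-verbose:gc` (optionally with `-XX:+PrintGCDetails`) will show whether GC pauses line up with the slowdown.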

          • 2. Re: Large database bulk inserts
            Great, thanks, that did it.