What I meant is being able to instruct DatabaseEntry to create (or use) an array of a specified length and to use the partial interface when returning key contents to the caller of getSearchKey, getSearchKeyRange, etc. I would then reuse a DatabaseEntry object (one per thread) instead of a byte array.
However, I see that "since the entire record is always read or written to the database, and the entire record is cached" (per the docs), so a new byte array probably has to be created anyway to be put in the cache?
In our tests we have not found short-lived objects to be a cause of GC problems, so this sort of thing is not currently on our TODO list. Are you certain this is the cause of the GC problems you're seeing? We only see serious GC problems due to long-lived objects.
Oh no! It is not _the_problem_, that's why I called it a 'feature'. There are many systems working together, and according to the profiler 95-99% of allocations and garbage collections are those buffers, so there is room for improvement. That could lead to memory fragmentation on long runs (just yesterday I restarted a Java instance with 72 days of uptime) because all of those buffers have different sizes (good thing that my keys are all short).
If you are busy with bugs and more important things, then sure, it should not be on your list. This feature would help the performance claim, anyway: the Java File API and others allow you to reuse buffers straight away, and it is just a little strange that the performance-oriented low-level API of BDB JE does not allow byte buffer reuse. People who don't care about these things are likely not to know what an embedded DB is and are using JDBC/MySQL or something 8-) Kidding.
Sorry. And thanks for great work anyway!
However, "private final DatabaseEntry key = new DatabaseEntry(32768);
private final DatabaseEntry value = new DatabaseEntry(32768);
That will pre-allocate buffer, reuse it for reads and require me to use getPartialXXX methods.
That will fail or read first bytes (whatever) when buffer is too small to accommodate the data." would be very nice 8-)
...Or have a derived DatabaseEntryReusable class whose only overridden method, 'allocate(x)', would check capacity and configure 'partialLength' instead of actually allocating.
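To make the proposal concrete, here is a minimal sketch of the behaviour being asked for. It does not use JE at all: ReusableEntry, fill, isTruncated and copyOut are hypothetical names invented for illustration, and fill only stands in for whatever the storage layer would do when it returns a record. The point is just that the buffer is allocated once, reused on every read, and a record that does not fit is truncated to its first bytes instead of triggering a new allocation.

```java
import java.util.Arrays;

// Hypothetical sketch, NOT part of the JE API. It models the requested
// behaviour with plain arrays so the idea can be tried without JE.
final class ReusableEntry {
    private final byte[] buffer;   // allocated once, reused for every read
    private int size;              // number of valid bytes currently in buffer
    private boolean truncated;     // true if the last record did not fit

    ReusableEntry(int capacity) {
        this.buffer = new byte[capacity];
    }

    // Stand-in for a read: copy the record into the caller's buffer
    // instead of allocating a fresh array. If the record is larger than
    // the buffer, only its first bytes are kept.
    void fill(byte[] record, int offset, int length) {
        int n = Math.min(length, buffer.length);
        System.arraycopy(record, offset, buffer, 0, n);
        size = n;
        truncated = length > buffer.length;
    }

    byte[] getBuffer() { return buffer; }
    int getSize() { return size; }
    boolean isTruncated() { return truncated; }

    // Explicit copy for the rare caller that needs to keep the bytes.
    byte[] copyOut() { return Arrays.copyOf(buffer, size); }
}
```

A caller would check isTruncated() after each read and only read getSize() bytes from getBuffer(); anything that must outlive the next read has to be copied out explicitly.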
It's interesting that the key/data byte arrays returned by JE are the dominant GC waste in your app. Is the entire data set resident in the JE cache? If so, I can understand why these allocations stand out.
So far we've been most concerned with GC waste due to JE eviction -- when the data set does not fit in cache -- because these are long-lived objects.
The dataset is not in the cache. On the production server the size of the environment is 60G (on testing it was 21G), the Java memory is 8G (on testing it was 4G), and the JE cache size is 50% of memory, set through the config.setCachePercent() method. The cache is shared; however, in this case there is just one environment.
Objects and primitive values are created by 4-16 worker threads based on data in that dataset. Objects are also cached, but more than half of the read operations are randomly distributed across the dataset and touch data that is accessed once in a longer-than-cached while.
The objects that are created have different types and different lifespans. But the byte arrays allocated by JE read operations (get and range (cursor.next/prev)) are never needed again once they have been compared and /likely/ used to create an object by the worker thread straight away, in the same loop where the JE read happens.
The number of byte array allocations is even bigger than the sum of all objects and primitives created during data access (sometimes a comparison decides to skip a row, or finds that the object was already created elsewhere, e.g. as a reference from another object). Also, byte arrays are all the same type and are grouped as one class in the profiler report. That's why the Java profiler showed them to me as the only visible issue with allocations/memory utilization/GC, by far greater than the others. I am not sure that I actually experience (or can measure and detect) any problems because of it, apart from the profiler reports. And, anyway, testing and profiling is never quite like what I experience in production 8-)
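For illustration, the read loop described above (compare the bytes, maybe build an object, then drop the buffer) could work on a per-thread scratch buffer along these lines. This is only a sketch under assumptions: WorkerScratch, startsWith and readInt are hypothetical helpers, the 32768-byte size is taken from the earlier example, and it presumes the storage layer could fill a caller-supplied array, which is exactly the feature being requested.

```java
// Hypothetical helper, NOT part of JE: one scratch buffer per worker
// thread, with in-place compare/decode so no per-read array is needed.
final class WorkerScratch {
    // Buffer size is an assumption borrowed from the earlier example.
    private static final ThreadLocal<byte[]> SCRATCH =
            ThreadLocal.withInitial(() -> new byte[32768]);

    private WorkerScratch() {}

    // Returns the calling thread's buffer; the same array on every call.
    static byte[] buffer() {
        return SCRATCH.get();
    }

    // Compares the first bytes of the buffer against a prefix in place.
    static boolean startsWith(byte[] buf, int size, byte[] prefix) {
        if (size < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (buf[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Decodes a big-endian 32-bit int directly from the buffer.
    static int readInt(byte[] buf, int offset) {
        return ((buf[offset] & 0xFF) << 24)
                | ((buf[offset + 1] & 0xFF) << 16)
                | ((buf[offset + 2] & 0xFF) << 8)
                | (buf[offset + 3] & 0xFF);
    }
}
```

With this pattern a skipped row costs no allocation at all, and a row that is used costs only the object built from it.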