Hey guys, as you may know, we (LinkedIn) use BerkeleyDB JE as the data layer for Project Voldemort, a distributed key-value store (note: we are not using the BDB-HA/Replication features). We recently upgraded the version of BerkeleyDB used by Voldemort from 4.0.90 to 4.1.7. Get latency didn't change, but we've noticed an increase in put latency.
We've observed this issue on multiple nodes.
You can see a graph of this behaviour here: http://behemoth.strlen.net/~alex/bdb417_latency.png
1) We upgraded to BDB 4.1.7 on the node indicated by the green line, and as you can see, latency rose.
2) We downgraded that node to BDB 4.0.90, and latency dropped back down.
3) We upgraded to BDB 4.1.7 on another node (blue line), and latency rose again.
4) We downgraded the node indicated by the blue line, and latency dropped yet again.
Has anyone else observed this behaviour?
Your graph shows a pretty significant jump in latency, and it's not something that we expect.
Would you happen to have, or could you get, a sampling of BDB environment statistics when this happens? You'd get these by making periodic calls to Environment.getStats, being sure to set StatsConfig.setClear(true) so you get interval statistics. The stats can also be obtained via JConsole if you have the JMX MBean enabled (see http://download.oracle.com/docs/cd/E17277_02/html/jconsole/JConsole-plugin.html).
Cache miss counts are usually the first thing we look at when we see performance changes like this. The stats would give us a starting point for understanding the difference in behavior.
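In case it helps, here is a rough sketch of the periodic sampling described above. To keep it runnable without the JE jar, the stats source is abstracted behind a plain Supplier; the class name IntervalStatsSampler and its sample method are made up for illustration and are not part of BDB JE. The comment at the top shows how the supplier might be wired to Environment.getStats with StatsConfig.setClear(true), assuming you already have an open Environment called env.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/*
 * Hypothetical wiring against BDB JE (needs the je jar on the classpath;
 * "env" is assumed to be an already-open com.sleepycat.je.Environment):
 *
 *   StatsConfig config = new StatsConfig();
 *   config.setClear(true);  // counters reset after each read => interval stats
 *   Supplier<String> source = () -> env.getStats(config).toString();
 *   List<String> lines = IntervalStatsSampler.sample(source, 10, 60_000L);
 */
final class IntervalStatsSampler {

    private IntervalStatsSampler() {}

    /**
     * Polls the source `samples` times, sleeping `intervalMillis` between polls.
     * Each returned line is "<epoch millis> <snapshot text>".
     */
    static List<String> sample(Supplier<String> source, int samples, long intervalMillis) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < samples; i++) {
            lines.add(System.currentTimeMillis() + " " + source.get());
            if (i + 1 < samples) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and return what we have so far.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return lines;
    }
}
```

Running this on one node under 4.0.90 and again under 4.1.7 with the same interval would give two sets of snapshots that can be compared side by side, cache miss counts first.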