We continue to have this problem. We set up a Nagios check that alerts us as soon as the problem happens so that we can restart the server.
This issue continues to cause a lot of problems with corruption of user data.
You said in a previous post that "this a critical problem."
Based on that, if it were me, I would track down the problem in the VM.
If it is a Java code problem, then you can run your servers with the replaced code via a boot class path (-Xbootclasspath) option.
If it is in C/C++ code, then it requires more work to patch, but it should be possible.
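
For the Java case, here is a rough sketch of how you might confirm that a replaced class is the one the server actually picks up. The class name WhichClass and the file names patched.jar and server.jar are made up for illustration, and I am assuming the fix lives in a class shipped with the JDK. On Java 8 and earlier the override would go through -Xbootclasspath/p:, while on Java 9 and later that flag is gone and --patch-module is the closest equivalent. The helper only reports where the JVM says the class file comes from, so treat the output as a hint, not proof.

```java
import java.net.URL;

// Hypothetical helper: reports where the JVM resolves a class file from.
//
// Assumed launch commands (jar names are placeholders):
//   Java 8 and earlier: java -Xbootclasspath/p:patched.jar -jar server.jar
//   Java 9 and later:   java --patch-module java.base=patched.jar -jar server.jar
public class WhichClass {

    // Returns the URL of the .class resource for c, or "unknown" if the
    // JVM does not expose one.
    static String locationOf(Class<?> c) {
        // Build an absolute resource name, e.g. /java/lang/String.class
        String resource = "/" + c.getName().replace('.', '/') + ".class";
        URL url = c.getResource(resource);
        return url == null ? "unknown" : url.toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass the fully qualified name of the class you replaced;
        // defaults to java.lang.String if no argument is given.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " -> " + locationOf(Class.forName(name)));
    }
}
```

Run it with the same JVM options as the server and pass the name of the class you patched. If the printed location points at your patched jar and not at the stock runtime, the override is probably taking effect.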