We have an Open MQ (version 4.5.2) HA cluster using Oracle EE 11g as its store.
All our JMS clients are JCAPS 5.1.3 JCDs.
When we abruptly stop one half of the cluster (i.e. using kill -9), we sometimes experience message loss. The lost messages do not appear in any queue. When we kill both halves within a short time, we experience message loss practically every time.
The logs seem to indicate that the messages are lost in an XA transaction that reads one message from a topic or a queue and writes one or more messages to another queue.
Any tips on how to debug this, or even how to fix it?
We have already tried several different configurations for thread pool models, implicit caching and client acknowledgement (AckOnAcknowledge, AckOnProduce).