I've just implemented a backing map listener that, when it receives regular events, forwards them to another cache (the cache lives in another service, so I use an EntryProcessor). Currently I do this asynchronously via a standard Java BlockingQueue, because doing it synchronously hurts performance (even with several threads). The problem is that when a node is shut down, even a regular shutdown (no kill -9), some writes go missing because the destination service is unavailable at some point.
I've tried using a custom backup storage with a listener that stores the latest events, so that I can replay the most recent messages when a MemberListener event fires. The problem is that the backup listener receives lots of "non-regular" events (strictly speaking, no event is "regular" in the backup; what I mean is that I cannot differentiate regular operations stored in the backup from rebalancing of the backup).
Is there a proper way of doing this?
Another idea I've been considering is to use an InvocationService with the asynchronous execute() method. But I don't know what guarantees there are for pending invocations when a Coherence node is shut down normally.
Take a look at cache stores - ideally BinaryEntryStore, the binary counterpart of CacheStore. You can configure the cache store to be invoked asynchronously by setting write-delay to a non-zero value (e.g. 1ms) in the cache configuration.
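A minimal cache-configuration sketch of that setup might look like the following (the cache name, scheme name, and the `com.example.ReplicatingCacheStore` class are placeholders, not from your code):

```xml
<caching-scheme-mapping>
  <cache-mapping>
    <cache-name>source-cache</cache-name>
    <scheme-name>replicating-scheme</scheme-name>
  </cache-mapping>
</caching-scheme-mapping>

<caching-schemes>
  <distributed-scheme>
    <scheme-name>replicating-scheme</scheme-name>
    <backing-map-scheme>
      <read-write-backing-map-scheme>
        <internal-cache-scheme>
          <local-scheme/>
        </internal-cache-scheme>
        <cachestore-scheme>
          <class-scheme>
            <class-name>com.example.ReplicatingCacheStore</class-name>
          </class-scheme>
        </cachestore-scheme>
        <!-- a non-zero write-delay makes store calls asynchronous (write-behind) -->
        <write-delay>1ms</write-delay>
      </read-write-backing-map-scheme>
    </backing-map-scheme>
    <autostart>true</autostart>
  </distributed-scheme>
</caching-schemes>
```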
What you do in your cache store implementation is up to you, so you can take your existing EntryProcessor and invoke it from your cache store without issue (just make sure you have enough worker threads on your source service to handle your load).
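As a rough sketch of that idea, assuming the plain (non-binary) CacheStore SPI: the class below extends Coherence's AbstractCacheStore and forwards each store to the destination cache via an entry processor. The class name, the destination cache name, and UpdateEntryProcessor are hypothetical placeholders for your own processor and topology.

```java
import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;
import com.tangosol.net.cache.AbstractCacheStore;

// Hypothetical write-only cache store that replicates writes to a cache
// owned by another service by invoking an entry processor against it.
public class ReplicatingCacheStore extends AbstractCacheStore {

    // Destination cache name (assumption; adjust to your topology)
    private static final String DEST_CACHE = "destination-cache";

    @Override
    public void store(Object key, Object value) {
        NamedCache dest = CacheFactory.getCache(DEST_CACHE);
        // UpdateEntryProcessor stands in for your existing EntryProcessor
        dest.invoke(key, new UpdateEntryProcessor(value));
    }

    @Override
    public void erase(Object key) {
        // Propagate deletions as well; a plain remove may be enough here
        CacheFactory.getCache(DEST_CACHE).remove(key);
    }

    @Override
    public Object load(Object key) {
        // Write-only store: nothing is loaded back into the source cache
        return null;
    }
}
```

With write-behind enabled, store() runs on a Coherence worker thread rather than the service thread, so a slow destination service does not block cache mutations on the source.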
The benefit of using cache stores is that the thread doing the callback is a Coherence worker thread. Calls to cache stores come with a guarantee that if the node fails before the cache store has been invoked, or before it has returned, then the cache store will be invoked for you on the backup node once the partition has been promoted to a primary - i.e. it offers the fault tolerance you're looking for.
We use this pattern in Production and it works very well.
The main caveat with this approach is that multiple updates to a single entry within the write-delay window are coalesced: you only get a single call to your cache store with the latest value, and the intermediate values are lost. If this is acceptable to you, as it is to us, then this pattern works well.