I am investigating using the Coherence in-memory data grid on Amazon Web Services.
I want to solve for system resilience and data durability when there is a complete failure in one AWS Availability Zone (AZ). We will have Coherence distributed cache nodes in multiple AZs. Is there a way to guarantee that every object in the cache is stored on at least 2 servers across different availability zones?
I believe that if you configure site-name, rack-name and machine-name properly, Coherence will pick the strongest backup strategy available, i.e. it will store the backup partition on a different site, provided you have site-name configured properly on all nodes.
The default partition assignment strategy that comes out of the box will not help you. You need to use the simple partition assignment strategy, which ensures "data safety" by placing primary and backup copies of data on different sites, racks and machines. So if there are multiple sites, primary and backup data will be put on different sites. You need to define the following parameter for the JVMs in each AZ:
-Dtangosol.coherence.site=AZ1 or -Dtangosol.coherence.site=AZ2
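As a rough illustration of how the site flag above could sit alongside the matching rack and machine identity properties, here is a small Java sketch that assembles the flags for one node. The class name, method name and the example values ("AZ1", "host-01") are hypothetical; only the three `tangosol.coherence.*` property names come from Coherence itself.

```java
import java.util.ArrayList;
import java.util.List;

public class IdentityFlags {
    // Builds the -D flags that describe where a Coherence JVM lives.
    // rack and machine are optional; pass null to leave one out.
    static List<String> forNode(String site, String rack, String machine) {
        List<String> flags = new ArrayList<>();
        flags.add("-Dtangosol.coherence.site=" + site);
        if (rack != null) {
            flags.add("-Dtangosol.coherence.rack=" + rack);
        }
        if (machine != null) {
            flags.add("-Dtangosol.coherence.machine=" + machine);
        }
        return flags;
    }

    public static void main(String[] args) {
        // Hypothetical values: the AZ is used as the site, the host name as the machine.
        System.out.println(String.join(" ", forNode("AZ1", null, "host-01")));
    }
}
```

Every JVM in the same AZ would get the same site value, so that Coherence can tell which members share a failure domain.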
I believe that using site and rack names will prioritise assigning backups to an alternate site or rack, but will not guarantee it. For example, if capacity is asymmetric between sites, some partitions will be backed up on the same site.
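To make the capacity argument concrete, here is a back-of-the-envelope Java sketch. It is a toy model, not Coherence's actual assignment algorithm: it assumes every storage node holds an equal share of primaries and an equal share of backups, and it counts how many backups cannot be placed on a different site from their primary. The class and method names are made up for illustration.

```java
public class SiteSafetyEstimate {
    // Toy model: each node holds the same number of primaries and of backups.
    // A site with more than half of all nodes owns more primaries than the
    // other sites have backup slots, so the excess must be backed up locally.
    // Returns the approximate number of such same-site backups (rounded down).
    static long sameSiteBackups(long partitions, int[] nodesPerSite) {
        long total = 0;
        for (int n : nodesPerSite) {
            total += n;
        }
        long stuck = 0;
        for (int n : nodesPerSite) {
            // Primaries on this site minus backup slots on all other sites,
            // expressed in units of partitions / total.
            long excess = 2L * n - total;
            if (excess > 0) {
                stuck += partitions * excess / total;
            }
        }
        return stuck;
    }

    public static void main(String[] args) {
        // Hypothetical layout: 3 nodes in one AZ, 1 node in the other.
        System.out.println(sameSiteBackups(1000, new int[] {3, 1}));
        // Symmetric layout: 2 nodes in each AZ.
        System.out.println(sameSiteBackups(1000, new int[] {2, 2}));
    }
}
```

Under these assumptions, a symmetric layout leaves no same-site backups, while a 3-to-1 split leaves roughly half of the partitions without a copy in the other AZ. Keeping storage capacity equal across AZs avoids the problem in this model.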