I have an application that simulates a multi-cluster environment. Two grids run on two identical servers (24 cores, 96 GB, Linux), with 6 nodes per grid. With write-delay, we can sustain 1500 tps on grid1 while grid1 replicates the data to grid2. However, the delayed replication only reaches about 300 tps. For instance, if we run 1500 tps for 500 seconds, it takes 1500 x 500 / 300 = 2500 seconds for all the entries to be replicated to the other grid. Our objects are small; we’ve monitored the Ethernet traffic and calculated roughly 2 KB out and 1 KB in per transaction from the initiating grid.
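To make the arithmetic concrete, here is a minimal Python sketch of the numbers above. The function names are hypothetical, and the backlog figure assumes replication runs at a steady 300 tps concurrently with the load, which I have not verified.

```python
def replication_time_seconds(ingest_tps, load_seconds, replication_tps):
    """Total time needed to replicate every entry written during the load run."""
    total_entries = ingest_tps * load_seconds
    return total_entries / replication_tps


def backlog_at_end_of_load(ingest_tps, load_seconds, replication_tps):
    """Entries still waiting when the load stops.

    Assumption: replication drains at a constant rate while the load is running.
    """
    return max(ingest_tps - replication_tps, 0) * load_seconds


total = replication_time_seconds(1500, 500, 300)
backlog = backlog_at_end_of_load(1500, 500, 300)
print(total)    # 2500.0 seconds to replicate all 750,000 entries
print(backlog)  # 600000 entries queued when the 500-second run ends
```

Under that assumption, the replication queue grows by 1200 entries per second for as long as the load runs.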
This does not make sense to me. The two grids should perform at nearly the same rate, with push replication only adding a delay in the middle. Why would push replication be so much slower? Where might the bottleneck be?
My guess is that push replication is single-threaded per named cache on the destination grid. Can anyone confirm this and suggest a workaround?