Federation ActiveActive Topology: Replication delay backing-up local writes

Chris San Buenaventura Member Posts: 17 Green Ribbon
edited Jan 9, 2020 9:58PM in Coherence Support

Coherence version: 12.2.1.4.0

We are using ActiveActive federation across two clusters (one in London and one in New York). Today we had a prolonged network glitch that slowed down communication between our London and New York servers.

We observed that, because of the cross-Atlantic network delay, local writes to a cluster were backing up as well. Could you please check whether this is a bug or expected behaviour? If it is the latter, is there any way to configure Federation so that the replication flow does not back up local writes?

We were getting logs like the ones below during the network glitch:

2020-01-05T23:41:08,356 WARN  [[email protected] 12.2.1.4.0][Coherence] (thread=SelectionService(channels=7, selector=MultiplexedSelector([email protected]), id=150693841), member=4) tmb://10.53.200.125:9300.45391 accepted connection migration with tmb://10.53.200.126:9300.54452 on MultiplexedSocketChannel(MultiplexedSocket{Socket[addr=/10.53.200.126,port=9300,localport=52414]}): peer=tmb://10.53.200.126:9300.54452, state=ACTIVE, socket=MultiplexedSocket{Socket[addr=/10.53.200.126,port=9300,localport=52414]}, migrations=6, bytes(in=13683513, out=23235313), flushlock false, bufferedOut=6.95KB, unflushed=0B, delivered(in=54117, out=58173), timeout(ack=7.49s), interestOps=1, unflushed receipt=0, receiptReturn 0, isReceiptFlushRequired false, bufferedIn(), msgs(in=28914, out=29393/29412)

java.io.IOException: ack timeout after 15s
        at com.oracle.common.internal.net.socketbus.BufferedSocketBus$BufferedConnection.checkHealth(BufferedSocketBus.java:890)
        at com.oracle.common.internal.net.socketbus.AbstractSocketBus$5.lambda$run$0(AbstractSocketBus.java:644)
        at com.oracle.common.internal.net.socketbus.AbstractSocketBus$5$$Lambda$208/626754434.accept(Unknown Source)
        at java.util.concurrent.ConcurrentHashMap$ValuesView.forEach(ConcurrentHashMap.java:4707)
        at com.oracle.common.internal.net.socketbus.AbstractSocketBus$5.run(AbstractSocketBus.java:644)
        at com.oracle.common.internal.net.socketbus.AbstractSocketBus$3.run(AbstractSocketBus.java:426)
        at com.oracle.common.internal.net.RunnableSelectionService.processRunnables(RunnableSelectionService.java:533)
        at com.oracle.common.internal.net.RunnableSelectionService.process(RunnableSelectionService.java:349)
        at com.oracle.common.internal.net.RunnableSelectionService.run(RunnableSelectionService.java:274)
        at com.oracle.common.internal.net.ResumableSelectionService.run(ResumableSelectionService.java:133)
        at java.lang.Thread.run(Thread.java:745)

2020-01-05T23:41:33,974 WARN  [[email protected] 12.2.1.4.0][Coherence] (thread=SelectionService(channels=15, selector=MultiplexedSelector([email protected]), id=505818397), member=4) tmb://10.53.200.125:9300.45391 accepted connection migration with tmb://10.53.200.126:9300.54452 on MultiplexedSocketChannel(MultiplexedSocket{Socket[addr=/10.53.200.126,port=9300,localport=52530]}): peer=tmb://10.53.200.126:9300.54452, state=ACTIVE, socket=MultiplexedSocket{Socket[addr=/10.53.200.126,port=9300,localport=52530]}, migrations=7, bytes(in=13766783, out=23336481), flushlock false, bufferedOut=14.4KB, unflushed=0B, delivered(in=54389, out=58422), timeout(ack=2.13s), interestOps=1, unflushed receipt=0, receiptReturn 0, isReceiptFlushRequired false, bufferedIn(), msgs(in=29054, out=29524/29537)

java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.readv0(Native Method)
        at sun.nio.ch.SocketDispatcher.readv(SocketDispatcher.java:43)
        at sun.nio.ch.IOUtil.read(IOUtil.java:278)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:435)
        at com.oracle.common.internal.net.WrapperSocketChannel.read(WrapperSocketChannel.java:130)
        at com.oracle.common.internal.net.MultiplexedSocketProvider$MultiplexedSocketChannel.read(MultiplexedSocketProvider.java:1547)
        at com.oracle.common.internal.net.socketbus.AbstractSocketBus$Connection.read(AbstractSocketBus.java:1956)
        at com.oracle.common.internal.net.socketbus.BufferedSocketBus$BufferedConnection.read(BufferedSocketBus.java:93)
        at com.oracle.common.internal.net.socketbus.SocketMessageBus$MessageConnection$ReadBatch.read(SocketMessageBus.java:615)
        at com.oracle.common.internal.net.socketbus.SocketMessageBus$MessageConnection.processReads(SocketMessageBus.java:206)
        at com.oracle.common.internal.net.socketbus.BufferedSocketBus$BufferedConnection.onReadySafe(BufferedSocketBus.java:700)
        at com.oracle.common.internal.net.socketbus.AbstractSocketBus$Connection.onReady(AbstractSocketBus.java:2135)
        at com.oracle.common.internal.net.RunnableSelectionService.process(RunnableSelectionService.java:401)
        at com.oracle.common.internal.net.RunnableSelectionService.run(RunnableSelectionService.java:274)
        at com.oracle.common.internal.net.ResumableSelectionService.run(ResumableSelectionService.java:133)
        at java.lang.Thread.run(Thread.java:745)
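
For anyone who wants to line these warnings up against the network glitch, here is a rough Python sketch that pulls a few counters (migrations, bufferedOut, ack timeout) out of each WARN line so they can be tabulated over time. The helper name parse_bus_warning is made up for illustration and is not part of Coherence, the sample line is an abbreviated copy of the first warning above, and the regular expressions only assume the field layout visible in these two log lines.

```python
import re

# Patterns based only on the field layout seen in the WARN lines in this post.
# parse_bus_warning is a hypothetical helper, not a Coherence API.
_PATTERNS = {
    "timestamp": re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3})"),
    "peer": re.compile(r"peer=(tmb://[\d.]+:\d+\.\d+)"),
    "migrations": re.compile(r"migrations=(\d+)"),
    "buffered_out": re.compile(r"bufferedOut=([\d.]+[KMG]?B)"),
    "ack_timeout_s": re.compile(r"timeout\(ack=([\d.]+)s\)"),
}


def parse_bus_warning(line):
    """Return the fields found in one warning line as a dict of strings."""
    fields = {}
    for name, pattern in _PATTERNS.items():
        match = pattern.search(line)
        if match:
            fields[name] = match.group(1)
    return fields


# Abbreviated copy of the first warning in this post ("..." marks omitted text).
sample = (
    "2020-01-05T23:41:08,356 WARN ... "
    "peer=tmb://10.53.200.126:9300.54452, state=ACTIVE, migrations=6, "
    "bufferedOut=6.95KB, unflushed=0B, timeout(ack=7.49s), interestOps=1"
)

if __name__ == "__main__":
    print(parse_bus_warning(sample))
```

Running the helper over every WARN line in a member's log shows how quickly the migrations count climbed during the glitch, which may be useful detail to attach to a support request.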

Best Answer

  • Randy Stafford-Oracle Member Posts: 21
    edited Jan 8, 2020 12:03AM Accepted Answer

    Hi Chris,

    This is another one for My Oracle Support: I'd like to request that you create a Service Request for it. That will allow us to collect the relevant information, engage the correct product engineers, and so on. Could you please do that and let me know the SR number?

    Thanks,
    Randy

Answers