We are seeing the error below frequently in our current production environment, which has 100+ members. Eventually the client is unable to get a connection.
019-11-12 09:49:56.915/2686471.024 Oracle Coherence GE 220.127.116.11 <D5> (thread=Cluster, member=148): MemberLeft notification for Member(Id=7, Timestamp=2019-10-21 05:40:39.216, Address=XXXXX:8100, MachineId=52063, Location=site:,machine:XXXX:69437,member:proxy-nod-12-5, Role=proxy) received from Member(Id=11, Timestamp=2019-10-12 07:33:45.36, Address=XXXX:8088, MachineId=30422, Location=site:,machine:XXXXX,process:130713,member:proxy-node-2-3, Role=proxy)
com.tangosol.net.messaging.ConnectionException: could not establish a connection to one of the following addresses: [XXXX:19099, XXXXX:19099, XXX:19099]; make sure the "remote-addresses" configuration element contains an address and port of a running TcpAcceptor
The member loses its connection, yet the client somehow still tries to connect to the lost member, which results in the ConnectionException.
1) Why does the member leave the cluster so often?
2) Why does the client get a stale member list when connecting, resulting in the ConnectionException?
Using Coherence version 18.104.22.168.
Member 7 was a proxy node and left the cluster.
The ConnectionException comes from a client that cannot find a running proxy server in its configured list of proxy servers, so apparently none of them are running (e.g. maybe they all left the cluster).
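One quick way to confirm whether any of the addresses in the client's "remote-addresses" list is actually accepting connections is a plain TCP probe from the client machine. The following is only a minimal sketch using the standard Java socket API, not anything Coherence-specific; the class name ProxyReachability and the 2-second timeout are assumptions made for illustration, and the host:port pairs are passed on the command line.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: probes host:port pairs with a plain TCP connect.
public class ProxyReachability {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host could not be resolved.
            return false;
        }
    }

    public static void main(String[] args) {
        // Each argument is expected in the form host:port, e.g. the entries
        // from the client's remote-addresses configuration.
        for (String arg : args) {
            int idx = arg.lastIndexOf(':');
            if (idx < 0) {
                System.out.println(arg + " -> skipped (expected host:port)");
                continue;
            }
            try {
                String host = arg.substring(0, idx);
                int port = Integer.parseInt(arg.substring(idx + 1));
                // 2000 ms is an arbitrary timeout chosen for this sketch.
                boolean up = isReachable(host, port, 2000);
                System.out.println(arg + " -> " + (up ? "accepting connections" : "NOT reachable"));
            } catch (NumberFormatException e) {
                System.out.println(arg + " -> skipped (port is not a number)");
            }
        }
    }
}
```

It could be run as, for example, `java ProxyReachability host1:19099 host2:19099 host3:19099` with the real proxy hosts substituted. A successful connect only shows that something is listening on that port; it does not prove the listener is a healthy TcpAcceptor.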
Why proxy servers are leaving the cluster is what you need to diagnose. Are they running out of memory? Do you have GC logging turned on? Do you have JMX Reporter turned on? You need monitoring information to analyze causes of behavior like this.
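As a rough illustration of the kind of memory and GC information worth collecting, the standard java.lang.management API exposes heap usage and garbage collector counters. This is a minimal sketch only: the class name JvmHealthSnapshot is hypothetical, and the code reports on the JVM it runs in, so it would need to run inside the proxy process (or the same MBeans would need to be read over remote JMX) to say anything about a proxy node.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: summarizes heap usage and GC activity of the current JVM.
public class JvmHealthSnapshot {

    static String snapshot() {
        final long mb = 1024L * 1024L;
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        StringBuilder sb = new StringBuilder();
        // getMax() may return -1 if the maximum is undefined; it then prints as 0 MB.
        sb.append(String.format("heap used=%d MB, committed=%d MB, max=%d MB%n",
                heap.getUsed() / mb, heap.getCommitted() / mb, heap.getMax() / mb));
        // One line per collector: cumulative collection count and time since JVM start.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(String.format("gc %s: collections=%d, totalTimeMs=%d%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(snapshot());
    }
}
```

Heap usage that stays close to the maximum, or GC time that climbs quickly between snapshots, would point toward memory pressure as a reason for members being dropped from the cluster.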
Have you created a Service Request with Oracle Support?