The documentation you referenced is accurate. I'm a little bit confused about what you're trying to do... If ClusterA members are binding to internal addresses won't they be unreachable from ClusterB?
Please take note of the internal and global IP addresses. That is the key to my issue.
Say ClusterA members have the following IP addresses:
internal IP: 192.168.0.1
global IP: 10.0.0.1
internal IP: 192.168.0.2
global IP: 10.0.0.2
And ClusterB members have the following IP addresses:
global IP: 10.0.0.3
global IP: 10.0.0.4
Due to the nature of the network design, internal IP addresses from ClusterA are NOT accessible from ClusterB; however, the global IP addresses can be accessed from ClusterB. Further, members of ClusterA can only communicate among each other using the internal IP, as the global IP is NAT'ed further up the network chain.
So when a member of ClusterB accesses the NameService of ClusterA (via clusterport or the Specific Network Interface), it is given the internal IP addresses of the ClusterA members. It then attempts to connect to a member of ClusterA and fails, because the IP address it received from the NameService is an internal one.
See below for a log snippet of the ClusterB member trying to connect to a ClusterA member:
2019-12-10T21:09:24,637 INFO [Logger@9265725 220.127.116.11.0][Coherence] (thread=DistributedCache:DestinationController[ClusterA]Worker:0, member=2) Connecting to service FederatedCache at participant ClusterA with address tmb://192.168.0.1:9000.456
2019-12-10T21:09:31,655 INFO [Logger@9265725 18.104.22.168.0][Coherence] (thread=SelectionService(channels=24, selector=MultiplexedSelector(sun.nio.ch.EPollSelectorImpl@6c1a476e), id=1813665703), member=2) Disconnected from destination ClusterA
Ok, so if I'm following correctly, Cluster A members cannot bind directly to the global IP addresses (e.g. 10.0.0.1) as those are NAT addresses? In that case would it be possible to use another port in each Cluster A member for federation only? It is possible to specify a specific IP address and port number to bind to in the federated-scheme (or address provider with the IP address and port number defined in the Coherence operational config). And then in Cluster B, specify the global IP and the dedicated federation port. In that case when cluster B members attempt to connect to Cluster A, they will first attempt a NameService connection to the dedicated federation port, which will fail and then they will attempt a direct connection to the federation port, which should succeed.
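As a rough sketch of the Cluster B side of this (untested, and only a fragment): the participant entry for ClusterA in Cluster B's operational override file could list the global IPs together with the dedicated federation port. The port 40000 is an assumed value, and the sketch also assumes the NAT forwards that port unchanged to each Cluster A member. Please check the element names against the operational configuration reference for your Coherence version.

```xml
<!-- Cluster B operational override (fragment only).
     Port 40000 is an assumption; use the port Cluster A actually binds for federation. -->
<federation-config>
  <participants>
    <participant>
      <name>ClusterA</name>
      <remote-addresses>
        <!-- Global (NAT) addresses of the Cluster A members -->
        <socket-address>
          <address>10.0.0.1</address>
          <port>40000</port>
        </socket-address>
        <socket-address>
          <address>10.0.0.2</address>
          <port>40000</port>
        </socket-address>
      </remote-addresses>
    </participant>
  </participants>
</federation-config>
```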
This federation direct connection functionality was added just recently in Coherence - versions 22.214.171.124.y (where y is 3 or higher) and 126.96.36.199.0 and later.
To see an example of how this works, take a look at the cache and operational configuration (and use of system properties) in https://github.com/coherence-community/coherence-demo#enable-federation-on-kubernetes
Hope this helps.
Yes and yes to both questions. I'll give this Kubernetes example a crack.
Meanwhile, what we have done is map the internal IP addresses of ClusterA members on the ClusterB box via iptables, and it works seamlessly.
Can you please give example operational and override config XML files for this:
"It is possible to specify a specific IP address and port number to bind to in the federated-scheme (or address provider with the IP address and port number defined in the Coherence operational config). And then in Cluster B, specify the global IP and the dedicated federation port."
Are you referring to this section of the operational config xml? From https://github.com/coherence-community/coherence-demo/blob/master/src/main/resources/cache-config.xml
Alternatively, is it possible to write a custom NameService that would return the equivalent global IP of a member? Is this advised?
I sort of combined the instructions from https://docs.oracle.com/en/middleware/fusion-middleware/coherence/188.8.131.52/administer/federating-caches-clusters.html#GUID-2D356F18-34A4-4F4F-A597-C1E1C8A4DB82 and https://github.com/coherence-community/coherence-demo/blob/master/src/main/resources/cache-config.xml .
I set up an address provider in the override XML and referenced it in the operational config file under the federated scheme. The remote cluster can then reach this address provider directly over TCP (not via the NameService). It is now working seamlessly.
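For anyone reading later, a minimal sketch of what such a setup might look like on a ClusterA member. This is an illustration, not the exact configuration used here: the id `federation-address`, the port 40000, and the scheme name are hypothetical placeholders, and the element names should be verified against the documentation linked above. The `federated-scheme` element normally sits in the cache configuration file, so the second fragment is shown as belonging there.

```xml
<!-- Fragment 1: operational override on a ClusterA member.
     The id "federation-address" and port 40000 are hypothetical. -->
<cluster-config>
  <address-providers>
    <address-provider id="federation-address">
      <socket-address>
        <!-- Internal address that this member is able to bind to -->
        <address>192.168.0.1</address>
        <port>40000</port>
      </socket-address>
    </address-provider>
  </address-providers>
</cluster-config>

<!-- Fragment 2: federated scheme on the same member, referencing the provider by id.
     Other scheme elements (topologies and so on) are omitted. -->
<federated-scheme>
  <scheme-name>federated</scheme-name>
  <service-name>FederatedCache</service-name>
  <backing-map-scheme>
    <local-scheme/>
  </backing-map-scheme>
  <autostart>true</autostart>
  <address-provider>federation-address</address-provider>
</federated-scheme>
```

ClusterB would then list the global IPs (10.0.0.1, 10.0.0.2) with the same port as the remote addresses of the ClusterA participant.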
Thanks for the help!
Glad it worked out!