3 Replies Latest reply: Dec 27, 2012 12:04 AM by EJP

RMI Registry - Network Related Issues

981567 Newbie
Hi,

We run a Solaris 10 environment with IP Multipathing (IPMP) configured on the servers. These servers host RMI Registries and several Java programs (components) that communicate with each other remotely. We have observed that RMI communication fails when IPMP is disrupted.

For instance, the following IPMP messages are recorded in "/var/adm/messages":

Cannot meet requested failure detection time of 10000 ms on (inet nxge0) new failure detection time for group "netMultiNICB" is 101234 ms
Improved failure detection time 50617 ms on (inet e1000g0) for group "XXXXXX"
Improved failure detection time 25308 ms on (inet nxge0) for group "XXXXXX"
Improved failure detection time 12654 ms on (inet e1000g0) for group "XXXXXX"
Improved failure detection time 10000 ms on (inet nxge0) for group "XXXXXX"

When the IPMP disruption occurs, the RMI registry shows some odd behavior:

1) If we start another registry, we get:

java.rmi.ConnectIOException: error during JRMP connection establishment; nested exception is:
java.net.SocketException: Socket is not connected

2) The skeleton lists that were set up earlier, when the Java component services were started, still allow component-to-component communication to continue.

3) We have a utility that checks whether any component has gone down or is not working properly (for example, by listing all remote objects bound in the registry, or by calling a remote method ping() that simply returns true). This utility throws the following exception when it tries to access the list of bound Java programs:

java.rmi.ConnectException: Connection refused to host: <I.P.>; nested exception is:
java.net.ConnectException: Connection timed out
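
A check of the kind described in item 3 could be sketched as follows. This is only an illustration, not the poster's actual utility: the `Pingable` interface, the `check` method, and the host and port arguments are assumptions, since the thread only says that the utility lists the bound objects and calls a `ping()` that returns true.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;

final class RegistryHealthCheck {
    /** Hypothetical remote interface; the thread only mentions a ping() that returns true. */
    public interface Pingable extends Remote {
        boolean ping() throws RemoteException;
    }

    /**
     * Lists every name bound in the registry at host:port and pings each object.
     * Returns the number of bound objects that answered ping() with true.
     * A RemoteException from list() means the registry itself was unreachable.
     */
    static int check(String host, int port) throws RemoteException {
        // getRegistry() only builds a local reference; no connection is made yet.
        Registry registry = LocateRegistry.getRegistry(host, port);
        // list() is the first real network call to the registry.
        String[] names = registry.list();
        int alive = 0;
        for (String name : names) {
            try {
                Remote stub = registry.lookup(name);
                // ping() goes to the component's own host, not to the registry.
                if (stub instanceof Pingable && ((Pingable) stub).ping()) {
                    alive++;
                }
            } catch (Exception e) {
                System.err.println(name + " did not respond: " + e);
            }
        }
        return alive;
    }
}
```

Separating the registry call (`list()`) from the per-component call (`ping()`) helps show which network path is failing: the one to the registry host or the one to an individual component.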



I couldn't find any information on network-related issues with RMI (especially in the case of IPMP). I am really stuck at this point.

Note: No such issue is ever observed in a non-IPMP setup.
  • 1. Re: RMI Registry - Network Related Issues
    EJP Guru
    > I couldn't find any information on network-related issues with RMI (especially in the case of IPMP). I am really stuck at this point.
    RMI doesn't do anything startling with the network. Any other Java TCP client or server would experience the same failures. There's no reason to confine your search to RMI, or even to Java.
    > Note: No such issue is ever observed in a non-IPMP setup.
    Then IPMP is what you should be investigating, not RMI.
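
One way to test the point that any TCP client would see the same failure is to take RMI out of the picture and try a plain socket connection to the registry's host and port. The sketch below is a minimal illustration; the `TcpProbe` class, its method name, and the timeout argument are assumptions, not anything from the thread.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class TcpProbe {
    /**
     * Attempts a plain TCP connection to host:port, waiting at most timeoutMillis.
     * Returns true if the connection was established, false on any I/O failure
     * (connection refused, timeout, unreachable network, and so on).
     */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            System.err.println("connect to " + host + ":" + port + " failed: " + e);
            return false;
        }
    }
}
```

For example, `TcpProbe.canConnect(registryHost, 1099, 5000)` probes the default registry port (1099), where `registryHost` is whatever host the registry runs on. If this also fails while IPMP is disrupted, the problem lies below RMI.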
  • 2. Re: RMI Registry - Network Related Issues
    981567 Newbie
    Hi EJP,

    Thanks for the quick response.

    All the Java components are on different machines (Solaris) and bind with the registry via UDP multicast.

    Questions :

    1) How is it that the Java components keep communicating? They even call remote methods on each other without any failure.

    2) If there is a network interruption between the RMI registry and a remote component, is it possible that the skeletons in the list are no longer bound with the RMI registry?

    3) It's understandable that such exceptions are observed during network interruptions, but they are still observed even after the network is back to normal.

    Just trying to figure out what happens with the registry in case of heavy network interruptions.

    Thanks,
    Harsh
  • 3. Re: RMI Registry - Network Related Issues
    EJP Guru
    > All the Java components are on different machines (Solaris) and bind with the registry via UDP multicast.
    RMI components bind to the RMI Registry via TCP unicast.
    > 1) How is it that the Java components keep communicating? They even call remote methods on each other without any failure.
    I don't know what failures you were expecting so I can't answer that.
    > 2) If there is a network interruption between the RMI registry and a remote component, is it possible that the skeletons in the list are no longer bound with the RMI registry?
    I don't understand the question. The only communication between the RMI Registry and remote objects occurs at bind time or else it consists of DGC calls. There are no skeletons in an RMI Registry, only stubs. The only way a stub can no longer be bound is if someone unbinds it or the Registry is restarted.
    > 3) It's understandable that such exceptions are observed during network interruptions, but they are still observed even after the network is back to normal.
    Doesn't that mean there is still a network problem?
    > Just trying to figure out what happens with the registry in case of heavy network interruptions.
    Nothing special. It's just a TCP server really.
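
The point that a binding lasts until someone unbinds it (or the Registry restarts) can be illustrated with an in-process registry. This is a minimal sketch under stated assumptions: the `Ping` interface, the `PingImpl` class, the name "componentA", and the port argument are all hypothetical. The registry here runs in the same JVM, so `bind`, `list`, and `unbind` are local calls.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.Arrays;
import java.util.List;

final class BindingLifecycle {
    /** Hypothetical remote interface used only for this illustration. */
    public interface Ping extends Remote {
        boolean ping() throws RemoteException;
    }

    static final class PingImpl implements Ping {
        public boolean ping() { return true; }
    }

    /**
     * Starts a registry on the given port, binds one stub, then unbinds it.
     * Returns the bound names before the unbind and after it.
     */
    static List<List<String>> run(int port) throws Exception {
        Registry registry = LocateRegistry.createRegistry(port);
        PingImpl impl = new PingImpl();
        try {
            // exportObject returns the stub; the stub, not the implementation,
            // is what the registry stores under the name.
            Remote stub = UnicastRemoteObject.exportObject(impl, 0);
            registry.bind("componentA", stub);
            List<String> before = Arrays.asList(registry.list());

            // The name stays bound until an explicit unbind (or a registry restart).
            registry.unbind("componentA");
            List<String> after = Arrays.asList(registry.list());
            return Arrays.asList(before, after);
        } finally {
            // Unexport so the RMI runtime does not keep the JVM alive.
            UnicastRemoteObject.unexportObject(impl, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }
}
```

Nothing in this sequence depends on the network staying up between `bind` and `unbind`, which is consistent with the answer above: a network interruption by itself does not remove a binding.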
