Managed servers attempt to connect to the admin server at startup and periodically thereafter. To be successful, the correct admin server listen address, listen port, protocol, and security credentials need to be specified. Connection issues such as this can usually be traced to a problem in one of these areas. Depending on how you are starting the server, these settings may be specified in a startup command file or in the config.xml.
Look at the managed server logs. You should see connections to the admin server being attempted, and those log messages include details on the listen port and address attempted. If they are correct, look at the logs for any unexpected exceptions from those attempts.
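If the logs are large, a small script can narrow them down to the relevant lines. The following is only a rough sketch: the function name, the log file path, and the search strings are my own placeholders, not anything WebLogic defines, and the exact wording of the log messages will depend on your version. It simply does a case-insensitive substring match, so pass in your actual admin server address and port.

```python
def find_matching_lines(lines, needles):
    """Yield (line_number, line) for every line containing any needle.

    Matching is a plain case-insensitive substring test; no particular
    WebLogic log format is assumed.
    """
    lowered = [needle.lower() for needle in needles]
    for number, line in enumerate(lines, start=1):
        text = line.lower()
        if any(needle in text for needle in lowered):
            yield number, line.rstrip("\n")


# Example usage (the path and search strings are hypothetical placeholders):
#
# with open("/path/to/managed_server.log", errors="replace") as log:
#     for number, line in find_matching_lines(log, ["adminhost:7001", "exception"]):
#         print(number, line)
```

Substring matching is deliberately crude; it will over-report rather than miss lines, which is usually what you want for a first pass.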
If your servers are up and healthy but an application deployed to those servers is unexpectedly down, looking into the logs is probably the best way to find the cause. You may see an issue reported on the console if you select the app and try to start it, but it really depends on what the issue is and when it is detected.
Thanks Loren for the response.
The admin server listen address, port, protocol and security credentials are correct. If one was wrong, I would expect to see this issue sooner, and on all nodes.
The config.xml file is the same on all machines.
The server starts one of two ways:
1) Server Reboot - The Node Manager will start the nodes, using the nmStart(server) command.
2) Manual Restart - using the admin server to restart the nodes.
I do know that this issue has come up after a server reboot. For example, I'll check it when I come in in the morning and everything will be "OK", but come the afternoon, one or more nodes will not display "OK", yet will still be Running and serving data.
It could be that the server dropped out of the cluster because of a cluster problem, and the connection between the admin server and the managed server broke. In that state, even a healthy server will not appear in the admin server, or will be shown as failed.
To start with, did you first check whether your server is up and taking requests? You can do this with any of the following:
1. Ping the managed server. This will show whether there is any packet loss.
2. Check the server access log to confirm that it is serving requests at runtime.
3. "telnet" to the ip:port pair, and also check that the process ID is alive.
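If telnet is not installed on the box, the port check in step 3 can be approximated with a few lines of Python. This is just a sketch using the standard socket module; the function name and the host and port in the usage comment are placeholders I made up. Note that it only proves something is accepting TCP connections on that port, not that the managed server is healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (host and port are hypothetical placeholders):
#
# print(port_is_open("managed1.example.com", 7003))
```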
Also check the server logs for cluster errors. If the server is healthy but not shown in the admin console, only restarting the server will re-establish communication between the admin server and the managed server, after which the console will display the correct status. If the server itself is down, restart it.