We are having a problem with NFS on the servers in a cluster. The error is:
Sep 14 09:53:45 rduxsr06 Had: [ID 702911 daemon.notice] VCS ERROR V-16-1-7005 (rduxsr08) NFS:/opt/VRTSvcs/bin/NFS/monitor:???:RPC call to 100003 failed with status 5
This error occurs because the NFS daemon stops responding, as this command shows:
root@rduxsr08 # rpcinfo -T udp rduxsr08 nfs
rpcinfo: RPC: Timed out
program 100003 version 0 is not available
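For anyone who wants to repeat the check on their own nodes, here is a minimal sketch that wraps the same probe in a function and prints a timestamped result. Only the `rpcinfo -T udp <host> nfs` call comes from the output above; the function name `nfs_probe` and the `RPCINFO` override variable are hypothetical, and the sketch assumes rpcinfo exits non-zero when the RPC call fails or times out.

```shell
# Hypothetical helper: probe the NFS RPC service (program 100003) on a host.
# RPCINFO can be overridden for testing; it defaults to the real rpcinfo.
RPCINFO=${RPCINFO:-rpcinfo}

nfs_probe() {
    host=$1
    # Same call as in the post; assumed to exit non-zero on failure/timeout.
    if $RPCINFO -T udp "$host" nfs >/dev/null 2>&1; then
        echo "`date '+%b %d %H:%M:%S'` $host NFS OK"
        return 0
    else
        echo "`date '+%b %d %H:%M:%S'` $host NFS NOT RESPONDING"
        return 1
    fi
}
```

Calling `nfs_probe rduxsr08` from each of the remaining nodes would show whether the daemon is unreachable from everywhere or only from some nodes.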
Let me explain our environment:
We had a cluster with 6 nodes (06, 07, 08, 09, 10, 11) that mount file systems from a storage system and share these file systems with other servers in the environment over NFS.
We had to remove 2 servers from this cluster, and now we have only 4 nodes (06, 08, 09, 10). The file systems from the removed nodes 07 and 11 are now mounted on nodes 06 and 08.
This NFS problem did not happen when we had 6 nodes. Now that we have 4, the problem is occurring.
Each time, the problem was resolved by restarting the NFS daemon on the node where NFS had stopped responding.
We opened a ticket with Sun support, and they asked us to apply some patches because our patch levels were too old. We applied the patches they recommended, and our servers are now at patch version:
However, two days after we applied the patches, the problem came back. The difference is that now, about 10 to 20 minutes after the problem starts, it resolves automatically: the NFS daemon starts responding again by itself.
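To pin down how long each outage really lasts, something like the following sketch could be left running on a node. It repeats the rpcinfo probe and prints a line only when the state changes, so the outage length can be read from the timestamps. The function name `nfs_watch`, its arguments, and the `RPCINFO` override variable are hypothetical, and it assumes rpcinfo exits non-zero when the call fails or times out.

```shell
# Hypothetical sketch: log only NFS state changes (up/down) with a timestamp.
RPCINFO=${RPCINFO:-rpcinfo}

# Usage: nfs_watch <host> <interval-seconds> <rounds>   (rounds=0: run forever)
nfs_watch() {
    host=$1; interval=$2; rounds=$3
    prev=unknown
    i=0
    while [ "$rounds" -eq 0 ] || [ "$i" -lt "$rounds" ]; do
        if $RPCINFO -T udp "$host" nfs >/dev/null 2>&1; then
            cur=up
        else
            cur=down
        fi
        # Print a line only on a transition, e.g. "up -> down".
        if [ "$cur" != "$prev" ]; then
            echo "`date '+%b %d %H:%M:%S'` $host nfs $prev -> $cur"
            prev=$cur
        fi
        i=`expr $i + 1`
        if [ "$interval" -gt 0 ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `nfs_watch rduxsr08 30 0 >> /var/tmp/nfs_watch.log` would poll every 30 seconds; the log path is only an illustration. The timestamps could then be compared with the VCS errors in the system log.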
Has anyone seen this problem, or does anyone know how to resolve it?