Since we deployed the EM-Agent (18.104.22.168) on RedHat 6.4, our system hangs on reboot (init 6) or complete shutdown (init 0).
If we stop the agent manually before rebooting the system, all works fine.
We executed the root.sh script after deploying the Agent, so the scripts gcstartup, unlockgcstartup and lockgcstartup are present in /etc/init.d, and the softlinks in rc2.d, rc3.d and rc5.d exist as well.
The Red Hat system runs in runlevel 3.
Any ideas why the agent seems to block rebooting the system?
/var/log/messages contains only startup information...
We opened a console window to figure out what happens during reboot.
Two unusual results:
1. Unmounting NFS filesystem: unmount.nfs: <our_PATH>: device is busy [FAILED]
Unmounting NFS filesystem (retry): [OK]
2. The last output is "Please stand by while rebooting the system...", and at that point the system hangs.
I tested again with the agent stopped before rebooting: the "device is busy" error doesn't occur and the system reboots as expected.
It looks to me like your system hang at reboot doesn't have anything to do with the EM agent, but is due to an issue with NFS. You are perhaps mounting an NFS share using "hard" and "nointr", which requires a hard reset of the machine if there is a problem between your machine and the NFS server, similar to a hard disk error that cannot be resolved. You might also want to check the order in which shutdown items are processed, to make sure no NFS share is unmounted while it is still being used by some software.
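To check which options the NFS shares are actually mounted with, something like the following could be used. This is only a sketch: the function name nfs_mount_opts is made up for illustration, and it assumes the usual column layout of /proc/mounts (device, mountpoint, type, options). The kernel may report options differently from what is written in /etc/fstab.

```shell
#!/bin/sh
# Hypothetical helper: print mountpoint and options of every NFS mount.
# Reads /proc/mounts by default; another file with the same column
# layout (for example /etc/fstab) can be passed as the first argument.
nfs_mount_opts() {
    # Column 3 is the filesystem type, column 2 the mountpoint,
    # column 4 the comma-separated mount options.
    awk '$3 ~ /^nfs4?$/ { print $2, $4 }' "${1:-/proc/mounts}"
}

# Usage:
#   nfs_mount_opts              # options as reported by the kernel
#   nfs_mount_opts /etc/fstab   # options as configured
```

To see what still holds a share open just before shutdown, "fuser -vm <mountpoint>" run as root lists the processes using that filesystem.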
Thanks for your reply, Dude.
Yes, we are using the "hard" and "nointr" options as recommended in MOS Doc ID 359515.1
(The Agent software is installed on the same mountpoint where the database binaries are installed.)
You are right, the shutdown order influences the behavior. We created softlinks in /etc/rc6.d:
ln -s /etc/init.d/gcstartup /etc/rc6.d/K01gcstartup
ln -s /etc/init.d/unlockgcstartup /etc/rc6.d/K02unlockgcstartup
With these softlinks the reboot works fine.
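Since the hang was also seen on a complete shutdown (init 0), the same links could be created for runlevel 0 as well. The following is a sketch only: the function name add_kill_links is hypothetical, and extending the fix to /etc/rc0.d is an assumption that was not tested in this thread. Kill links are processed in lexical order, so K01 and K02 run early in the shutdown sequence.

```shell
#!/bin/sh
# Hypothetical helper: create the agent kill links for halt (0) and reboot (6).
# The optional first argument is the directory containing init.d and rcN.d
# (defaults to /etc), which makes it possible to try this in a scratch directory.
add_kill_links() {
    root="${1:-/etc}"
    for level in 0 6; do
        # Same link names as the ones created manually for rc6.d above.
        ln -sf "$root/init.d/gcstartup" "$root/rc$level.d/K01gcstartup"
        ln -sf "$root/init.d/unlockgcstartup" "$root/rc$level.d/K02unlockgcstartup"
    done
}

# Usage (as root):
#   add_kill_links
```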
Well, the root.sh script creates links in /etc/rc3.d for Agent-Start. But no links in /etc/rc0.d and /etc/rc6.d for stopping the agent are created.
During reboot /etc/rc6.d/S00killall tries to stop the agent, but it seems to be too late...
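To see at a glance in which runlevels root.sh created links for the agent scripts, a small listing like this could help. It is a sketch: list_rc_links is a made-up name, and it assumes the usual S/K plus two-digit naming of the links under /etc/rcN.d.

```shell
#!/bin/sh
# Hypothetical helper: list all start (S) and kill (K) links for one init script.
# $1 = script name (e.g. gcstartup), $2 = base directory (defaults to /etc).
list_rc_links() {
    name="$1"
    root="${2:-/etc}"
    for link in "$root"/rc[0-6].d/[SK][0-9][0-9]"$name"; do
        # When nothing matches, the shell leaves the pattern unexpanded; skip it.
        [ -L "$link" ] || [ -e "$link" ] || continue
        # Print the path relative to the base directory, e.g. rc6.d/K01gcstartup.
        echo "${link#"$root"/}"
    done
}

# Usage:
#   list_rc_links gcstartup
#   list_rc_links unlockgcstartup
```

Runlevels that do not appear in the output have no link for that script.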
I did a test with OL 5.9 instead of RHEL 6.4.
The notice "Unmounting NFS filesystem: unmount.nfs: <our_PATH>: device is busy [FAILED]" is displayed too, but the system reboots as expected. (Same nfs-options...)