4 Replies Latest reply on Feb 10, 2014 11:08 AM by Pascal Kreyer-Oracle

    How to find root cause for server hang

    Sumit_the_Admin

      Dec 31 21:05:51 kyorns052 vxio: [ID 339115 kern.notice] NOTICE: VxVM vxio V-5-3-1437 Volume syblog of sybasedg: Disabling detach map in dco

      Dec 31 21:05:51 kyorns052 vxio: [ID 339115 kern.notice] NOTICE: VxVM vxio V-5-3-1437 Volume sybmaster of sybasedg: Disabling detach map in dco

      Jan 21 14:53:48 kyorns052 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.10 Version Generic_147441-27 64-bit

       

      Also, the VCS engine log shows no messages between 1 Jan and 21 Jan, when the VCS engine (had) started again after the reboot:

       

      2014/01/01 16:11:56 VCS INFO V-16-1-53504 VCS Engine Alive message!!

      2014/01/21 14:56:12 VCS INFO V-16-1-10196 Cluster logger started

      2014/01/21 14:56:12 VCS NOTICE V-16-1-11022 VCS engine (had) started

      2014/01/21 14:56:12 VCS NOTICE V-16-1-11050 VCS engine version=5.1

      2014/01/21 14:55:33 VCS WARNING V-16-1-11141 LLT heartbeat link status changed. Previous status = oce1 DOWN oce8 DOWN oce0 DOWN; Current status = oce1 UP oce8 UP oce0 UP.
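
The silence in the engine log can also be located mechanically instead of by eye. Below is a minimal sketch, assuming every engine log line of interest starts with a `YYYY/MM/DD HH:MM:SS` timestamp as in the excerpt above; the function name `largest_gap` and the `sample` list are illustrative only and are not part of any VCS tooling.

```python
import re
from datetime import datetime

# Assumption: engine log lines begin with "YYYY/MM/DD HH:MM:SS".
TS_RE = re.compile(r"^\s*(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\b")


def largest_gap(lines):
    """Return (gap, before, after) for the widest silence between
    consecutive timestamped lines, or None if fewer than two are found."""
    stamps = []
    for line in lines:
        match = TS_RE.match(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S"))
    if len(stamps) < 2:
        return None
    # Compare each timestamp with the next one in file order.
    return max((later - earlier, earlier, later)
               for earlier, later in zip(stamps, stamps[1:]))


# Lines copied from the engine log excerpt in this post.
sample = [
    "2014/01/01 16:11:56 VCS INFO V-16-1-53504 VCS Engine Alive message!!",
    "2014/01/21 14:56:12 VCS INFO V-16-1-10196 Cluster logger started",
    "2014/01/21 14:56:12 VCS NOTICE V-16-1-11022 VCS engine (had) started",
    "2014/01/21 14:56:12 VCS NOTICE V-16-1-11050 VCS engine version=5.1",
    "2014/01/21 14:55:33 VCS WARNING V-16-1-11141 LLT heartbeat link status changed.",
]

result = largest_gap(sample)
if result is not None:
    gap, before, after = result
    print(f"Longest silence: {gap} (from {before} to {after})")
```

The start of the reported gap is the last time the engine was known to be alive, which narrows the window in which the hang or power loss could have occurred. The same idea would apply to the messages file, although syslog timestamps carry no year and would need separate handling.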

       

      From all these logs, I suspect that the server was down or hung, which would explain why there was no response on any of the three heartbeat (LLT) links.

      It is impossible to tell what made kyorns052 go down or hang, as nothing was logged. It looks like an abrupt stop, as happens in a power outage.

       

      How can we find the root cause of the server hang when there are no messages in the messages file?

      This is an HP blade server running Solaris 10.