1 Reply Latest reply: Apr 8, 2013 1:35 PM by Fgutierrez-Oracle

    Solaris fmadm not working properly, but logs are being created in /var/fm/fmd.

    960595
      Hi Guys,

      I have a problem with the Solaris fault manager (fmd). Error logs are being created in /var/fm/fmd, but no new events show up in either fmadm faulty or fmdump. I would be glad if someone could help me resolve this issue.

      Thank you again.
      Abilash

      Logs :-

      root@CND # date
      Sun Apr 7 10:12:07 IST 2013
      root@CNDAUNOCAPZP64 # pwd
      /var/fm/fmd
      root@CND # ls -l
      total 2745076
      drwx------ 4 root sys 512 Apr 7 09:48 ckpt
      -rw-r--r-- 1 root root 25957242 Apr 7 10:12 errlog
      -rw-r--r-- 1 root root 0 May 3 2012 errlog-
      -rw-r--r-- 1 root root 379354310 Apr 7 03:10 errlog.0
      -rw-r--r-- 1 root root 393676650 Apr 6 03:10 errlog.1
      -rw-r--r-- 1 root root 404880730 Apr 5 03:10 errlog.2
      -rw-r--r-- 1 root root 198352470 Apr 4 03:10 errlog.3
      -rw-r--r-- 1 root root 0 Apr 7 01:05 errlog.4
      -rw-r--r-- 1 root root 0 Apr 7 01:05 errlog.5
      -rw-r--r-- 1 root root 0 Apr 7 01:06 errlog.6
      -rw-r--r-- 1 root root 0 Sep 7 2012 errlog.7
      -rw-r--r-- 1 root root 0 Sep 7 2012 errlog.8
      -rw-r--r-- 1 root root 0 Sep 7 2012 errlog.9
      -rw-r--r-- 1 root root 1668468 Apr 7 09:54 fltlog
      drwx------ 2 root sys 808448 Apr 7 09:54 rsrc
      drwx------ 2 root sys 512 Jan 29 2010 xprt
      root@CND #
      root@CND # fmadm faulty
      --------------- ------------------------------------ -------------- ---------
      TIME EVENT-ID MSG-ID SEVERITY
      --------------- ------------------------------------ -------------- ---------
      May 11 2011 e12d1425-49f2-c077-c30f-9dcbfb29a155 FMD-8000-2K Minor

      Host : CND
      Platform : SUNW,Sun-Fire Chassis_id :
      Product_sn :

      Fault class : defect.sunos.fmd.module
      Affects : fmd:///module/cpumem-diagnosis
      faulted and taken out of service
      FRU : None
      faulty

      Description : A Solaris Fault Manager component has experienced an error that
      required the module to be disabled. Refer to
      http://sun.com/msg/FMD-8000-2K for more information.

      Response : The module has been disabled. Events destined for the module
      will be saved for manual diagnosis.

      Impact : Automated diagnosis and response for subsequent events associated
      with this module will not occur.

      Action : Use fmdump -v -u <EVENT-ID> to locate the module. Use fmadm
      reset <module> to reset the module.

      root@CND # fmdump
      TIME UUID SUNW-MSG-ID
      root@CND # fmdump -v
      TIME UUID SUNW-MSG-ID

      root@CND# tail -50 /var/adm/messages
      Apr 7 07:59:36 CND last message repeated 1 time
      Apr 7 08:02:04 CND sgsbbc: [ID 428960 kern.notice] NOTICE: Unable to send ECC event message to System Controller
      Apr 7 08:09:55 CND last message repeated 1 time
      Apr 7 08:14:46 CND sgsbbc: [ID 428960 kern.notice] NOTICE: Unable to send ECC event message to System Controller
        • 1. Re: Solaris fmadm not working properly, but logs are being created in /var/fm/fmd.
          Fgutierrez-Oracle
          Good day,


          As a first recommendation,

          * Ensure the latest firmware/patches for FMA are installed on the system.

          * Then, reset and restart the module.

          The error is FMD-8000-2K, "Solaris Fault Manager component had disabling error" (ID 1021148.1):

          "
          This can indicate a defect in the module or the Fault Manager.

          The recommended actions for the system administrator are as follows:

          Identify the affected module ( # fmdump -v -u <event-id> )

          Attempt to reset and restart the module. ( # fmadm reset <module> )

          If the reset fails, try removing the module checkpoint data file before attempting to restart:

          # cd /var/fm/fmd/ckpt/
          # rm <module>/*
          "
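
Applied to the output posted above, the quoted steps could look roughly like the sketch below. It is only a sketch: the helper names faulted_modules and print_reset_plan are made up for illustration, the module name is read from the "Affects" line of fmadm faulty, and the commands are printed for review rather than executed.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of FMA.

# Read `fmadm faulty` output on stdin and print each faulted fmd module once.
# Relies on the "Affects : fmd:///module/<name>" line shown in the post.
faulted_modules() {
    sed -n 's|.*fmd:///module/\([A-Za-z0-9_.-]*\).*|\1|p' | sort -u
}

# Print (do not run) the commands from the quoted note for one module.
# $1 = event ID from `fmadm faulty`, $2 = module name.
print_reset_plan() {
    event_id=$1
    module=$2
    echo "fmdump -v -u $event_id"
    echo "fmadm reset $module"
    echo "# only if the reset fails:"
    echo "rm /var/fm/fmd/ckpt/$module/*"
}
```

For the output in the original post, `fmadm faulty | faulted_modules` should print cpumem-diagnosis, and `print_reset_plan e12d1425-49f2-c077-c30f-9dcbfb29a155 cpumem-diagnosis` lists the commands to check before running them as root.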


          * About the message "NOTICE: Unable to send ECC event message to System Controller":

          it may be caused by a flood of errors (for example, from a memory DIMM).
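
One way to check for such a flood is to count how often the notice appears per hour in /var/adm/messages, and to summarize the error log with fmdump -e, which reads the error log rather than the fault log that plain fmdump shows. A rough sketch with hypothetical helper names, assuming the syslog line layout shown in the post and that the event class is the last column of the fmdump -e output:

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Count "Unable to send ECC event message" notices per day and hour.
# Reads syslog-style lines ("Apr 7 08:02:04 host ...") on stdin.
ecc_notices_per_hour() {
    grep 'Unable to send ECC event message' |
        awk '{ split($3, t, ":"); print $1, $2, t[1] ":00" }' |
        sort | uniq -c
}

# Count events per class from `fmdump -e` output on stdin.
# Assumes a one-line header and the class name in the last column.
ereport_class_counts() {
    awk 'NR > 1 && NF > 0 { print $NF }' | sort | uniq -c | sort -rn
}
```

Possible usage: `ecc_notices_per_hour < /var/adm/messages` and `fmdump -e | ereport_class_counts`. Lines such as "last message repeated 1 time" are not counted, so the real rate may be higher than the totals suggest.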