5 Replies Latest reply: Nov 12, 2012 11:34 AM by rukbat

    Netra 440 Maintenance light on

    971153
      Hi All,

      I have a Netra 440 server which has been showing an amber maintenance light for some time now. It first appeared when I upgraded the memory modules, but the new memory is working fine. All other components seem OK and the machine is up and running. I am completely baffled as to what the problem could be. Prtdiag shows no problems:


      5ksh# ./prtdiag -v
      System Configuration: Sun Microsystems sun4u Netra 440
      System clock frequency: 177 MHZ
      Memory size: 16GB

      ==================================== CPUs ====================================
      E$ CPU CPU
      CPU Freq Size Implementation Mask Status Location
      --- -------- ---------- --------------------- ----- ------ --------
      0 1593 MHz 1MB SUNW,UltraSPARC-IIIi 3.4 on-line -
      1 1593 MHz 1MB SUNW,UltraSPARC-IIIi 3.4 on-line -
      2 1593 MHz 1MB SUNW,UltraSPARC-IIIi 3.4 on-line -
      3 1593 MHz 1MB SUNW,UltraSPARC-IIIi 3.4 on-line -

      ================================= IO Devices =================================
      Bus Freq Slot + Name +
      Type MHz Status Path Model
      ------ ---- ---------- ---------------------------- --------------------
      pci 66 PCI5 SUNW,XVR-100 (display) SUNW,375-3290
      okay /pci@1c,600000/SUNW,XVR-100@1

      pci 66 MB pci108e,abba (network) SUNW,pci-ce
      okay /pci@1c,600000/network@2

      pci 33 MB isa/su (serial)
      okay /pci@1e,600000/isa@7/serial@0,3f8

      pci 33 MB isa/su (serial)
      okay /pci@1e,600000/isa@7/serial@0,2e8

      pci 33 MB isa/rmc-comm-rmc_comm (seria+
      okay /pci@1e,600000/isa@7/rmc-comm@0,3e8

      pci 33 MB pci10b9,5229 (ide)
      okay /pci@1e,600000/ide@d

      pci 66 MB pci108e,abba (network) SUNW,pci-ce
      okay /pci@1f,700000/network@1

      pci 66 MB scsi-pci1000,30 (scsi-2) LSI,1030
      okay /pci@1f,700000/scsi@2

      pci 66 MB scsi-pci1000,30 (scsi-2) LSI,1030
      okay /pci@1f,700000/scsi@2,1


      ============================ Memory Configuration ============================
      Segment Table:
      -----------------------------------------------------------------------
      Base Address Size Interleave Factor Contains
      -----------------------------------------------------------------------
      0x0 4GB 16 BankIDs 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
      0x1000000000 4GB 16 BankIDs 16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
      0x2000000000 4GB 16 BankIDs 32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47
      0x3000000000 4GB 16 BankIDs 48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63

      Bank Table:
      -----------------------------------------------------------
      Physical Location
      ID ControllerID GroupID Size Interleave Way
      -----------------------------------------------------------
      0 0 0 256MB 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
      1 0 0 256MB
      2 0 1 256MB
      3 0 1 256MB
      4 0 0 256MB
      5 0 0 256MB
      6 0 1 256MB
      7 0 1 256MB
      8 0 1 256MB
      9 0 1 256MB
      10 0 0 256MB
      11 0 0 256MB
      12 0 1 256MB
      13 0 1 256MB
      14 0 0 256MB
      15 0 0 256MB
      16 1 0 256MB 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
      17 1 0 256MB
      18 1 1 256MB
      19 1 1 256MB
      20 1 0 256MB
      21 1 0 256MB
      22 1 1 256MB
      23 1 1 256MB
      24 1 1 256MB
      25 1 1 256MB
      26 1 0 256MB
      27 1 0 256MB
      28 1 1 256MB
      29 1 1 256MB
      30 1 0 256MB
      31 1 0 256MB
      32 2 0 256MB 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
      33 2 0 256MB
      34 2 1 256MB
      35 2 1 256MB
      36 2 0 256MB
      37 2 0 256MB
      38 2 1 256MB
      39 2 1 256MB
      40 2 1 256MB
      41 2 1 256MB
      42 2 0 256MB
      43 2 0 256MB
      44 2 1 256MB
      45 2 1 256MB
      46 2 0 256MB
      47 2 0 256MB
      48 3 0 256MB 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
      49 3 0 256MB
      50 3 1 256MB
      51 3 1 256MB
      52 3 0 256MB
      53 3 0 256MB
      54 3 1 256MB
      55 3 1 256MB
      56 3 1 256MB
      57 3 1 256MB
      58 3 0 256MB
      59 3 0 256MB
      60 3 1 256MB
      61 3 1 256MB
      62 3 0 256MB
      63 3 0 256MB

      Memory Module Groups:
      --------------------------------------------------
      ControllerID GroupID Labels Status
      --------------------------------------------------
      0 0 C0/P0/B0/D0
      0 0 C0/P0/B0/D1
      0 1 C0/P0/B1/D0
      0 1 C0/P0/B1/D1
      1 0 C1/P0/B0/D0
      1 0 C1/P0/B0/D1
      1 1 C1/P0/B1/D0
      1 1 C1/P0/B1/D1
      2 0 C2/P0/B0/D0
      2 0 C2/P0/B0/D1
      2 1 C2/P0/B1/D0
      2 1 C2/P0/B1/D1
      3 0 C3/P0/B0/D0
      3 0 C3/P0/B0/D1
      3 1 C3/P0/B1/D0
      3 1 C3/P0/B1/D1

      =============================== usb Devices ===============================

      Name Port#
      ------------ -----
      hub 2

      =============================== hub#2 Devices ===============================

      Name Port#
      ------------ -----
      keyboard 4
      mouse 1

      ============================ Environmental Status ============================
      Fan Status:
      -------------------------------------------
      Location Sensor Status
      -------------------------------------------
      FT0/F0 TACH okay
      FT1/F0 TACH okay
      FT2/F0 TACH okay
      FT3/F0 F0 okay
      PSU0 FF_FAN okay
      PSU1 FF_FAN okay
      PSU2 FF_FAN okay
      PSU3 FF_FAN okay

      Temperature sensors:
      -----------------------------------------
      Location Sensor Status
      -----------------------------------------
      C0/P0 T_CORE okay
      C1/P0 T_CORE okay
      C2/P0 T_CORE okay
      C3/P0 T_CORE okay
      PSU0 FF_OT okay
      PSU1 FF_OT okay
      PSU2 FF_OT okay
      PSU3 FF_OT okay
      ------------------------------------
      Current sensors:
      ----------------------------------------
      Location Sensor Status
      ----------------------------------------
      MB FF_SCSIA okay
      MB FF_SCSIB okay
      MB FF_POK okay
      C0/P0 FF_POK okay
      C1/P0 FF_POK okay
      C2/P0 FF_POK okay
      C3/P0 FF_POK okay
      PSU0 FF_OC okay
      PSU1 FF_OC okay
      PSU2 FF_OC okay
      PSU3 FF_OC okay
      ------------------------------------
      Voltage sensors:
      -----------------------------------
      Location Sensor Status
      -----------------------------------
      MB V_+1V5 okay
      MB V_VCCTM okay
      MB V_NET0_1V2D okay
      MB V_NET1_1V2D okay
      MB V_NET0_1V2A okay
      MB V_NET1_1V2A okay
      MB V_+3V3 okay
      MB V_+3V3STBY okay
      MB/BAT V_BAT okay
      MB V_SCSI_CORE okay
      MB V_+5V okay
      MB V_+12V okay
      MB V_-12V okay
      PSU0 P_PWR okay
      PSU0 FF_POK okay
      PSU0 FF_UV okay
      PSU0 FF_OV okay
      PSU1 P_PWR okay
      PSU1 FF_POK okay
      PSU1 FF_UV okay
      PSU1 FF_OV okay
      PSU2 P_PWR okay
      PSU2 FF_POK okay
      PSU2 FF_UV okay
      PSU2 FF_OV okay
      PSU3 P_PWR okay
      PSU3 FF_POK okay
      PSU3 FF_UV okay
      PSU3 FF_OV okay
      -----------------------------------------
      Keyswitch:
      -----------------------------------------
      Location Keyswitch State
      -----------------------------------------
      SYS SYSCTRL NORMAL
      --------------------------------------------------
      Led State:
      --------------------------------------------------------------
      Location Led State Color
      --------------------------------------------------------------
      SYS ACT on green
      SYS SERVICE on amber
      SYS LOCATE off white
      PSU0 POK on green
      PSU0 SERVICE off amber
      PSU0 OK2RM off blue
      PSU1 POK on green
      PSU1 SERVICE off amber
      PSU1 OK2RM off blue
      HDD0 SERVICE off amber
      HDD0 OK2RM off blue
      HDD1 SERVICE off amber
      HDD1 OK2RM off blue
      HDD2 SERVICE off amber
      HDD2 OK2RM off blue
      HDD3 SERVICE off amber
      HDD3 OK2RM off blue
      PSU2 POK on green
      PSU2 SERVICE off amber
      PSU2 OK2RM off blue
      PSU3 POK on green
      PSU3 SERVICE off amber
      PSU3 OK2RM off blue
      MB CRITICAL off red
      MB MAJOR off red
      MB MINOR off amber
      MB USER off amber
      FT0/F0 ACT on green
      FT0/F0 FAULT off amber
      FT1/F0 ACT on green
      FT1/F0 FAULT off amber
      FT2/F0 ACT on green
      FT2/F0 FAULT off amber

      =========================== FRU Operational Status ===========================
      ---------------------------------
      Fru Operational Status:
      ---------------------------------
      Location Status
      ---------------------------------
      SC okay
      HDD0 present
      HDD1 present
      HDD2 present
      HDD3 present
      PSU0 okay
      PSU1 okay
      PSU2 okay
      PSU3 okay

      ================================ HW Revisions ================================
      ASIC Revisions:
      -------------------------------------------------------------------
      Path Device Status Revision
      -------------------------------------------------------------------
      /pci@1c,600000 pci108e,a801 okay 4
      /pci@1d,700000 pci108e,a801 okay 4
      /pci@1e,600000 pci108e,a801 okay 4
      /pci@1f,700000 pci108e,a801 okay 4

      System PROM revisions:
      ----------------------
      OBP 4.30.4.a 2010/01/06 14:45 Sun Fire V440,Netra 440
      OBDIAG 4.30.4 2010/01/06 15:01

      Last week I ran SunVTS for 8 hours to stress test the machine, hoping it would throw up some errors, but all it shows is one bad disk read:



      SunVTS Summary Test Report


      Latest Test Session Start Time: 10/24/12 16:12:20
      Latest Test Session End Time: 10/25/12 00:12:24

      Hostname: hmi1.com

      Logical Test Status

      Disk: PASS
      Environment: PASS
      Ioports: PASS
      Media: NO RESULT
      Memory: PASS
      Network: PASS
      Processor: PASS
      Graphics: PASS

      Faults Detected

      No Faults or Suspect Hardware Detected by FMA

      SunVTS Messages

      10/24/12 17:31:23 hmi1 SunVTS7.0ps11: VTSID 6005 Disk.diskmediatest.ERROR rdsk/c1t1d0: "I/O (read) request could not be completed successfully on block : 65024987, Error Message : I/O error"

      Syslog Messages

      Oct 24 09:49:24 hmi1 dtsession[13209]: [ID 293258 user.error] libsldap: Status: 49 Mesg: openConnection: simple bind failed - Invalid credentials
      Oct 24 11:57:40 hmi1 dtsession[13209]: [ID 293258 user.error] libsldap: Status: 49 Mesg: openConnection: simple bind failed - Invalid credentials
      Oct 24 12:19:53 hmi1 dtsession[1248]: [ID 293258 user.error] libsldap: Status: 49 Mesg: openConnection: simple bind failed - Invalid credentials
      Oct 24 12:54:13 hmi1 dtsession[25206]: [ID 293258 user.error] libsldap: Status: 49 Mesg: openConnection: simple bind failed - Invalid credentials
      End SunVTS Test Report


      Could this bad read be responsible for the warning light, and if so should I replace the disk? If not, any other ideas as to what may be wrong? Any help would be gratefully received as I have run out of ideas.
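
      For anyone scanning a long prtdiag dump like the one above, here is a minimal Python sketch that picks out the LEDs reported as on and amber or red in the "Led State" table. It is only an illustration: the function name lit_fault_leds is made up, the sample rows are copied from the output above, and it assumes each data row has exactly four whitespace-separated fields.

```python
# Sample rows copied from the "Led State" table in the prtdiag -v output above.
LED_ROWS = """\
Location Led State Color
SYS ACT on green
SYS SERVICE on amber
SYS LOCATE off white
PSU0 SERVICE off amber
HDD0 SERVICE off amber
MB CRITICAL off red
MB MINOR off amber
FT0/F0 FAULT off amber
"""


def lit_fault_leds(text):
    """Return (location, led) pairs for LEDs that are on and amber or red.

    Hypothetical helper: assumes four whitespace-separated fields per row.
    """
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip rulers, blank lines and anything unexpected
        location, led, state, color = fields
        if state == "on" and color in ("amber", "red"):
            hits.append((location, led))
    return hits


print(lit_fault_leds(LED_ROWS))
```

      On the rows above this reports only the SYS SERVICE indicator, which matches the light on the front panel; none of the per-component fault LEDs are lit.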

      Thanks in advance
      Doug
        • 1. Re: Netra 440 Maintenance light on
          Grantinho-Oracle
          Doug,

          A disk issue would hardly ever cause the amber LED to light. Could you please paste the 'showlogs -v' and 'showenvironment' outputs? That would be very helpful.
          • 2. Re: Netra 440 Maintenance light on
            971153
            Hi Grantinho,

            Here's the showlogs -v output:

            sc> showlogs -v
            Persistent event log
            --------------------

            Log entries since OCT 24 22:08:43
            ----------------------------------
            OCT 24 22:08:43 hmi1: 00060022: "CRITICAL ALARM is cleared"
            OCT 24 22:08:43 hmi1: 00060022: "MAJOR ALARM is cleared"
            OCT 24 22:08:43 hmi1: 00060022: "MINOR ALARM is cleared"
            OCT 24 22:08:43 hmi1: 00060022: "USER ALARM is cleared"
            OCT 24 22:08:46 hmi1: 00060022: "MAJOR ALARM is cleared"
            OCT 24 22:08:46 hmi1: 00060022: "MINOR ALARM is cleared"
            OCT 24 22:08:46 hmi1: 00060022: "USER ALARM is cleared"
            OCT 24 22:08:47 hmi1: 00060021: "CRITICAL ALARM is set"
            OCT 24 22:08:48 hmi1: 00060021: "MAJOR ALARM is set"
            OCT 24 22:08:49 hmi1: 00060021: "MINOR ALARM is set"
            OCT 24 22:08:50 hmi1: 00060021: "USER ALARM is set"
            OCT 24 22:08:51 hmi1: 00060022: "CRITICAL ALARM is cleared"
            OCT 24 22:08:52 hmi1: 00060022: "MAJOR ALARM is cleared"
            OCT 24 22:08:54 hmi1: 00060022: "MINOR ALARM is cleared"
            OCT 24 22:08:55 hmi1: 00060022: "USER ALARM is cleared"
            <-- snip -->
            OCT 31 12:17:17 hmi1: 00060001: "SC Login Failure for user admin."
            OCT 31 12:27:03 hmi1: 00060000: "SC Login: User admin Logged on."
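
            One way to check whether any alarm was left set is to replay the "is set" / "is cleared" entries and keep the last state seen for each alarm. The Python sketch below does that under a few assumptions: the function names are hypothetical, the sample lines are copied from the log excerpt above, and it would need the complete log rather than a snipped one to be conclusive.

```python
import re

# Sample lines copied from the showlogs -v excerpt above.
LOG = """\
OCT 24 22:08:47 hmi1: 00060021: "CRITICAL ALARM is set"
OCT 24 22:08:48 hmi1: 00060021: "MAJOR ALARM is set"
OCT 24 22:08:49 hmi1: 00060021: "MINOR ALARM is set"
OCT 24 22:08:50 hmi1: 00060021: "USER ALARM is set"
OCT 24 22:08:51 hmi1: 00060022: "CRITICAL ALARM is cleared"
OCT 24 22:08:52 hmi1: 00060022: "MAJOR ALARM is cleared"
OCT 24 22:08:54 hmi1: 00060022: "MINOR ALARM is cleared"
OCT 24 22:08:55 hmi1: 00060022: "USER ALARM is cleared"
"""

EVENT = re.compile(r'"(CRITICAL|MAJOR|MINOR|USER) ALARM is (set|cleared)"')


def final_alarm_states(text):
    """Replay the log in order and return the last state seen per alarm."""
    states = {}
    for line in text.splitlines():
        match = EVENT.search(line)
        if match:
            states[match.group(1)] = match.group(2)
    return states


def still_set(text):
    """Return the sorted names of alarms whose last logged state is 'set'."""
    return sorted(name for name, state in final_alarm_states(text).items()
                  if state == "set")


print(final_alarm_states(LOG))
print(still_set(LOG))
```

            For the excerpt above every alarm ends in the cleared state, so the second call returns an empty list.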

            There are multiple repeats of the alarms being set and cleared. These are all from the SunVTS stress test I performed, so I don't think they're relevant here. Here's the showenvironment output:

            sc> showenvironment


            =============== Environmental Status ===============


            --------------------------------------------------------------------------------

            System Temperatures (Temperatures in Celsius):
            --------------------------------------------------------------------------------

            Sensor Status Temp LowHard LowSoft LowWarn HighWarn HighSoft HighHard

            --------------------------------------------------------------------------------

            C0.P0.T_CORE OK 89 -20 -10 0 120 123 127
            C1.P0.T_CORE OK 92 -20 -10 0 120 123 127
            C2.P0.T_CORE OK 90 -20 -10 0 120 123 127
            C3.P0.T_CORE OK 89 -20 -10 0 120 123 127

            --------------------------------------
            Front Status Panel:
            --------------------------------------
            Keyswitch position: NORMAL

            ---------------------------------------------------
            System Indicator Status:
            ---------------------------------------------------
            SYS.LOCATE SYS.SERVICE SYS.ACT
            --------------------------------------------------------
            OFF ON ON

            --------------------------------------------
            System Disks:
            --------------------------------------------
            Disk Status Service OK2RM
            --------------------------------------------
            HDD0 OK OFF OFF
            HDD1 OK OFF OFF
            HDD2 OK OFF OFF
            HDD3 OK OFF OFF

            ----------------------------------------------------------
            Fans (Speeds Revolution Per Minute):
            ----------------------------------------------------------
            Sensor Status Speed Warn Low
            ----------------------------------------------------------
            FT0.F0.TACH OK 4383 2400 750
            FT1.F0.TACH OK 4560 2400 750
            FT2.F0.TACH OK 4440 2400 750
            FT3.F0 OK -- -- --

            --------------------------------------------------------------------------------

            Voltage sensors (in Volts):
            --------------------------------------------------------------------------------

            Sensor Status Voltage LowSoft LowWarn HighWarn HighSoft
            --------------------------------------------------------------------------------

            MB.V_+1V5 OK 1.48 1.20 1.27 1.72 1.80
            MB.V_VCCTM OK 2.53 2.00 2.12 2.87 3.00
            MB.V_NET0_1V2D OK 1.26 0.96 1.02 1.38 1.44
            MB.V_NET1_1V2D OK 1.25 0.96 1.02 1.38 1.44
            MB.V_NET0_1V2A OK 1.25 0.96 1.02 1.38 1.44
            MB.V_NET1_1V2A OK 1.25 0.96 1.02 1.38 1.44
            MB.V_+3V3 OK 3.34 2.64 2.80 3.79 3.96
            MB.V_+3V3STBY OK 3.31 2.64 2.80 3.79 3.96
            MB.BAT.V_BAT OK 3.03 -- 2.25 -- --
            MB.V_SCSI_CORE OK 1.79 1.44 1.53 2.07 2.16
            MB.V_+5V OK 4.99 4.00 4.25 5.75 6.00
            MB.V_+12V OK 12.00 9.60 10.20 13.80 14.40
            MB.V_-12V OK -12.04 -14.40 -13.80 -10.20 -9.60

            --------------------------------------------
            Power Supply Indicators:
            --------------------------------------------
            Supply Active Service OK-to-Remove
            --------------------------------------------
            PS0 ON OFF OFF
            PS1 ON OFF OFF
            PS2 ON OFF OFF
            PS3 ON OFF OFF

            ------------------------------------------------------------------------------
            Power Supplies:
            ------------------------------------------------------------------------------
            Supply Status Underspeed Overtemp Overvolt Undervolt Overcurrent
            ------------------------------------------------------------------------------
            PS0 OK OFF OFF OFF OFF OFF
            PS1 OK OFF OFF OFF OFF OFF
            PS2 OK OFF OFF OFF OFF OFF
            PS3 OK OFF OFF OFF OFF OFF

            ----------------------
            Current sensors:
            ----------------------
            Sensor Status
            ----------------------
            MB.FF_SCSIA OK
            MB.FF_SCSIB OK
            MB.FF_POK OK
            C0.P0.FF_POK OK
            C1.P0.FF_POK OK
            C2.P0.FF_POK OK
            C3.P0.FF_POK OK

            --------------------------------------------
            System Alarms:
            --------------------------------------------
            Alarm Relay LED
            --------------------------------------------
            ALARM.CRITICAL OFF OFF
            ALARM.MAJOR OFF OFF
            ALARM.MINOR OFF OFF
            ALARM.USER OFF OFF
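
            As a cross-check of the temperature table in the showenvironment output above, the Python sketch below computes how far each core temperature sits from its nearest warning threshold. It is an illustration only: warn_headroom is a hypothetical name, the rows are copied from the output above, and the column order (Sensor, Status, Temp, LowHard, LowSoft, LowWarn, HighWarn, HighSoft, HighHard) is taken from the table header.

```python
# Rows copied from the "System Temperatures" table above (degrees Celsius).
TEMP_ROWS = """\
Sensor Status Temp LowHard LowSoft LowWarn HighWarn HighSoft HighHard
C0.P0.T_CORE OK 89 -20 -10 0 120 123 127
C1.P0.T_CORE OK 92 -20 -10 0 120 123 127
C2.P0.T_CORE OK 90 -20 -10 0 120 123 127
C3.P0.T_CORE OK 89 -20 -10 0 120 123 127
"""


def warn_headroom(text):
    """Map each sensor to its distance from the nearest warning threshold.

    Hypothetical helper: a negative value would mean the reading is
    already past LowWarn or HighWarn.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 9:
            continue  # not a temperature row
        try:
            temp = int(fields[2])
            low_warn = int(fields[5])
            high_warn = int(fields[6])
        except ValueError:
            continue  # header row with column names
        result[fields[0]] = min(temp - low_warn, high_warn - temp)
    return result


for sensor, margin in warn_headroom(TEMP_ROWS).items():
    print(sensor, margin)
```

            With the readings above, the hottest core (C1.P0.T_CORE at 92) is still 28 degrees below HighWarn, which is consistent with every sensor reporting OK.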

            And just for completeness, here's the firmware revision info:

            sc> showsc version -v
            Advanced Lights Out Manager v1.5.4
            SC Firmware version: 1.5.4
            SC Bootmon version: 1.5.4

            SC Bootmon Build Release: 08
            SC bootmon checksum: F08ACA76
            SC Bootmon built Oct 17 2005, 13:23:12

            SC Build Release: 08
            SC firmware checksum: 2E078305

            SC firmware built Oct 17 2005, 13:22:53
            SC firmware flashupdate FEB 06 2106, 06:28:15

            SC System Memory Size: 8 MB

            SC NVRAM Version = b

            SC hardware type: 1

            sc>

            Thanks for your help.
            Doug
            • 3. Re: Netra 440 Maintenance light on
              972367
              Hello,

              Could you please post the output of the commands below? It will help us identify the cause of the issue.

              #cat /var/adm/messages
              #iostat -En
              #echo | format
              #dmesg

              Sidh

              Edited by: 969364 on Nov 4, 2012 2:34 AM
              • 4. Re: Netra 440 Maintenance light on
                971153
                Hi Sidh,

                Thanks for your help, but I have now resolved the issue. I am satisfied there are no hardware issues with the machine, and resetting the ALOM via resetsc cleared the alarm.

                Doug
                • 5. Re: Netra 440 Maintenance light on
                  You posted this same question to at least three other forums:
                  http://www.unix.com/solaris/205147-netra-440-maintenance-light.html
                  http://www.tek-tips.com/viewthread.cfm?qid=1696916
                  http://www.linuxine.com/story/netra-440-maintenance-light
                  ... and failed to mention that fact, anywhere.

                  That's poor forum etiquette.
                  Since no one at any of these forums is paid to respond, why should anyone spend any effort giving advice that you may have already received elsewhere? That would be wasted effort.

                  Be a bit more considerate in the future.