This discussion is archived
5 Replies. Latest reply: Nov 12, 2012 9:34 AM by rukbat

Netra 440 Maintenance light on

971153 Newbie
Hi All,

I have a Netra 440 server which has been showing an amber maintenance light for some time now. It first appeared when I upgraded the memory modules, but the new memory is working fine. All other components seem OK and the machine is up and running. I am completely baffled as to what the problem could be. The prtdiag output shows no problems:


5ksh# ./prtdiag -v
System Configuration: Sun Microsystems sun4u Netra 440
System clock frequency: 177 MHZ
Memory size: 16GB

==================================== CPUs ====================================
E$ CPU CPU
CPU Freq Size Implementation Mask Status Location
--- -------- ---------- --------------------- ----- ------ --------
0 1593 MHz 1MB SUNW,UltraSPARC-IIIi 3.4 on-line -
1 1593 MHz 1MB SUNW,UltraSPARC-IIIi 3.4 on-line -
2 1593 MHz 1MB SUNW,UltraSPARC-IIIi 3.4 on-line -
3 1593 MHz 1MB SUNW,UltraSPARC-IIIi 3.4 on-line -

================================= IO Devices =================================
Bus Freq Slot + Name +
Type MHz Status Path Model
------ ---- ---------- ---------------------------- --------------------
pci 66 PCI5 SUNW,XVR-100 (display) SUNW,375-3290
okay /pci@1c,600000/SUNW,XVR-100@1

pci 66 MB pci108e,abba (network) SUNW,pci-ce
okay /pci@1c,600000/network@2

pci 33 MB isa/su (serial)
okay /pci@1e,600000/isa@7/serial@0,3f8

pci 33 MB isa/su (serial)
okay /pci@1e,600000/isa@7/serial@0,2e8

pci 33 MB isa/rmc-comm-rmc_comm (seria+
okay /pci@1e,600000/isa@7/rmc-comm@0,3e8

pci 33 MB pci10b9,5229 (ide)
okay /pci@1e,600000/ide@d

pci 66 MB pci108e,abba (network) SUNW,pci-ce
okay /pci@1f,700000/network@1

pci 66 MB scsi-pci1000,30 (scsi-2) LSI,1030
okay /pci@1f,700000/scsi@2

pci 66 MB scsi-pci1000,30 (scsi-2) LSI,1030
okay /pci@1f,700000/scsi@2,1


============================ Memory Configuration ============================
Segment Table:
-----------------------------------------------------------------------
Base Address Size Interleave Factor Contains
-----------------------------------------------------------------------
0x0 4GB 16 BankIDs 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
0x1000000000 4GB 16 BankIDs 16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
0x2000000000 4GB 16 BankIDs 32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47
0x3000000000 4GB 16 BankIDs 48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63

Bank Table:
-----------------------------------------------------------
Physical Location
ID ControllerID GroupID Size Interleave Way
-----------------------------------------------------------
0 0 0 256MB 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
1 0 0 256MB
2 0 1 256MB
3 0 1 256MB
4 0 0 256MB
5 0 0 256MB
6 0 1 256MB
7 0 1 256MB
8 0 1 256MB
9 0 1 256MB
10 0 0 256MB
11 0 0 256MB
12 0 1 256MB
13 0 1 256MB
14 0 0 256MB
15 0 0 256MB
16 1 0 256MB 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
17 1 0 256MB
18 1 1 256MB
19 1 1 256MB
20 1 0 256MB
21 1 0 256MB
22 1 1 256MB
23 1 1 256MB
24 1 1 256MB
25 1 1 256MB
26 1 0 256MB
27 1 0 256MB
28 1 1 256MB
29 1 1 256MB
30 1 0 256MB
31 1 0 256MB
32 2 0 256MB 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
33 2 0 256MB
34 2 1 256MB
35 2 1 256MB
36 2 0 256MB
37 2 0 256MB
38 2 1 256MB
39 2 1 256MB
40 2 1 256MB
41 2 1 256MB
42 2 0 256MB
43 2 0 256MB
44 2 1 256MB
45 2 1 256MB
46 2 0 256MB
47 2 0 256MB
48 3 0 256MB 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
49 3 0 256MB
50 3 1 256MB
51 3 1 256MB
52 3 0 256MB
53 3 0 256MB
54 3 1 256MB
55 3 1 256MB
56 3 1 256MB
57 3 1 256MB
58 3 0 256MB
59 3 0 256MB
60 3 1 256MB
61 3 1 256MB
62 3 0 256MB
63 3 0 256MB

Memory Module Groups:
--------------------------------------------------
ControllerID GroupID Labels Status
--------------------------------------------------
0 0 C0/P0/B0/D0
0 0 C0/P0/B0/D1
0 1 C0/P0/B1/D0
0 1 C0/P0/B1/D1
1 0 C1/P0/B0/D0
1 0 C1/P0/B0/D1
1 1 C1/P0/B1/D0
1 1 C1/P0/B1/D1
2 0 C2/P0/B0/D0
2 0 C2/P0/B0/D1
2 1 C2/P0/B1/D0
2 1 C2/P0/B1/D1
3 0 C3/P0/B0/D0
3 0 C3/P0/B0/D1
3 1 C3/P0/B1/D0
3 1 C3/P0/B1/D1

=============================== usb Devices ===============================

Name Port#
------------ -----
hub 2

=============================== hub#2 Devices ===============================

Name Port#
------------ -----
keyboard 4
mouse 1

============================ Environmental Status ============================
Fan Status:
-------------------------------------------
Location Sensor Status
-------------------------------------------
FT0/F0 TACH okay
FT1/F0 TACH okay
FT2/F0 TACH okay
FT3/F0 F0 okay
PSU0 FF_FAN okay
PSU1 FF_FAN okay
PSU2 FF_FAN okay
PSU3 FF_FAN okay

Temperature sensors:
-----------------------------------------
Location Sensor Status
-----------------------------------------
C0/P0 T_CORE okay
C1/P0 T_CORE okay
C2/P0 T_CORE okay
C3/P0 T_CORE okay
PSU0 FF_OT okay
PSU1 FF_OT okay
PSU2 FF_OT okay
PSU3 FF_OT okay
------------------------------------
Current sensors:
----------------------------------------
Location Sensor Status
----------------------------------------
MB FF_SCSIA okay
MB FF_SCSIB okay
MB FF_POK okay
C0/P0 FF_POK okay
C1/P0 FF_POK okay
C2/P0 FF_POK okay
C3/P0 FF_POK okay
PSU0 FF_OC okay
PSU1 FF_OC okay
PSU2 FF_OC okay
PSU3 FF_OC okay
------------------------------------
Voltage sensors:
-----------------------------------
Location Sensor Status
-----------------------------------
MB V_+1V5 okay
MB V_VCCTM okay
MB V_NET0_1V2D okay
MB V_NET1_1V2D okay
MB V_NET0_1V2A okay
MB V_NET1_1V2A okay
MB V_+3V3 okay
MB V_+3V3STBY okay
MB/BAT V_BAT okay
MB V_SCSI_CORE okay
MB V_+5V okay
MB V_+12V okay
MB V_-12V okay
PSU0 P_PWR okay
PSU0 FF_POK okay
PSU0 FF_UV okay
PSU0 FF_OV okay
PSU1 P_PWR okay
PSU1 FF_POK okay
PSU1 FF_UV okay
PSU1 FF_OV okay
PSU2 P_PWR okay
PSU2 FF_POK okay
PSU2 FF_UV okay
PSU2 FF_OV okay
PSU3 P_PWR okay
PSU3 FF_POK okay
PSU3 FF_UV okay
PSU3 FF_OV okay
-----------------------------------------
Keyswitch:
-----------------------------------------
Location Keyswitch State
-----------------------------------------
SYS SYSCTRL NORMAL
--------------------------------------------------
Led State:
--------------------------------------------------------------
Location Led State Color
--------------------------------------------------------------
SYS ACT on green
SYS SERVICE on amber
SYS LOCATE off white
PSU0 POK on green
PSU0 SERVICE off amber
PSU0 OK2RM off blue
PSU1 POK on green
PSU1 SERVICE off amber
PSU1 OK2RM off blue
HDD0 SERVICE off amber
HDD0 OK2RM off blue
HDD1 SERVICE off amber
HDD1 OK2RM off blue
HDD2 SERVICE off amber
HDD2 OK2RM off blue
HDD3 SERVICE off amber
HDD3 OK2RM off blue
PSU2 POK on green
PSU2 SERVICE off amber
PSU2 OK2RM off blue
PSU3 POK on green
PSU3 SERVICE off amber
PSU3 OK2RM off blue
MB CRITICAL off red
MB MAJOR off red
MB MINOR off amber
MB USER off amber
FT0/F0 ACT on green
FT0/F0 FAULT off amber
FT1/F0 ACT on green
FT1/F0 FAULT off amber
FT2/F0 ACT on green
FT2/F0 FAULT off amber

=========================== FRU Operational Status ===========================
---------------------------------
Fru Operational Status:
---------------------------------
Location Status
---------------------------------
SC okay
HDD0 present
HDD1 present
HDD2 present
HDD3 present
PSU0 okay
PSU1 okay
PSU2 okay
PSU3 okay

================================ HW Revisions ================================
ASIC Revisions:
-------------------------------------------------------------------
Path Device Status Revision
-------------------------------------------------------------------
/pci@1c,600000 pci108e,a801 okay 4
/pci@1d,700000 pci108e,a801 okay 4
/pci@1e,600000 pci108e,a801 okay 4
/pci@1f,700000 pci108e,a801 okay 4

System PROM revisions:
----------------------
OBP 4.30.4.a 2010/01/06 14:45 Sun Fire V440,Netra 440
OBDIAG 4.30.4 2010/01/06 15:01

Last week I ran SunVTS for 8 hours to stress test the machine, hoping it would throw up some errors, but all it shows is one bad disk read:



SunVTS Summary Test Report


Latest Test Session Start Time: 10/24/12 16:12:20
Latest Test Session End Time: 10/25/12 00:12:24

Hostname: hmi1.com

Logical Test Status

Disk: PASS
Environment: PASS
Ioports: PASS
Media: NO RESULT
Memory: PASS
Network: PASS
Processor: PASS
Graphics: PASS

Faults Detected

No Faults or Suspect Hardware Detected by FMA

SunVTS Messages

10/24/12 17:31:23 hmi1 SunVTS7.0ps11: VTSID 6005 Disk.diskmediatest.ERROR rdsk/c1t1d0: "I/O (read) request could not be completed successfully on block : 65024987, Error Message : I/O error"

Syslog Messages

Oct 24 09:49:24 hmi1 dtsession[13209]: [ID 293258 user.error] libsldap: Status: 49 Mesg: openConnection: simple bind failed - Invalid credentials
Oct 24 11:57:40 hmi1 dtsession[13209]: [ID 293258 user.error] libsldap: Status: 49 Mesg: openConnection: simple bind failed - Invalid credentials
Oct 24 12:19:53 hmi1 dtsession[1248]: [ID 293258 user.error] libsldap: Status: 49 Mesg: openConnection: simple bind failed - Invalid credentials
Oct 24 12:54:13 hmi1 dtsession[25206]: [ID 293258 user.error] libsldap: Status: 49 Mesg: openConnection: simple bind failed - Invalid credentials
End SunVTS Test Report


Could this bad read be responsible for the warning light, and if so should I replace the disk? If not, any other ideas as to what may be wrong? Any help would be gratefully received as I have run out of ideas.

Thanks in advance
Doug
  • 1. Re: Netra 440 Maintenance light on
    963543 Newbie
    Doug,

    A disk issue would hardly ever cause the amber LED to light. Could you please paste the output of 'showlogs -v' and 'showenvironment'? That would be very helpful.
  • 2. Re: Netra 440 Maintenance light on
    971153 Newbie
    Hi Grantinho,

    Here's the showlogs -v output:

    sc> showlogs -v
    Persistent event log
    --------------------

    Log entries since OCT 24 22:08:43
    ----------------------------------
    OCT 24 22:08:43 hmi1: 00060022: "CRITICAL ALARM is cleared"
    OCT 24 22:08:43 hmi1: 00060022: "MAJOR ALARM is cleared"
    OCT 24 22:08:43 hmi1: 00060022: "MINOR ALARM is cleared"
    OCT 24 22:08:43 hmi1: 00060022: "USER ALARM is cleared"
    OCT 24 22:08:46 hmi1: 00060022: "MAJOR ALARM is cleared"
    OCT 24 22:08:46 hmi1: 00060022: "MINOR ALARM is cleared"
    OCT 24 22:08:46 hmi1: 00060022: "USER ALARM is cleared"
    OCT 24 22:08:47 hmi1: 00060021: "CRITICAL ALARM is set"
    OCT 24 22:08:48 hmi1: 00060021: "MAJOR ALARM is set"
    OCT 24 22:08:49 hmi1: 00060021: "MINOR ALARM is set"
    OCT 24 22:08:50 hmi1: 00060021: "USER ALARM is set"
    OCT 24 22:08:51 hmi1: 00060022: "CRITICAL ALARM is cleared"
    OCT 24 22:08:52 hmi1: 00060022: "MAJOR ALARM is cleared"
    OCT 24 22:08:54 hmi1: 00060022: "MINOR ALARM is cleared"
    OCT 24 22:08:55 hmi1: 00060022: "USER ALARM is cleared"
    <-- snip -->
    OCT 31 12:17:17 hmi1: 00060001: "SC Login Failure for user admin."
    OCT 31 12:27:03 hmi1: 00060000: "SC Login: User admin Logged on."

    There are multiple repeats of the alarms being set and cleared. These are all from the SunVTS stress test I performed, so I don't think they're relevant here. Here's the showenvironment output:

    sc> showenvironment


    =============== Environmental Status ===============


    --------------------------------------------------------------------------------

    System Temperatures (Temperatures in Celsius):
    --------------------------------------------------------------------------------

    Sensor Status Temp LowHard LowSoft LowWarn HighWarn HighSoft HighHard

    --------------------------------------------------------------------------------

    C0.P0.T_CORE OK 89 -20 -10 0 120 123 127
    C1.P0.T_CORE OK 92 -20 -10 0 120 123 127
    C2.P0.T_CORE OK 90 -20 -10 0 120 123 127
    C3.P0.T_CORE OK 89 -20 -10 0 120 123 127

    --------------------------------------
    Front Status Panel:
    --------------------------------------
    Keyswitch position: NORMAL

    ---------------------------------------------------
    System Indicator Status:
    ---------------------------------------------------
    SYS.LOCATE SYS.SERVICE SYS.ACT
    --------------------------------------------------------
    OFF ON ON

    --------------------------------------------
    System Disks:
    --------------------------------------------
    Disk Status Service OK2RM
    --------------------------------------------
    HDD0 OK OFF OFF
    HDD1 OK OFF OFF
    HDD2 OK OFF OFF
    HDD3 OK OFF OFF

    ----------------------------------------------------------
    Fans (Speeds Revolution Per Minute):
    ----------------------------------------------------------
    Sensor Status Speed Warn Low
    ----------------------------------------------------------
    FT0.F0.TACH OK 4383 2400 750
    FT1.F0.TACH OK 4560 2400 750
    FT2.F0.TACH OK 4440 2400 750
    FT3.F0 OK -- -- --

    --------------------------------------------------------------------------------

    Voltage sensors (in Volts):
    --------------------------------------------------------------------------------

    Sensor Status Voltage LowSoft LowWarn HighWarn HighSoft
    --------------------------------------------------------------------------------

    MB.V_+1V5 OK 1.48 1.20 1.27 1.72 1.80
    MB.V_VCCTM OK 2.53 2.00 2.12 2.87 3.00
    MB.V_NET0_1V2D OK 1.26 0.96 1.02 1.38 1.44
    MB.V_NET1_1V2D OK 1.25 0.96 1.02 1.38 1.44
    MB.V_NET0_1V2A OK 1.25 0.96 1.02 1.38 1.44
    MB.V_NET1_1V2A OK 1.25 0.96 1.02 1.38 1.44
    MB.V_+3V3 OK 3.34 2.64 2.80 3.79 3.96
    MB.V_+3V3STBY OK 3.31 2.64 2.80 3.79 3.96
    MB.BAT.V_BAT OK 3.03 -- 2.25 -- --
    MB.V_SCSI_CORE OK 1.79 1.44 1.53 2.07 2.16
    MB.V_+5V OK 4.99 4.00 4.25 5.75 6.00
    MB.V_+12V OK 12.00 9.60 10.20 13.80 14.40
    MB.V_-12V OK -12.04 -14.40 -13.80 -10.20 -9.60

    --------------------------------------------
    Power Supply Indicators:
    --------------------------------------------
    Supply Active Service OK-to-Remove
    --------------------------------------------
    PS0 ON OFF OFF
    PS1 ON OFF OFF
    PS2 ON OFF OFF
    PS3 ON OFF OFF

    ------------------------------------------------------------------------------
    Power Supplies:
    ------------------------------------------------------------------------------
    Supply Status Underspeed Overtemp Overvolt Undervolt Overcurrent
    ------------------------------------------------------------------------------
    PS0 OK OFF OFF OFF OFF OFF
    PS1 OK OFF OFF OFF OFF OFF
    PS2 OK OFF OFF OFF OFF OFF
    PS3 OK OFF OFF OFF OFF OFF

    ----------------------
    Current sensors:
    ----------------------
    Sensor Status
    ----------------------
    MB.FF_SCSIA OK
    MB.FF_SCSIB OK
    MB.FF_POK OK
    C0.P0.FF_POK OK
    C1.P0.FF_POK OK
    C2.P0.FF_POK OK
    C3.P0.FF_POK OK

    --------------------------------------------
    System Alarms:
    --------------------------------------------
    Alarm Relay LED
    --------------------------------------------
    ALARM.CRITICAL OFF OFF
    ALARM.MAJOR OFF OFF
    ALARM.MINOR OFF OFF
    ALARM.USER OFF OFF

    And just for completeness, here's the firmware revision info:

    sc> showsc version -v
    Advanced Lights Out Manager v1.5.4
    SC Firmware version: 1.5.4
    SC Bootmon version: 1.5.4

    SC Bootmon Build Release: 08
    SC bootmon checksum: F08ACA76
    SC Bootmon built Oct 17 2005, 13:23:12

    SC Build Release: 08
    SC firmware checksum: 2E078305

    SC firmware built Oct 17 2005, 13:22:53
    SC firmware flashupdate FEB 06 2106, 06:28:15

    SC System Memory Size: 8 MB

    SC NVRAM Version = b

    SC hardware type: 1

    sc>

    Thanks for your help.
    Doug
  • 3. Re: Netra 440 Maintenance light on
    972367 Newbie
    Hello,

    Could you please post the output of the commands below, along with the contents of /var/adm/messages, so that we can help identify the cause of the issue?

    #/var/adm/messages
    #iostat -En
    #echo | format
    #dmesg

    Sidh

    Edited by: 969364 on Nov 4, 2012 2:34 AM
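One way to gather the information requested above into a single paste is a small wrapper that labels each command's output. This is only a sketch: the helper name `run_and_label` is hypothetical, and reading `/var/adm/messages` with `tail -100` is an assumption, since the reply lists the file rather than a command.

```shell
#!/bin/sh
# run_and_label (hypothetical helper): print a header naming the command,
# then the command's output (stdout and stderr). If the command fails,
# note the failure and carry on so the remaining commands still run.
run_and_label() {
    printf '===== %s =====\n' "$*"
    "$@" 2>&1 || printf '(command failed: %s)\n' "$*"
}

# Example invocations on the Solaris host, left commented out here:
# run_and_label tail -100 /var/adm/messages
# run_and_label iostat -En
# run_and_label sh -c 'echo | format'
# run_and_label dmesg
```

The pipeline `echo | format` is passed through `sh -c` because the helper runs a single command with its arguments, not a pipeline.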
  • 4. Re: Netra 440 Maintenance light on
    971153 Newbie
    Hi Sidh,

    Thanks for your help, but I have now resolved the issue. Since I am satisfied there are no hardware issues with the machine, I reset the ALOM via resetsc, which cleared the alarm.

    Doug
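After a reset like the one described above, a quick check from Solaris is to filter the "Led State" table of `prtdiag -v` for fault-coloured LEDs that are still lit. The following is a minimal sketch, assuming the four-column layout (location, LED, state, colour) shown in the first post; the function name `lit_fault_leds` is hypothetical.

```shell
#!/bin/sh
# lit_fault_leds (hypothetical helper): read `prtdiag -v` output on stdin
# and print "<location> <led>" for every amber or red LED whose state is "on".
# Only four-field lines are considered, which matches the Led State table
# rows such as "SYS SERVICE on amber".
lit_fault_leds() {
    awk 'NF == 4 && $3 == "on" && ($4 == "amber" || $4 == "red") { print $1, $2 }'
}

# Example invocation on the Solaris host, left commented out here:
# prtdiag -v | lit_fault_leds
```

Run against the prtdiag output in the first post, this would list only `SYS SERVICE`; empty output after the reset would indicate that no amber or red indicator is reported as on.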
  • 5. Re: Netra 440 Maintenance light on
    rukbat Guru Moderator
    You posted this same question to at least three other forums:
    http://www.unix.com/solaris/205147-netra-440-maintenance-light.html
    http://www.tek-tips.com/viewthread.cfm?qid=1696916
    http://www.linuxine.com/story/netra-440-maintenance-light
    ... and failed to mention that fact, anywhere.

    That's poor forum etiquette.
    Since no one at any of these forums is paid to respond, why should anyone spend any effort giving advice that you may have already received elsewhere? That would be wasted effort.

    Be a bit more considerate in the future.
