This discussion is archived
4 Replies Latest reply: May 7, 2013 6:21 AM by 990261 RSS

System Crash Fatal

990261 Newbie
Over the weekend I had an E250 crash. I was able to get it back up and running, but I am not sure why it crashed. I am pretty sure that it has a hardware issue. A user reported that it has been running slow for about three weeks. Two weeks ago I had to restore the root drive from backup because it crashed and I was getting a BAD SUPER BLOCK: MAGIC NUMBER WRONG error. Today I came in and it was giving BAD SUPER BLOCK: MAGIC NUMBER WRONG on /dev/rdsk/c0t0d0s1, which is the swap partition. I got lucky: since it was the swap partition, I just recreated the filesystem and it booted right up.

My guess is a bad CPU or a bad memory module. If it is a memory module, how do I know which one?

Here is what was in the /var/adm/messages log file. Thanks.

May 4 23:13:53 orca SUNW,UltraSPARC-II: [ID 805541 kern.warning] WARNING: [AFT1] AFAR was derived from UE report, CP event on CPU0 (caused access error on IOBUS31), errID 0x00055553.d7f6661b
May 4 23:13:53 orca AFSR 0x00000000.01000001<CP> AFAR 0x00000000.3f209748
May 4 23:13:53 orca AFSR.PSYND 0x0001(Score 95) AFSR.ETS 0x00
May 4 23:13:53 orca UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
May 4 23:13:53 orca SUNW,UltraSPARC-II: [ID 728641 kern.info] [AFT2] errID 0x00055553.d7f6661b PA=0x00000000.3f209748
May 4 23:13:53 orca E$tag 0x00000000.0bc007e4 E$State: Modified E$parity 0x05
May 4 23:13:53 orca SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x80000000.0486a012
May 4 23:13:53 orca SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x08): 0x80000000.03c6c01a Bad PSYND=0x0001
May 4 23:13:53 orca SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x80000000.1dc6e012
May 4 23:13:53 orca SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x80000000.20a70012
May 4 23:13:53 orca SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x20): 0x80000000.06272012
May 4 23:13:53 orca SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x80000000.17c74012
May 4 23:13:53 orca SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x80000000.22e76012
May 4 23:13:53 orca SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x80000000.1a678012
May 4 23:13:53 orca pcipsy: [ID 139652 kern.warning] WARNING: uncorrectable error detected by pci0 (upa mid 1f) during
May 4 23:13:53 orca DVMA read transaction
May 4 23:13:54 orca pcipsy: [ID 475334 kern.info] Transaction was a block operation.
May 4 23:13:54 orca pcipsy: [ID 750218 kern.info] AFSR=40000000.3f800000 AFAR=00000000.3f209748,
May 4 23:13:54 orca double word offset=1, Memory Module U0702 U0802 id 31.
May 4 23:13:54 orca unix: [ID 836849 kern.notice]
May 4 23:13:54 orca ^Mpanic[cpu0]/thread=300059eb440:
May 4 23:13:54 orca unix: [ID 261965 kern.notice] Fatal PCI UE Error
May 4 23:13:54 orca unix: [ID 100000 kern.notice]
May 4 23:13:54 orca genunix: [ID 723222 kern.notice] 000002a100077e60 pcipsy:ecc_intr+1ac (3f209740, 400000003f800000, 300007bde78, 3000005f908, 1f, 10242ab4)
May 4 23:13:54 orca genunix: [ID 179002 kern.notice] %l0-3: 0000000000000008 0000000000004000 0000000000000000 000000000000a568
May 4 23:13:54 orca %l4-7: 0000000000008b78 0000000000008b40 0000000000000000 0000000100279b90
May 4 23:13:54 orca genunix: [ID 723222 kern.notice] 000002a100077f50 unix:current_thread+44 (10, 2, 0, 1003f9060, 1005460c0, 1000)
May 4 23:13:55 orca genunix: [ID 179002 kern.notice] %l0-3: 00000000100072a4 000002a1009a52f1 000000000000000e 0000000000000016
May 4 23:13:55 orca %l4-7: 0000000000000000 0000000000000000 00000300059eb440 000002a1009a5ba0
May 4 23:13:55 orca unix: [ID 100000 kern.notice]
May 4 23:13:55 orca genunix: [ID 672855 kern.notice] syncing file systems...
May 4 23:13:55 orca genunix: [ID 904073 kern.notice] done
May 4 23:13:56 orca genunix: [ID 353387 kern.notice] dumping to /dev/dsk/c0t0d0s1, offset 1288699904
May 4 23:13:56 orca scsi: [ID 107833 kern.warning] WARNING: /pci@1f,4000/scsi@3 (glm0):
May 4 23:13:56 orca got SCSI bus reset
May 4 23:13:57 orca genunix: [ID 408822 kern.info] NOTICE: glm0: fault detected in device; service still available
May 4 23:13:57 orca genunix: [ID 611667 kern.info] NOTICE: glm0: got SCSI bus reset
May 4 23:14:13 orca genunix: [ID 409368 kern.notice] ^M100% done: 25399 pages dumped, compression ratio 2.65,

May 4 23:14:13 orca genunix: [ID 851671 kern.notice] dump succeeded
May 4 23:15:43 orca genunix: [ID 540533 kern.notice] ^MSunOS Release 5.8 Version Generic_108528-29 64-bit
May 4 23:15:43 orca genunix: [ID 913632 kern.notice] Copyright 1983-2003 Sun Microsystems, Inc. All rights reserved.
May 4 23:15:43 orca genunix: [ID 678236 kern.info] Ethernet address = 0:3:ba:3:5:12
May 4 23:15:43 orca unix: [ID 389951 kern.info] mem = 1048576K (0x40000000)
May 4 23:15:43 orca unix: [ID 930857 kern.info] avail mem = 1025564672
May 4 23:15:43 orca rootnex: [ID 466748 kern.info] root nexus = Sun (TM) Enterprise 250 (UltraSPARC-II 400MHz)
May 4 23:15:43 orca rootnex: [ID 349649 kern.info] pcipsy0 at root: UPA 0x1f 0x4000
May 4 23:15:43 orca genunix: [ID 936769 kern.info] pcipsy0 is /pci@1f,4000
May 4 23:15:43 orca rootnex: [ID 349649 kern.info] pcipsy1 at root: UPA 0x1f 0x2000
May 4 23:15:43 orca genunix: [ID 936769 kern.info] pcipsy1 is /pci@1f,2000
May 4 23:15:43 orca scsi: [ID 365881 kern.info] /pci@1f,4000/scsi@3 (glm0):
May 4 23:15:43 orca Rev. 5 Symbios 53c875 found.
May 4 23:15:43 orca scsi: [ID 365881 kern.info] /pci@1f,4000/scsi@3 (glm0):
May 4 23:15:43 orca target1-scsi-options=0x5f8
May 4 23:15:43 orca scsi: [ID 365881 kern.info] /pci@1f,4000/scsi@3 (glm0):
May 4 23:15:43 orca target2-scsi-options=0x5f8
May 4 23:15:43 orca scsi: [ID 365881 kern.info] /pci@1f,4000/scsi@3 (glm0):
May 4 23:15:43 orca target3-scsi-options=0x5f8
May 4 23:15:43 orca scsi: [ID 365881 kern.info] /pci@1f,4000/scsi@3 (glm0):
May 4 23:15:43 orca target4-scsi-options=0x5f8
May 4 23:15:43 orca scsi: [ID 365881 kern.info] /pci@1f,4000/scsi@3 (glm0):
May 4 23:15:43 orca target5-scsi-options=0x5f8
May 4 23:15:43 orca scsi: [ID 365881 kern.info] /pci@1f,4000/scsi@3 (glm0):
May 4 23:15:43 orca target6-scsi-options=0x5f8
May 4 23:15:43 orca pcipsy: [ID 370704 kern.info] PCI-device: scsi@3, glm0
May 4 23:15:43 orca genunix: [ID 936769 kern.info] glm0 is /pci@1f,4000/scsi@3
May 4 23:15:43 orca scsi: [ID 365881 kern.info] /pci@1f,4000/scsi@3,1 (glm1):
May 4 23:15:43 orca Rev. 5 Symbios 53c875 found.
May 4 23:15:43 orca pcipsy: [ID 370704 kern.info] PCI-device: scsi@3,1, glm1
May 4 23:15:43 orca genunix: [ID 936769 kern.info] glm1 is /pci@1f,4000/scsi@3,1
May 4 23:15:43 orca scsi: [ID 193665 kern.info] sd0 at glm0: target 0 lun 0
  • 1. Re: System Crash Fatal
    Nik Expert
    Hi.
    It looks like an uncorrectable memory error.
    The system detected this error on Memory Module U0702 U0802.

    Regards.
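
The socket labels can also be pulled out of the log mechanically. The following is only a minimal sketch, assuming the pcipsy message keeps the "Memory Module Uxxxx Uxxxx" wording seen in the log above; the function name `suspect_modules` and the regular expression are illustrative, not part of any Solaris tool.

```python
import re

# Matches the part of a message that names the failing sockets, e.g.
# "double word offset=1, Memory Module U0702 U0802 id 31."
# Assumption: socket labels are "U" followed by four digits, as in this log.
MODULE_RE = re.compile(r"Memory Module((?:\s+U\d{4})+)")


def suspect_modules(log_text):
    """Count how often each socket label appears in 'Memory Module' messages."""
    counts = {}
    for match in MODULE_RE.finditer(log_text):
        for socket in match.group(1).split():
            counts[socket] = counts.get(socket, 0) + 1
    return counts


# Two lines copied from the log in this thread.
sample = (
    "May 4 23:13:54 orca pcipsy: [ID 750218 kern.info] "
    "AFSR=40000000.3f800000 AFAR=00000000.3f209748,\n"
    "May 4 23:13:54 orca double word offset=1, "
    "Memory Module U0702 U0802 id 31.\n"
)

print(suspect_modules(sample))
```

Run against the whole messages file, a socket that keeps showing up across several crashes is a stronger suspect than one named only once.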
  • 2. Re: System Crash Fatal
    990261 Newbie
    How am I supposed to know which module, since Bank 0 and Bank 1 both have a U0702 and a U0802?

    ========================= Memory =========================

             Interlv.  Socket  Size
    Bank     Group     Name    (MB)  Status
    ----     -----     ------  ----  ------
    0        none      U0701   128   OK
    0        none      U0801   128   OK
    0        none      U0901   128   OK
    0        none      U1001   128   OK
    0        none      U0702   128   OK
    0        none      U0802   128   OK
    0        none      U0902   128   OK
    0        none      U1002   128   OK
    1        none      U0702   128   OK
    1        none      U0802   128   OK
    1        none      U0902   128   OK
    1        none      U1002   128   OK
  • 3. Re: System Crash Fatal
    Nik Expert
    Hi.
    Open the cover and you will find only one DIMM labeled U0702 and one labeled U0802.

    Every DIMM has two logical banks, so prtdiag shows every DIMM twice.

    Regards.
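
To see which socket names are listed under more than one bank, the memory table can be grouped by socket. This is a rough sketch, assuming five whitespace-separated columns (bank, interleave group, socket name, size, status) as in the output posted above; `banks_by_socket` is a hypothetical helper, not a prtdiag feature.

```python
from collections import defaultdict


def banks_by_socket(prtdiag_text):
    """Map each socket name to the list of bank numbers it is listed under."""
    sockets = defaultdict(list)
    for line in prtdiag_text.splitlines():
        fields = line.split()
        # Data rows look like: bank, interleave group, socket, size, status.
        # Header and separator rows fail this check and are skipped.
        if len(fields) == 5 and fields[0].isdigit() and fields[2].startswith("U"):
            sockets[fields[2]].append(int(fields[0]))
    return dict(sockets)


# Rows taken from the memory table posted earlier in this thread.
sample = """\
Bank Group Name (MB) Status
---- ----- ------ ---- ------
0 none U0701 128 OK
0 none U0801 128 OK
0 none U0702 128 OK
0 none U0802 128 OK
1 none U0702 128 OK
1 none U0802 128 OK
"""

for socket, banks in banks_by_socket(sample).items():
    if len(banks) > 1:
        print(socket, "is listed under banks", banks)
```

A socket name that appears under two banks still corresponds to a single labeled slot on the board, which is the one to inspect or swap.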
  • 4. Re: System Crash Fatal
    990261 Newbie
    Once I removed the cover on an old system, I noticed that the DIMM slots are labeled on the board. I hope this is the fix.
