3 Replies Latest reply: Apr 7, 2014 8:13 AM by cindys

Disk drive LED lights forever after update to Solaris 11.1

KarelGardas Newbie

Hello,

I already reported this issue here: Disk drive LED lights all time after Solaris 11 upgrade to 11.1 (second)

but to summarise: I have a Supermicro X9SRA board with two drives connected to the sata6/4 and sata6/5 ports and one SSD connected to the sata6/1 port. The hardware runs fine with Solaris 11 11/11 and with Solaris 11.0. After the upgrade to Solaris 11.1 I noticed that the HDD LED stays lit permanently.

My new observation is that this does not happen right after booting Solaris 11.1. In fact, if I boot to single-user mode, the HDD LED behaves normally: it just blinks on HDD access. So my conclusion is that some service started in multi-user mode is responsible for this behaviour, and I would definitely like to find out which one it is.

Is there any trick to start services step by step, preferably printing each service name to the console with a delay between service starts? That would help a lot in finding out which service is responsible for this behaviour.

Please note that zpool iostat 1 shows normal drive activity, also vmstat 1 shows normal drive activity so there is no reason why the LED stays on forever...

Thanks for any hint on how to debug this issue!
Karel

  • 1. Re: Disk drive LED lights forever after update to Solaris 11.1
    cindys Pro

    Hi Karel,

     

    I would be curious to know what the system thinks of the overall device health:

     

    # zpool status -v

    # fmadm faulty

    # fmdump -eV | more

     

    It could be that fault detection in Solaris 11.1 is more sensitive.

     

    Thanks, Cindy

  • 2. Re: Disk drive LED lights forever after update to Solaris 11.1
    KarelGardas Newbie

    Hi Cindy,

     

    sorry for such a long delay. Your suggested commands certainly reveal something. zpool status -v is clear except for the upgrade advice. You can see it here:

    # zpool status -v pool: data2 state: ONLINE status: The pool is formatted - Pastebin.com

     

    fmadm faulty is a little bit more verbose: --------------- ------------------------------------ -------------- --------- - Pastebin.com

     

    Now, the most interesting here is fmdump -eV, it's really verbose so I've saved it to box here: https://app.box.com/s/0p7ljgfqam9nf6ov3f5r

     

    What caught my eye particularly are errors like this:

     

    Apr 05 2014 15:51:52.400881863 ereport.io.scsi.cmd.disk.dev.rqs.derr
    nvlist version: 0
            class = ereport.io.scsi.cmd.disk.dev.rqs.derr
            ena = 0x6bc778e23b02001
            detector = (embedded nvlist)
            nvlist version: 0
                    version = 0x0
                    scheme = dev
                    cna_dev = 0x534008d200000003
                    device-path = /pci@0,0/pci15d9,62a@1f,2/disk@5,0
                    devid = id1,sd@SATA_____WDC_WD7500BPKT-7_____WD-WX91A33F3048
            (end detector)

     

            devid = id1,sd@SATA_____WDC_WD7500BPKT-7_____WD-WX91A33F3048
            driver-assessment = info
            op-code = 0xa1
            cdb = 0xa1 0x6 0x2c 0xda 0x0 0x0 0x4f 0xc2 0x0 0xb0 0x0 0x0
            pkt-reason = 0x0
            pkt-state = 0x3f
            pkt-stats = 0x0
            stat-code = 0x2
            key = 0x1
            asc = 0x0
            ascq = 0x0
            sense-data = 0x72 0x1 0x0 0x0 0x0 0x0 0x0 0xe 0x9 0xc 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x4f 0x0 0xc2 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
    0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x
    0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0
    x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
    0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
    0x0 0x0 0x0 0x0 0x0 0x0
            __ttl = 0x1
            __tod = 0x53400a78 0x17e4f8c7
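
One way to read the cdb and sense-data fields in the report above is to split them into ATA registers. The sketch below assumes the field layout of the SCSI/ATA Translation (SAT) standard for ATA PASS-THROUGH (12) and for descriptor-format sense data; the function names are illustrative and are not part of any Solaris tool. Under that assumption, the CDB decodes to ATA command 0xb0 with feature 0xda (SMART RETURN STATUS) sent with the check-condition bit set, and the returned registers carry LBA mid/high 0x4f/0xc2, the value a drive reports when no SMART threshold has been exceeded. Together with driver-assessment = info, this suggests (but does not prove) that the report records the result of a SMART status query rather than a media error. It does not show which program issued the query.

```python
# Sketch only: field offsets follow the SAT layout for ATA PASS-THROUGH (12)
# and the ATA Status Return sense descriptor (assumption stated in the lead-in).
# The byte values are copied from the ereport quoted in the post.

ATA_PROTOCOLS = {3: "non-data", 4: "PIO data-in", 5: "PIO data-out", 6: "DMA"}

EXAMPLE_CDB = [0xA1, 0x06, 0x2C, 0xDA, 0x00, 0x00, 0x4F, 0xC2, 0x00, 0xB0, 0x00, 0x00]
# First 22 bytes of sense-data; the remainder of the dump is zero padding.
EXAMPLE_SENSE = [0x72, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0E,
                 0x09, 0x0C, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
                 0x00, 0x4F, 0x00, 0xC2, 0x00, 0x00]


def decode_ata_pass_through_12(cdb):
    """Split a 12-byte ATA PASS-THROUGH CDB into its ATA task-file fields."""
    if len(cdb) != 12 or cdb[0] != 0xA1:
        raise ValueError("not an ATA PASS-THROUGH (12) CDB")
    return {
        "protocol": ATA_PROTOCOLS.get((cdb[1] >> 1) & 0x0F, "other"),
        "ck_cond": bool(cdb[2] & 0x20),  # request ATA registers back in sense data
        "features": cdb[3],
        "count": cdb[4],
        "lba_low": cdb[5],
        "lba_mid": cdb[6],
        "lba_high": cdb[7],
        "device": cdb[8],
        "command": cdb[9],
    }


def decode_ata_status_descriptor(sense):
    """Read sense key, ASC/ASCQ and the ATA Status Return descriptor (type 0x09)."""
    if (sense[0] & 0x7F) not in (0x72, 0x73):
        raise ValueError("not descriptor-format sense data")
    result = {"sense_key": sense[1] & 0x0F, "asc": sense[2], "ascq": sense[3]}
    end = min(len(sense), 8 + sense[7])  # byte 7 is the additional sense length
    pos = 8
    while pos + 1 < end:
        dtype, dlen = sense[pos], sense[pos + 1]
        if dtype == 0x09 and dlen >= 0x0C and pos + 14 <= len(sense):
            d = sense[pos:pos + 14]
            result.update(error=d[3], lba_mid=d[9], lba_high=d[11],
                          device=d[12], status=d[13])
        pos += 2 + dlen
    return result


if __name__ == "__main__":
    print(decode_ata_pass_through_12(EXAMPLE_CDB))
    print(decode_ata_status_descriptor(EXAMPLE_SENSE))
```

The same two functions can be applied to the other ereports in the fmdump -eV output to check whether they all carry the same command and register values.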

     

     

    There are several instances of these errors in the log. I'm not sure whether they were caused by the setapm application[1], so I removed it and I'm now using smartctl -s apm,off <drive> instead. The question is: are these errors related to setting drive power management or not? If not, then what are they?

    Also, even though I've disabled setapm and rebooted several times since, I still see the LED switched on permanently in the multi-user runlevel. If the error is related and is somehow persistent, then my question is: how do I tell Solaris to ignore a particular error?

     

    Thanks a lot!
    Karel

    [1]: Set APM and AAM feature configuration attributes for disks on Solaris - Solarismen

  • 3. Re: Disk drive LED lights forever after update to Solaris 11.1
    cindys Pro

    Hi Karel,

     

    What kind of system and devices? I was unaware of the setapm issues on Solaris.

     

    Even though the LED stays on, are the FMA errors still generated?

     

    Thanks, Cindy
