This discussion is archived
8 Replies Latest reply: Oct 21, 2013 12:55 PM by user9062184

Solaris 11.1 stuck at npe0 is /pci@0,0

user9062184 Newbie

Hi all.

Something a bit sad here. One of my happiest, best-running file servers (a white box) running Solaris 11.1 (x86) has seemingly failed me after about 78 days of uptime since the last patch. Here's the background:

 

* Board: GA-Z77-UD5H, LGA1155

* RAM: 16GB

* Intel Core i5, current gen (well, one back from Haswell)

* Dual 7200 RPM 2.5" SATA boot drives in a ZFS rpool mirror

* OS: Solaris 11.1 with support repositories.


Box had been up for about 76 days. Thought it was sane to give it a patch and a reboot!


pkg image-update -v etc etc.

 

Made a new boot environment, and off we go.


Reboots now hang at the banner, consistently. I ripped all the disks out of the system, swapped all the DIMMs out, checked for extra PCI-E network devices to remove (there were none), reset the CMOS to defaults, upgraded the BIOS, and tried booting from the CD. Still, it hangs at the banner. No matter what I do, it hangs at the banner. It has driven me nuts all day. I managed to boot Windows and Ubuntu just fine, so I'm wondering what on earth is going on here. I started to assume hardware, but I don't think that's the case.


When I add a -v to my boot args, I see:

 

SMBIOS v2.7 loaded (10333 bytes)
initialized model-specific module 'cpu_ms.GenuineIntel' on chip 0 core 0 strand 0
root nexus = i86pc
pseudo0 at root
pseudo0 is /pseudo
scsi_vhci0 at root
scsi_vhci0 is /scsi_vhci
npe0 at root: space 0 offset 0
npe0 is /pci@0,0


 

And this is as far as I ever get. I've tried several things: rolling back boot environments, ripping the entire disk subsystem out, and so on. Nothing helps. It's as if the system's hardware has somehow "changed", which isn't possible, and my previous boot environments that were running fine no longer work.

Any help or direction would be so very much appreciated!

Thanks, all.
--z
  • 1. Re: Solaris 11.1 stuck at npe0 is /pci@0,0
    cindys Pro

    Which Solaris 11.1 SRU is this?

     

    Thanks, Cindy

  • 2. Re: Solaris 11.1 stuck at npe0 is /pci@0,0
    user9062184 Newbie

    Sorry, I don't know. If I could boot it at all, I could simply run a pkg info entire and it would tell me.

     

    We can assume it's a very current SRU, since I kept it well updated and patched from the support repository on a monthly or bi-monthly basis. Sorry, I just don't remember the SRU string off the top of my head.

     

    I wonder if somehow finding an older Solaris 11 disk would help it?

     

    z

  • 3. Re: Solaris 11.1 stuck at npe0 is /pci@0,0
    cindys Pro

    Hi Z,

     

    I think you said you patched (or updated) this system and now it hangs at boot time.

    Can you boot back to the previous BE?

     

    Thanks, Cindy

  • 4. Re: Solaris 11.1 stuck at npe0 is /pci@0,0
    user9062184 Newbie

    Hi.

     

    Nah, unfortunately, it doesn't seem to help. I flip back to the previous BE and the behaviour is the same. I actually now believe it's hardware related. I got the SRU out of it in the end, too. It's 11.1.9.5.1.

     

    I have a screenshot of it hanging on the npe driver module load, if that's useful?

  • 5. Re: Solaris 11.1 stuck at npe0 is /pci@0,0
    cindys Pro

    Yes, I'll try to look up the npe driver message later but if this is a hardware issue, it probably won't help.

    If the problem is one of your root pool disks, I would expect a more obvious error message. There might be a better way but if you can boot from media or an install server and attempt to import the root pool, then that would give us more clues.

     

    Thanks, Cindy

  • 6. Re: Solaris 11.1 stuck at npe0 is /pci@0,0
    user9062184 Newbie

    Yep.

     

    Tried that. No media boots at all on the host now. It gets to exactly the same point, trying to load the npe driver, then hangs the whole host. That happens even from a live disc, even with all HDDs unplugged, even after a full BIOS reset. Rolling the BIOS back one or two revisions made it sort of work, but it didn't really help: it still hangs and panics. I think it might be hardware related, but I cannot be sure.

     

    Thoughts?

     

    -jc

  • 7. Re: Solaris 11.1 stuck at npe0 is /pci@0,0
    cindys Pro

    Did it panic before? I thought it just hung. What is the panic string? I haven't found anything about npe hanging. There was a bug that produced the error "WARNING: npe1: no ranges property", but this isn't it. There was also an issue with some x86 systems needing ACPI disabled, but you would have had issues previously.

     

    Thanks, Cindy

  • 8. Re: Solaris 11.1 stuck at npe0 is /pci@0,0
    user9062184 Newbie

    Hi.

     

    I have now swapped out the motherboard for a completely different, less complex model that a friend had handy. Even now, with no HDDs plugged in, booting Solaris 11.1 Live or text, I still hang at the npe0 line. This is making absolutely no sense. So now I don't suspect the hardware, but I *do* suspect that Oracle have done something to the compatibility with current generation Z77 series motherboards/Ivy Bridge IOCH/MPH controllers.

     

    Man, this is getting confusing.

     

    -jc
