1 Reply Latest reply on Apr 1, 2019 3:56 PM by Jonathan Giordano-Oracle

    Boot failing with UEK5 and HP Smart Array 410i RAID + fix

      I discovered recently that several of my HP DL360 G7 (old, I know) systems running OL 7.5 with Smart Array 410i RAID boards were unable to boot after upgrading to UEK5 from UEK4.

       

      I initially assumed the problem was in the HPSA driver included in UEK5, but it turns out to be caused by a patch to kernel/irq/affinity.c.  Before the patch, the Linux kernel assigned interrupt vectors only to present CPUs.  That works well unless the system supports hot-pluggable CPUs (or virtual CPU scaling), where CPUs that come online later would have no vectors assigned.  The patch assigns interrupt vectors not just to present CPUs but also to ‘possible’ CPUs.
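
For anyone who wants to see whether their own machine reports more ‘possible’ CPUs than present ones, here is a minimal Python sketch. It assumes the standard sysfs files /sys/devices/system/cpu/present and /sys/devices/system/cpu/possible are readable; the function names are only illustrative and are not part of any kernel tooling.

```python
from pathlib import Path

# Standard sysfs location of the kernel's CPU masks.
CPU_SYSFS = Path("/sys/devices/system/cpu")


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-3,8,10-11' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty mask is reported as an empty string
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_cpu_set(name):
    """Read one CPU mask ('present', 'possible', 'online', ...) from sysfs."""
    return parse_cpu_list((CPU_SYSFS / name).read_text())


def main():
    try:
        present = read_cpu_set("present")
        possible = read_cpu_set("possible")
    except OSError as exc:
        print(f"could not read CPU masks from sysfs: {exc}")
        return
    print(f"present CPUs:  {len(present)}")
    print(f"possible CPUs: {len(possible)}")
    # CPUs the kernel reserves room for but that are not physically present.
    print(f"possible but not present: {sorted(possible - present)}")


if __name__ == "__main__":
    main()
```

A non-empty "possible but not present" list only shows that the two masks differ on that system; by itself it does not prove the machine is affected by this bug.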

       

      Unfortunately, on certain systems, the allocation of interrupt vectors to possible CPUs returns a pointer to an empty array.  With no interrupt vectors assigned to a device (such as an HP storage card), the kernel fails to recognize the device.  This is exactly what happened on my HP systems.
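
One rough way to check whether a driver actually received interrupt vectors is to look for its rows in /proc/interrupts, for example from a rescue shell on the failing kernel and again on a working one. The sketch below is only an illustration: the helper name is hypothetical, and matching on the substring "hpsa" is an assumption about how the driver labels its interrupts.

```python
from pathlib import Path


def irq_lines_for(interrupts_text, needle):
    """Return the IRQ labels of /proc/interrupts rows that mention needle."""
    matches = []
    # The first line is the per-CPU column header, so skip it.
    for line in interrupts_text.splitlines()[1:]:
        label, sep, rest = line.partition(":")
        if sep and needle in rest:
            matches.append(label.strip())
    return matches


def main(driver="hpsa"):  # assumed substring of the driver's IRQ names
    try:
        text = Path("/proc/interrupts").read_text()
    except OSError as exc:
        print(f"could not read /proc/interrupts: {exc}")
        return
    irqs = irq_lines_for(text, driver)
    if irqs:
        print(f"{driver}: {len(irqs)} interrupt line(s): {', '.join(irqs)}")
    else:
        print(f"{driver}: no interrupt lines found")


if __name__ == "__main__":
    main()
```

An empty result is consistent with the failure described above, but it can also simply mean the driver is not loaded, so treat it as a hint rather than a diagnosis.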

       

      This bug was fixed in the mainline kernel a while ago, but the fix has yet to be pulled into the UEK5 repository.

       

      Bug introduced into UEK5:

      https://github.com/oracle/linux-uek/commit/7b03e5dcc930cfdb3b081e57d90f6991ee52b359

       

      Fix in the Linux mainline:

      https://github.com/torvalds/linux/commit/0211e12dd0a5385ecffd3557bc570dbad7fcf245

       

      I patched the UEK5 kernel by hand with the Linux kernel commit, compiled, and tested it on an HP system that was failing.  It now works as expected.

       

      Will someone look into getting this patch integrated into UEK5?

       

      Thanks!

      John Grafton