We have been upgrading many of our 2-node Solaris and RHEL DB clusters from 10gR2 to 11gR2 (18.104.22.168.0). We first upgrade the clusterware to Grid Infrastructure and then upgrade the DBs that reside on those clusters. We have had good success on our Solaris platforms. On the RHEL upgrades, however, we consistently fail when running the root.sh script on the 2nd node. Running root.sh on the first node always reports a successful CRS upgrade, but root.sh on the 2nd node always reports that the crs/ocr keys for the 2nd node cannot be found. We later find that the OCR was corrupted during the upgrade and, in some cases, removed entirely. Once we back out the failed attempt and recreate the OCR, the subsequent attempt to upgrade the clusterware often succeeds on both nodes.
A small bit of history on the RHEL platforms. These were built on RHEL4 many years ago and have been fine. About one year ago, the system admins upgraded the OS to RHEL5. The 10gR2 DBs ran fine on this new OS version. Now we are upgrading Oracle to 11gR2 on these RHEL5 DB servers. So here are my two high-level questions:
1.) Are there any known issues tied to 11gR2 upgrades specifically on RHEL5 that involve any type of OCR corruption? It seems odd that we always have success on Solaris, but never on RHEL5.
2.) In a few of our cases, we noticed that the public or private interfaces were incorrect and/or had to be rebuilt via the "oifcfg delif" and "oifcfg setif" commands. Could this type of misconfiguration cause OCR corruption? I am assuming that OCR corruption could cause the type of "cluster/ocr keys not found" messages that we have seen on occasion.
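
To make the interface question concrete, here is a minimal sketch of how the registered interfaces could be sanity-checked before an upgrade attempt. It assumes the output of "oifcfg getif" has been captured as text and that each line holds four whitespace-separated fields (interface name, subnet, scope, role). The function names, interface names, and subnets are hypothetical, and this is only an illustration of the check, not a diagnosis of the OCR problem.

```python
def parse_getif(output):
    """Parse `oifcfg getif`-style text into (name, subnet, scope, role) tuples.

    Assumes four whitespace-separated fields per line; other lines are skipped.
    """
    entries = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or unexpected lines
        entries.append(tuple(fields))
    return entries


def missing_roles(entries, required=("public", "cluster_interconnect")):
    """Return the required interface roles that have no registered entry."""
    present = {role for _, _, _, role in entries}
    return [role for role in required if role not in present]


# Hypothetical sample; real interface names and subnets will differ.
sample = """\
eth0  192.168.10.0  global  public
eth1  10.0.0.0  global  cluster_interconnect
"""

if __name__ == "__main__":
    print(missing_roles(parse_getif(sample)))
```

An empty list means both a public and a private (cluster_interconnect) interface are registered; any role returned would be one to fix before rerunning the upgrade.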