4 Replies Latest reply: Apr 14, 2013 5:00 PM by 821215 RSS

    Root.sh failed on node 2 11202 GI RHEL 5.5

    831039
      Hi Gurus,

      I 'm strugging with the installation of 11202 GI ,the root.sh failed on the 2nd node with the below output.

      {The following environment variables are set as:

          ORACLE_OWNER= vpacrs

          ORACLE_HOME=  /vpa01/app/11.2.0/grid



      Enter the full pathname of the local bin directory: [/usr/local/bin]:

      The contents of "dbhome" have not changed. No need to overwrite.

      The contents of "oraenv" have not changed. No need to overwrite.

      The contents of "coraenv" have not changed. No need to overwrite.



      Entries will be added to the /etc/oratab file as needed by

      Database Configuration Assistant when a database is created

      Finished running generic part of root script.

      Now product-specific root actions will be performed.

      Using configuration parameter file: /vpa01/app/11.2.0/grid/crs/install/crsconfig_params

      CRS-2672: Attempting to start 'ora.cssdmonitor' on 'hwvpa6'

      CRS-2676: Start of 'ora.cssdmonitor' on 'hwvpa6' succeeded

      CRS-2672: Attempting to start 'ora.cssd' on 'hwvpa6'

      CRS-2672: Attempting to start 'ora.diskmon' on 'hwvpa6'

      CRS-2676: Start of 'ora.diskmon' on 'hwvpa6' succeeded

      CRS-2676: Start of 'ora.cssd' on 'hwvpa6' succeeded



      tart of resource "ora.ctssd" failed

      CRS-2672: Attempting to start 'ora.ctssd' on 'hwvpa6'

      CRS-2674: Start of 'ora.ctssd' on 'hwvpa6' failed

      CRS-4000: Command Start failed, or completed with errors.

      Cluster Time Synchronisation Service start in exclusive mode failed at /vpa01/app/11.2.0/grid/crs/install/crsconfig_lib

      .pm line 6456.

      /vpa01/app/11.2.0/grid/perl/bin/perl -I/vpa01/app/11.2.0/grid/perl/lib -I/vpa01/app/11.2.0/grid/crs/install /vpa01/app/1

      1.2.0/grid/crs/install/rootcrs.pl execution failed

      [root@hwvpa6 init.d]#

      }


      ocssd.log
      {
      2011-01-12 16:34:42.727: [   SKGFD][1108093248]Fetching asmlib disk :ORCL:ASMDATA17:

      2011-01-12 16:34:42.727: [   SKGFD][1108093248]Fetching asmlib disk :ORCL:ASMDATA18:

      2011-01-12 16:34:42.727: [   SKGFD][1108093248]Fetching asmlib disk :ORCL:ASMDATA19:

      2011-01-12 16:34:42.727: [   SKGFD][1108093248]Fetching asmlib disk :ORCL:ASMDATA20:

      2011-01-12 16:34:42.727: [   SKGFD][1108093248]Fetching asmlib disk :ORCL:OCR_VOTE01:

      2011-01-12 16:34:42.727: [   SKGFD][1108093248]Fetching asmlib disk :ORCL:OCR_VOTE02:

      2011-01-12 16:34:42.727: [   SKGFD][1108093248]Fetching asmlib disk :ORCL:OCR_VOTE03:

      2011-01-12 16:34:42.727: [   SKGFD][1108093248]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
      )
      2011-01-12 16:34:42.727: [   SKGFD][1108093248]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
      )
      2011-01-12 16:34:42.728: [    CSSD][1108093248]clssnmvDiskVerify: Successful discovery of 0 disks
      2011-01-12 16:34:42.728: [    CSSD][1108093248]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
      2011-01-12 16:34:42.728: [    CSSD][1108093248]clssnmvFindInitialConfigs: No voting files found
      2011-01-12 16:34:42.728: [    CSSD][1108093248]clssnmCompleteVFDiscovery: Completing voting file discovery
      2011-01-12 16:34:42.728: [    CSSD][1108093248]clssnmvVerifyCommittedConfigVFs: Insufficient voting files found, found 0 of 0 configured, needed 1 voting files
      2011-01-12 16:34:42.728: [    CSSD][1108093248]clssnmCompleteVFDiscovery: Committed configuration for CIN 0:1294868082:0
      2011-01-12 16:34:42.728: [    CSSD][1108093248] misscount 30 reboot latency 3
      2011-01-12 16:34:42.728: [    CSSD][1108093248] long I/O timeout 200 short I/O timeout 27
      2011-01-12 16:34:42.728: [    CSSD][1108093248] diagnostic wait 0 active version 11.2.0.2.0
      2011-01-12 16:34:42.728: [    CSSD][1108093248] Listing unique IDs for 0 voting files:
      2011-01-12 16:34:42.728: [    CSSD][1825119520]clssnmOpenGIPCEndp: opening cluster listener on gipc://hwvpa6:nm_hwvpa-cluster
      2011-01-12 16:34:42.728: [GIPCHGEN][1825119520] gipchaInternalRegister: Initializing HA GIPC

      }

      ctssd.log

      {Oracle Database 11g Clusterware Release 11.2.0.2.0 - Production Copyright 1996, 2010 Oracle. All rights reserved.
      2011-01-12 16:34:53.178: [    CTSS][3926195488]Oracle Database CTSS Release 11.2.0.2.0 Production Copyright 2006, 2007 Oracle. All rights reserved.
      2011-01-12 16:34:53.178: [    CTSS][3926195488]ctss_scls_init: SCLs Context is 0x64b5ce0
      [  clsdmt][1095989568]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=hwvpa6DBG_CTSSD))
      2011-01-12 16:34:53.180: [  clsdmt][1095989568]PID for the Process [5374], connkey 11
      2011-01-12 16:34:53.180: [  clsdmt][1095989568]Creating PID [5374] file for home /vpa01/app/11.2.0/grid host hwvpa6 bin ctss to /vpa01/app/11.2.0/grid/ctss/init/
      2011-01-12 16:34:53.180: [  clsdmt][1095989568]Writing PID [5374] to the file [vpa01/app/11.2.0/grid/ctss/init/hwvpa6.pid]
      2011-01-12 16:34:54.132: [    CTSS][1095989568]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0x0], offset[0 ms]}, length=[8].
      2011-01-12 16:34:54.136: [    CTSS][3926195488]ctss_css_init: CSS Context is 0x658d8f0
      2011-01-12 16:34:54.136: [    CTSS][3926195488]ctss_init: CTSS production mode
      2011-01-12 16:34:54.136: [    CTSS][3926195488]ctss_init: Env var CTSS_REBOOT is undefined or contains non-boolean value. Ignoring CTSS_REBOOT.
      2011-01-12 16:34:54.136: [    CTSS][3926195488]sclsctss_gvss1: NTP default config file found
      2011-01-12 16:34:54.137: [    CTSS][3926195488]sclsctss_gvss8: Return [0] and NTP status [2].
      2011-01-12 16:34:54.137: [    CTSS][3926195488]ctss_check_vendor_sw: Vendor time sync software is detected. status [2].
      2011-01-12 16:34:54.137: [    CTSS][3926195488]ctss_check_vendor_sw: Ctssd is switching to observer role
      2011-01-12 16:34:54.137: [    GIPC][3926195488] gipcCheckInitialization: possible incompatible non-threaded init from [prom.c : 687], original from [clsss.c : 5325]
      [ default][3926195488]Failure 4 in trying to open SV key SYSTEM.version.localhost

      [ default][3926195488]procr_open_key error 4 errorbuf : PROCL-4: The local registry key to be operated on does not exist.

      2011-01-12 16:34:54.139: [    CTSS][3926195488]clsctss_r_av2: Error [3] retrieving Active Version from OLR. Returns [19].
      2011-01-12 16:34:54.141: [    CTSS][3926195488](:ctss_init16:): Error [19] retrieving active version. Returns [19].
      2011-01-12 16:34:54.141: [    CTSS][3926195488]ctss_main: CTSS init failed [19]
      2011-01-12 16:34:54.141: [    CTSS][3926195488]ctss_main: CTSS daemon aborting [19].
      2011-01-12 16:34:54.141: [    CTSS][3926195488]CTSS daemon aborting
      2011-01-12 16:34:55.134: [    CTSS][1095989568]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0x22], offset[0 ms]}, length=[8].
      2011-01-12 16:34:56.136: [    CTSS][1095989568]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0x22], offset[0 ms]}, length=[8].
      2011-01-12 16:34:57.137: [    CTSS][1095989568]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0x22], offset[0 ms]}, length=[8].
      2011-01-13 06:46:35.319: [    CTSS][2359520544]Oracle Database CTSS Release 11.2.0.2.0 Production Copyright 2006, 2007 Oracle. All rights reserved.
      2011-01-13 06:46:35.319: [    CTSS][2359520544]ctss_scls_init: SCLs Context is 0x28eece0
      [  clsdmt][1082968384]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=hwvpa6DBG_CTSSD))
      2011-01-13 06:46:35.321: [  clsdmt][1082968384]PID for the Process [17832], connkey 11
      2011-01-13 06:46:35.321: [  clsdmt][1082968384]Creating PID [17832] file for home /vpa01/app/11.2.0/grid host hwvpa6 bin ctss to /vpa01/app/11.2.0/grid/ctss/init/
      2011-01-13 06:46:35.321: [  clsdmt][1082968384]Writing PID [17832] to the file [vpa01/app/11.2.0/grid/ctss/init/hwvpa6.pid]
      2011-01-13 06:46:36.274: [    CTSS][1082968384]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0x0], offset[0 ms]}, length=[8].
      2011-01-13 06:46:36.279: [    CTSS][2359520544]ctss_css_init: CSS Context is 0x29c68a0
      2011-01-13 06:46:36.279: [    CTSS][2359520544]ctss_init: CTSS production mode
      2011-01-13 06:46:36.279: [    CTSS][2359520544]ctss_init: Env var CTSS_REBOOT is undefined or contains non-boolean value. Ignoring CTSS_REBOOT.
      2011-01-13 06:46:36.279: [    CTSS][2359520544]sclsctss_gvss1: NTP default config file found
      2011-01-13 06:46:36.279: [    CTSS][2359520544]sclsctss_gvss8: Return [0] and NTP status [2].
      2011-01-13 06:46:36.279: [    CTSS][2359520544]ctss_check_vendor_sw: Vendor time sync software is detected. status [2].
      2011-01-13 06:46:36.279: [    CTSS][2359520544]ctss_check_vendor_sw: Ctssd is switching to observer role
      2011-01-13 06:46:36.280: [    GIPC][2359520544] gipcCheckInitialization: possible incompatible non-threaded init from [prom.c : 687], original from [clsss.c : 5325]
      [ default][2359520544]Failure 4 in trying to open SV key SYSTEM.version.localhost

      [ default][2359520544]procr_open_key error 4 errorbuf : PROCL-4: The local registry key to be operated on does not exist.

      2011-01-13 06:46:36.282: [    CTSS][2359520544]clsctss_r_av2: Error [3] retrieving Active Version from OLR. Returns [19].
      2011-01-13 06:46:36.285: [    CTSS][2359520544](:ctss_init16:): Error [19] retrieving active version. Returns [19].
      2011-01-13 06:46:36.285: [    CTSS][2359520544]ctss_main: CTSS init failed [19]
      2011-01-13 06:46:36.285: [    CTSS][2359520544]ctss_main: CTSS daemon aborting [19].
      2011-01-13 06:46:36.285: [    CTSS][2359520544]CTSS daemon aborting
      2011-01-13 06:46:37.278: [    CTSS][1082968384]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0x22], offset[0 ms]}, length=[8].
      2011-01-13 06:46:38.281: [    CTSS][1082968384]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0x22], offset[0 ms]}, length=[8].
      2011-01-13 06:46:39.284: [    CTSS][1082968384]ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0x22], offset[0 ms]}, length=[8].
      }

      I made the changes also as per note ML 1050164.1 and re-ran the root.sh but same error.


      Thanks

      Edited by: 828036 on Jan 13, 2011 1:12 PM

      Edited by: 828036 on Jan 13, 2011 1:13 PM
        • 1. Re: Root.sh failed on node 2 11202 GI RHEL 5.5
          Levi Pereira
          Hi,

          Sorry i dont read the subject...

          But you can try this...

          *Troubleshooting 11.2 Grid Infastructure Installation Root.sh Issues [ID 1053970.1]*

          Regards,
          Levi Pereira

          Edited by: Levi Pereira on Jan 13, 2011 8:14 PM
          • 2. Re: Root.sh failed on node 2 11202 GI RHEL 5.5
            580716
            See following errors while discovering the ASM disks on the node:

            2011-01-12 16:34:42.727: [ SKGFD][1108093248]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted
            )
            2011-01-12 16:34:42.727: [ SKGFD][1108093248]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted

            you may like to check why second node is not able to do discovery of the ASM disks. Check output of the following command on the failed node:

            /usr/sbin/oracleasm-discover ORCL:*

            It should display the configured ASM disks.

            Harish Kumar
            • 3. Re: Root.sh failed on node 2 11202 GI RHEL 5.5
              Levi Pereira
              Hi,

              ASM disks are located on multipath devices?

              If yes...

              Check this note..

              *11gR2 CRS doesn't startup after node reboot [ID 1050164.1]*

              or

              http://abiliusta.blogspot.com/2010/03/ora-15186-asmlib-mesg-operation-not.html


              Regards,
              Levi Pereira
              • 4. Re: Root.sh failed on node 2 11202 GI RHEL 5.5
                821215
                I encountered a very similar issue while upgrading to 11.2.0.3.

                Although this thread is very old, I'm posting this here in case it is of benefit to others, as I was unable to find anything useful about this problem online or in the MoS portal.

                ----------------
                Symptoms:
                rootupgrade.sh fails with the following errors:

                Replacing Clusterware entries in inittab
                Start of resource "ora.ctssd" failed
                CRS-2672: Attempting to start 'ora.ctssd' on 'servername03'
                CRS-2674: Start of 'ora.ctssd' on 'servername03' failed
                CRS-4000: Command Start failed, or completed with errors.
                Failed to start Oracle Grid Infrastructure stack
                Failed to start Cluster Time Synchronisation Service - CTSS at /u01/11.2.0.3/grid/crs/install/crsconfig_lib.pm line 1221.
                /u01/11.2.0.3/grid/perl/bin/perl -I/u01/11.2.0.3/grid/perl/lib -I/u01/11.2.0.3/grid/crs/install /u01/11.2.0.3/grid/crs/install/rootcrs.pl execution failed

                octssd.log file shows error similar to:
                --------------------------------------------------------------
                2013-04-13 20:27:31.649: [ default][2535589600]clsvactversion:4: Retrieving Active Version from local storage.
                2013-04-13 20:27:31.652: [    CTSS][2535589600]clsctss_r_av3: Invalid active version [] retrieved from OLR. Returns [19].
                2013-04-13 20:27:31.653: [    CTSS][2535589600](:ctss_init16:): Error [19] retrieving active version. Returns [19].
                2013-04-13 20:27:31.653: [    CTSS][2535589600]ctss_main: CTSS init failed [19]
                2013-04-13 20:27:31.654: [    CTSS][2535589600]ctss_main: CTSS daemon aborting [19].
                2013-04-13 20:27:31.654: [    CTSS][2535589600]CTSS daemon aborting

                --------------------------------------------------------------
                BUT “crsctl query crs activeversion” works fine

                Cause:
                OLR is corrupt on this node, but OCR is fine.
                [root@servername03 ~]# ocrcheck -local
                Status of Oracle Local Registry is as follows :
                Version : 3
                Total space (kbytes) : 262120
                Used space (kbytes) : 2608
                Available space (kbytes) : 259512
                ID : 1880313695
                Device/File Name : /u01/11.2.0.3/grid/cdata/servername03.olr
                Device/File integrity check succeeded

                Local registry integrity check succeeded

                Logical corruption check failed

                Solution:
                IF this occurs prior to the ActiveVersion being changed by the upgrader, you can try to restore a backup of the OLR

                [root@servername03 ~]# /u01/11.2.0.3/grid/bin/ocrconfig -local -showbackup
                [root@servername03 ~]# /u01/11.2.0.3/grid/bin/ocrconfig -local -restore /u01/11.2.0/grid/cdata/servername03/backup_20110508_173907.olr ## Substitute with one of the backups above

                Restart CRS (crsctl start crs) from the old GRID_HOME and rerun rootupgrade.sh. It should resume where it left off and continue the upgrade.