6 Replies Latest reply on Jul 18, 2019 6:28 AM by Sunny kichloo

    CRS Startup Issue ==

    3519264

      Hi All,

       

      We have an issue with a 3-node RAC cluster: after a server reboot, CRS does not come up and throws errors. We opened a case with the vendor, with no real benefit so far. The vendor suspected a problem with the grid user equivalence, but the same problem persists after fixing it. To troubleshoot, we ran cluvfy, which reported failures. The OHASD logs also show errors, and I cannot tell whether they are related to the lock files that cluvfy flagged.

       

       

      Snippet from cluvfy logs ==

       

      ./runcluvfy.sh stage -pre crsinst -n node1,node2,node3 -r -fixup -verbose

       

       

      Failures were encountered during execution of CVU verification request "stage -pre crsinst".

       

      Verifying User Mask ...FAILED

      node3: PRVF-7611 : Proper user file creation  mask (umask) for user

                     "grid" is not found on node "node3" [Expected = "0022" ;

                     Found = "0077"]

       

      node2: PRVF-7611 : Proper user file creation  mask (umask) for user

                     "grid" is not found on node "node2" [Expected = "0022" ;

                     Found = "0077"]

       

      node1: PRVF-7611 : Proper user file creation  mask (umask) for user

                     "grid" is not found on node "node1" [Expected = "0022" ;

                     Found = "0077"]

       

      Verifying Domain Sockets ...FAILED

      node3: PRVG-11750 : File

                     "/var/tmp/.oracle/ora_gipc_GPNPD_node3_lock" exists on

                     node "node3".

      node3: PRVG-11750 : File "/var/tmp/.oracle/ora_gipc_node3_INIT"

                     exists on node "node3".

      node3: PRVG-11750 : File

                     "/var/tmp/.oracle/ora_gipc_node3_INIT_lock" exists on

                     node "node3".

      node3: PRVG-11750 : File "/var/tmp/.oracle/npohasd" exists on node

                     "node3".

      node3: PRVG-11750 : File

                     "/var/tmp/.oracle/ora_gipc_scls_opct_node3_lock" exists

                     on node "node3".

      node3: PRVG-11750 : File

                     "/var/tmp/.oracle/ora_gipc_node3_EVMD_lock" exists on

                     node "node3".

      node3: PRVG-11750 : File "/var/tmp/.oracle/npohasd2" exists on node

                     "node3".

      node3: PRVG-11750 : File

                     "/var/tmp/.oracle/ora_gipc_node3_MDNSD_lock" exists on

                     node "node3".

      node3: PRVG-11750 : File

                     "/var/tmp/.oracle/ora_gipc_node3_GPNPD_lock" exists on

                     node "node3".

       

      Snippet from OHASD logs ==

       

      2019-07-11 11:35:23.378 :CLSDYNAM:1744828160: [ora.evmd]{0:0:2} [start] DaemonAgent::start 110 returned UNPLANNED-OFFLINE/UNKNOWN

      2019-07-11 11:35:24.379 :CLSDYNAM:1740625664: [ora.mdnsd]{0:0:2} [start] DaemonAgent::start 030 clsdmCheck without returnbuf timeout:479998

      2019-07-11 11:35:24.380 :CLSDYNAM:1744828160: [ora.evmd]{0:0:2} [start] DaemonAgent::start 030 clsdmCheck without returnbuf timeout:479997

      2019-07-11 11:35:24.382 :  CLSDMC:1740625664: Connecting to ipc://node3_MDNSD

      2019-07-11 11:35:34.441 :CLSDYNAM:1744828160: [ora.evmd]{0:0:2} [start] ClsdmClient::sendMessage clsdmc_send error rmsg:0 ecode:-7 errbuf:

      2019-07-11 11:35:34.441 :CLSDYNAM:1744828160: [ora.evmd]{0:0:2} [start] DaemonAgent::clsdmCheck 040 sendMessage excp:

      2019-07-11 11:35:34.441 :CLSDYNAM:1744828160: [ora.evmd]{0:0:2} [start] DaemonAgent::start 110 returned UNPLANNED-OFFLINE/UNKNOWN

      2019-07-11 11:35:35.444 :CLSDYNAM:1740625664: [ora.mdnsd]{0:0:2} [start] DaemonAgent::clsdmCheck 040 sendMessage excp:

      2019-07-11 11:35:35.444 :CLSDYNAM:1740625664: [ora.mdnsd]{0:0:2} [start] DaemonAgent::start 110 returned UNPLANNED-OFFLINE/UNKNOWN
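To see at a glance which resources the agent failed to start, log lines like the ones above can be summarized by resource name. The following is a minimal sketch, not an Oracle tool: `failed_resources` is a hypothetical helper name, and the log file path is passed as an argument because its location depends on the Grid Infrastructure version.

```shell
# Count how often each ora.* resource reported UNPLANNED-OFFLINE in a log file.
# failed_resources is a hypothetical helper name, not part of any Oracle tooling.
failed_resources() {
    logfile=$1
    grep 'UNPLANNED-OFFLINE' "$logfile" |
        sed -n 's/.*\[\(ora\.[^]]*\)\].*/\1/p' |
        sort | uniq -c
}

# Example (the path is a placeholder; pass whichever OHASD log you are reading):
#   failed_resources /path/to/ohasd.trc
```

Each output line is a count followed by a resource name, so a resource that fails repeatedly stands out from one that failed once.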

       

       

      Regards

        • 1. Re: CRS Startup Issue ==
          Sunny kichloo

          Bug 9181189 is mentioned in the following MOS note:

           

          CLUVFY Reports "Default user file creation mask check failed" (Doc ID 1051783.1)

           

          Verify the umask setting manually as the grid user on all nodes. If the output is 22, 022, or 0022, you can ignore this warning.
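The manual check can be scripted so that 22, 022, and 0022 are all treated as the same value. This is only a sketch: `umask_matches_0022` is a hypothetical helper name, and it is meant to be run as the grid user on each node in turn.

```shell
# Return success if the given umask string equals 0022, ignoring leading zeros.
# umask_matches_0022 is a hypothetical helper name.
umask_matches_0022() {
    value=$(printf '%s' "$1" | sed 's/^0*//')
    [ "$value" = "22" ]
}

# Run as the grid user on each node:
#   if umask_matches_0022 "$(umask)"; then
#       echo "umask OK"
#   else
#       echo "umask is $(umask), expected 0022"
#   fi
```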

          • 2. Re: CRS Startup Issue ==
            3519264

            Thanks for your input, Sunny, but do you think this can prevent CRS from starting on node 3? I am still clueless as to why CRS is not starting up.

             

            Regards

            • 3. Re: CRS Startup Issue ==
              Sunny kichloo

              What was your umask?

               

              Was it 22 or 022 or 0022?

               

              Normally, if the umask is correct, you can then check whether there are any error messages in the CRS logs.

              • 4. Re: CRS Startup Issue ==
                3519264

                Hi Sunny,

                 

                Apologies for the delayed response. The umask value is 0077 on all 3 nodes; the output follows. The vendor recommends cleaning up the following files, but that didn't help in our case either, and we feel the issue lies somewhere else.

                 

                 

                == Vendor Suggestion ===

                 

                Remove <ORACLE_BASE>/crsdata/<node>/output/gipcdOUT.trc and <ORACLE_BASE>/crsdata/<node>/output/gipcd.pid and restart.
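A more cautious variant of this suggestion is to rename the two files rather than delete them, so they can be restored or handed to support later. The sketch below shows that swapped-in approach, not the vendor's exact procedure. `set_aside` is a hypothetical helper, and the `<ORACLE_BASE>` and `<node>` placeholders still have to be replaced with real values.

```shell
# Rename each existing file to <name>.<timestamp>.bak instead of deleting it.
# set_aside is a hypothetical helper name; missing files are skipped silently.
set_aside() {
    stamp=$(date +%Y%m%d%H%M%S)
    for f in "$@"; do
        if [ -e "$f" ]; then
            mv -- "$f" "$f.$stamp.bak"
        fi
    done
}

# Example, with the placeholders from the vendor text filled in by hand:
#   set_aside "<ORACLE_BASE>/crsdata/<node>/output/gipcdOUT.trc" \
#             "<ORACLE_BASE>/crsdata/<node>/output/gipcd.pid"
# Then restart the stack as root, for example:
#   crsctl stop crs -f
#   crsctl start crs
```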

                 

                -bash-4.2$ umask

                0077

                -bash-4.2$ id

                uid=54322(grid) gid=662(oinstall) groups=662(oinstall),54322(dba),54323(oper),54324(backupdba),54325(dgdba),54326(kmdba),54327(asmdba),54328(asmoper),54329(asmadmin),54330(racdba) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

                -bash-4.2$ umask

                0077

                 

                Regards

                • 5. Re: CRS Startup Issue ==
                  3519264

                  Hi Sunny,

                   

                  Can this be because of file permissions? I see that many files in $GRID_HOME/bin have different permissions: the permissions on the working node differ from those on node 3, where the issue exists.
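One way to pin down the differences is to dump the mode, owner, group, and name of every file in the directory on each node and diff the two listings. This is a minimal sketch that assumes GNU find (for `-printf`); `perm_listing` and the output file names are hypothetical.

```shell
# Print "mode owner group name" for each regular file directly under a directory,
# sorted by file name so that listings from two nodes can be diffed.
# Assumes GNU find (for -printf); perm_listing is a hypothetical helper name.
perm_listing() {
    dir=$1
    find "$dir" -maxdepth 1 -type f -printf '%m %u %g %f\n' | LC_ALL=C sort -k4
}

# On each node (output file names are placeholders):
#   perm_listing "$GRID_HOME/bin" > /tmp/bin_perms_$(hostname -s).txt
# Then copy both files to one host and compare:
#   diff /tmp/bin_perms_node1.txt /tmp/bin_perms_node3.txt
```

Any line that appears in the diff names a file whose mode, owner, or group differs between the working node and node 3.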

                   

                  Regards

                  • 6. Re: CRS Startup Issue ==
                    Sunny kichloo

                    Yes, file permissions matter in Oracle RAC. This may be causing the issue.

                     

                    Tell me one thing: did you perform any activity before the reboot, such as patching?

                     

                    Also, was node 3 working fine before the reboot?