12 Replies Latest reply on Feb 21, 2008 12:40 AM by 624329

    Not able to start instance using srvctl

    537688
      I am not able to start one of my instances using srvctl.

      srvctl start instance -d prmand -i prmand1
      PRKP-1001 : Error starting instance PRMAND1 on node auohsrman01
      CRS-0215: Could not start resource 'ora.PRMAND.PRMAND1.inst'.

      However, I can start the same instance from sqlplus.

      This is 10.2.0.3 RAC on Linux.

      Here is the crs_stat output:

      crs_stat -t
      Name            Type         Target   State    Host
      ------------------------------------------------------------
      ora....D1.inst  application  ONLINE   UNKNOWN  auohsrman01
      ora....D2.inst  application  ONLINE   ONLINE   auohsrman02
      ora.PRMAND.db   application  OFFLINE  OFFLINE
      ora....n01.gsd  application  ONLINE   ONLINE   auohsrman01
      ora....n01.ons  application  ONLINE   ONLINE   auohsrman01
      ora....n01.vip  application  ONLINE   ONLINE   auohsrman01
      ora....n02.gsd  application  ONLINE   ONLINE   auohsrman02
      ora....n02.ons  application  ONLINE   ONLINE   auohsrman02
      ora....n02.vip  application  ONLINE   ONLINE   auohsrman02


      Instance 1 shows a state of UNKNOWN, and ora.PRMAND.db shows both its target and state as OFFLINE.

      Please help.
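A minimal sketch for spotting the problem rows in a listing like the one above: it prints every resource whose State differs from its Target. The function name `crs_mismatch` is hypothetical, and the sketch assumes the columns are Name, Type, Target, State, Host separated by whitespace, as in the output posted here.

```shell
#!/bin/sh
# crs_mismatch: read `crs_stat -t` style output on stdin and print the
# resources whose State (column 4) differs from their Target (column 3).
# Assumption: whitespace-separated columns Name, Type, Target, State, Host.
crs_mismatch() {
    awk '
        $1 == "Name"  { next }   # skip the header row
        $1 ~ /^-+$/   { next }   # skip the dashed separator row
        NF >= 4 && $3 != $4 { print $1, "target=" $3, "state=" $4 }
    '
}
```

It could be used as `crs_stat -t | crs_mismatch`. A resource such as ora.PRMAND.db, whose target and state are both OFFLINE, is not flagged by this check and still needs to be read by eye.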
        • 1. Re: Not able to start instance using srvctl
          ViragSharma
          Check the cssd and CRS log files under CRS_HOME/log for errors, and post their contents here.

          # Virag Sharma
          • 2. Re: Not able to start instance using srvctl
            537688
            Here are the logs.

            cssdOUT.log

            setsid: failed with -1/1
            calling getpwnam_r (ororacrs)
            completed getpwnam_r (ororacrs)
            2007-06-21 00:38
            ==========

            ocssd.log
            =========
            [    CSSD]2007-06-21 23:48:12.540 [1210108256] >TRACE: clssgmClientConnectMsg: Connect from con(0x2a98104550) proc(0x2a98106cc0) pid() proto(10:2:1:1)
            [    CSSD]2007-06-21 23:49:12.885 [1210108256] >TRACE: clssgmClientConnectMsg: Connect from con(0x7a1e30) proc(0x79c3a0) pid() proto(10:2:1:1)
            [    CSSD]2007-06-21 23:50:13.240 [1210108256] >TRACE: clssgmClientConnectMsg: Connect from con(0x79a570) proc(0x79c2d0) pid() proto(10:2:1:1)
            [    CSSD]2007-06-21 23:51:13.589 [1210108256] >TRACE: clssgmClientConnectMsg: Connect from con(0x7a4520) proc(0x79c2d0) pid() proto(10:2:1:1)
            [    CSSD]2007-06-21 23:52:13.932 [1210108256] >TRACE: clssgmClientConnectMsg: Connect from con(0x7a4520) proc(0x79c2d0) pid() proto(10:2:1:1)
            [    CSSD]2007-06-21 23:53:14.294 [1210108256] >TRACE: clssgmClientConnectMsg: Connect from con(0x79a570) proc(0x79c2d0) pid() proto(10:2:1:1)

            ========
            2007-06-21 08:03:31.635: [  CRSRES][1541613920]0Stop of `ora.auohsrman01.vip` on member `auohsrman01` succeeded.
            2007-06-21 08:03:31.671: [  CRSRES][1543715168]0Stop of `ora.auohsrman02.vip` on member `auohsrman02` succeeded.
            2007-06-21 08:03:49.717: [  CRSRES][1539512672]0Stop of `ora.PRMAND.PRMAND2.inst` on member `auohsrman02` succeeded.
            2007-06-21 08:04:35.212: [  CRSRES][1539512672]0Attempting to start `ora.PRMAND.PRMAND2.inst` on member `auohsrman02`
            2007-06-21 08:04:35.224: [  CRSRES][1541613920]0startRunnable: setting CLI values
            2007-06-21 08:04:35.225: [  CRSRES][1541613920]0Attempting to start `ora.auohsrman01.vip` on member `auohsrman01`
            2007-06-21 08:04:35.241: [  CRSRES][1543715168]0Attempting to start `ora.auohsrman02.vip` on member `auohsrman02`
            2007-06-21 08:04:38.650: [  CRSRES][1541613920]0Start of `ora.auohsrman01.vip` on member `auohsrman01` succeeded.
            2007-06-21 08:04:38.680: [  CRSRES][1543715168]0Start of `ora.auohsrman02.vip` on member `auohsrman02` succeeded.
            2007-06-21 08:04:55.589: [  CRSRES][1539512672]0Start of `ora.PRMAND.PRMAND2.inst` on member `auohsrman02` succeeded.
            2007-06-21 08:04:55.627: [  CRSRES][1545816416]0CRS-1002: Resource 'ora.auohsrman02.ons' is already running on member 'auohsrman02'

            2007-06-21 08:04:55.663: [  CRSRES][1503840608]0startRunnable: setting CLI values
            2007-06-21 08:04:55.663: [  CRSRES][1503840608]0Attempting to start `ora.auohsrman01.ons` on member `auohsrman01`
            2007-06-21 08:04:55.716: [  CRSRES][1541613920]0startRunnable: setting CLI values
            2007-06-21 08:04:55.717: [  CRSRES][1541613920]0Attempting to start `ora.auohsrman01.gsd` on member `auohsrman01`
            2007-06-21 08:04:55.746: [  CRSRES][1543715168]0Attempting to start `ora.auohsrman02.gsd` on member `auohsrman02`
            2007-06-21 08:04:56.078: [  CRSRES][1543715168]0Start of `ora.auohsrman02.gsd` on member `auohsrman02` succeeded.
            2007-06-21 08:04:56.106: [  CRSRES][1541613920]0Start of `ora.auohsrman01.gsd` on member `auohsrman01` succeeded.
            2007-06-21 08:04:57.158: [  CRSRES][1503840608]0Start of `ora.auohsrman01.ons` on member `auohsrman01` succeeded.
            =========
            crsd.log
            =======
            t redirection failed for `/oracrs/oracle/product/102/crs/log/startRBmvH9.stdout` : No such file or directory
            2007-06-21 07:38:25.140: [  CRSEVT][1516431712]0CAAMonitorHandler :: 0:Action Script for resource 'ora.auohsrman01.vip' stdout redirection failed for `/oracrs/oracle/product/102/crs/log/startRBmvH9.stdout` : No such file or directory
            2007-06-21 07:39:15.099: [  CRSRES][1516431712]0Resource ora.PRMAND.PRMAND1.inst has been moved out of UNKNOWN into OFFLINE
            2007-06-21 07:39:15.121: [  CRSRES][1503840608]0startRunnable: setting CLI values
            2007-06-21 07:39:15.136: [  CRSRES][1503840608]0Attempting to start `ora.PRMAND.PRMAND1.inst` on member `auohsrman01`
            2007-06-21 07:39:15.180: [  CRSAPP][1503840608]0StartResource error for ora.PRMAND.PRMAND1.inst error code = 1
            2007-06-21 07:39:15.243: [  CRSAPP][1503840608]0StopResource error for ora.PRMAND.PRMAND1.inst error code = 1
            2007-06-21 07:39:15.248: [  CRSRES][1503840608]0X_OP_StopResourceFailed : Stop Resource failed
            (File: rti.cpp, line: 1796

            2007-06-21 07:39:15.248: [  CRSRES][1503840608][ALERT]0`ora.PRMAND.PRMAND1.inst` on member `auohsrman01` has experienced an unrecoverable failure.
            2007-06-21 07:39:15.248: [  CRSRES][1503840608]0Human intervention required to resume its availability.
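Long CRS logs like these are mostly routine start and stop messages, so it can help to filter for the failure lines first. The sketch below is only an illustration: the function name `crs_failures` is hypothetical, and the search patterns are taken from the excerpt above, not from any official list of CRS error strings.

```shell
#!/bin/sh
# crs_failures FILE...: print, with line numbers, the log lines that look
# like failures. The patterns are assumptions based on the excerpt in this
# thread ("error code", "[ALERT]", "failed", "unrecoverable").
crs_failures() {
    grep -E -n 'error code|\[ALERT\]|failed|unrecoverable' "$@"
}
```

For example, `crs_failures path/to/crsd.log` would surface the StartResource and StopResource errors for ora.PRMAND.PRMAND1.inst without the surrounding noise.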
            • 3. Re: Not able to start instance using srvctl
              Harish Kumar
              Try recreating the sockets on the problematic node as shown below and see if that helps.

              Stop CRS on auohsrman01, clean up all the sockets, and restart CRS on auohsrman01:

              srvctl stop nodeapps -n auohsrman01
              crsctl stop crs            # as root
              rm /var/tmp/.oracle/*      # as root
              crsctl start crs           # as root

              Thanks & Regards
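The socket cleanup suggested above can be wrapped in a small script that prints the commands before anything is run. This is a sketch under stated assumptions: the helper names `run` and `reset_crs_sockets` and the `DRY_RUN` and `SOCKET_DIR` variables are hypothetical, and the socket directory defaults to the /var/tmp/.oracle path given in the post.

```shell
#!/bin/sh
# run CMD...: print the command when DRY_RUN=1 (the default), else execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# reset_crs_sockets NODE: stop nodeapps and CRS, remove the socket files,
# then start CRS again. The post runs the crsctl and rm steps as root.
reset_crs_sockets() {
    node=$1
    sockdir=${SOCKET_DIR:-/var/tmp/.oracle}   # path taken from the post
    run srvctl stop nodeapps -n "$node" || return 1
    run crsctl stop crs || return 1
    # Remove the entries inside the directory, not the directory itself.
    for f in "$sockdir"/*; do
        [ -e "$f" ] || continue
        run rm "$f" || return 1
    done
    run crsctl start crs
}
```

Calling `reset_crs_sockets auohsrman01` only prints the four steps; setting `DRY_RUN=0` would actually execute them, so review the printed commands first.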
              • 4. Re: Not able to start instance using srvctl
                537688
                I recreated the sockets, but that didn't help either.

                $ srvctl start instance -d prmand -i prmand1
                PRKP-1001 : Error starting instance PRMAND1 on node auohsrman01
                CRS-1028: Dependency analysis failed because of:
                CRS-0223: Resource 'ora.PRMAND.PRMAND1.inst' has placement error.

                $ srvctl status database -d prmand
                PRKO-2015 : Error in checking condition of instance on node: auohsrman01
                Instance PRMAND2 is running on node auohsrman02
                • 5. Re: Not able to start instance using srvctl
                  ViragSharma
                  What value have you set for sqlnet.inbound_connect_timeout?
                  In ORACLE_HOME/log/racg you will find a file named imon*<dbname>.log; check that file for errors.

                  ~Virag Sharma
                  • 6. Re: Not able to start instance using srvctl
                    537688
                    sqlnet.inbound_connect_timeout=600.

                    In ORACLE_HOME/log/racg, I couldn't find any imon*<dbname>.log files.

                    Below are the logs present:

                    default.log
                    ora.PRMAND.db.log
                    ora.auohsrman02.vip.log
                    ora.auohsrman01.vip.log
                    ora.auohsrman01.ons.log
                    • 7. Re: Not able to start instance using srvctl
                      ViragSharma
                      ons and vip look fine in the crs_stat output, so the only thing left to check is ora.PRMAND.db.log for errors.

                      Also check that /etc/hosts has entries for all VIPs, hosts, etc., and see Metalink Note:311321.1.
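One way to run the /etc/hosts check suggested above is to list the names that have no entry. This is a minimal sketch: the function name `hosts_missing` is hypothetical, and the VIP hostnames in the usage note are assumptions, since the thread does not give them.

```shell
#!/bin/sh
# hosts_missing FILE NAME...: print each NAME that does not appear as a
# hostname or alias (field 2 onward) on a non-comment line of FILE.
hosts_missing() {
    file=$1
    shift
    for name in "$@"; do
        awk -v n="$name" '
            /^[[:space:]]*#/ { next }            # skip comment lines
            {
                for (i = 2; i <= NF; i++) {
                    if ($i ~ /^#/) break         # ignore trailing comments
                    if ($i == n) found = 1
                }
            }
            END { exit found ? 0 : 1 }
        ' "$file" || echo "$name"
    done
}
```

For example, `hosts_missing /etc/hosts auohsrman01 auohsrman02 auohsrman01-vip auohsrman02-vip`, where the two `-vip` names are placeholders for whatever the VIPs are really called. No output means every name was found.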
                      • 8. Re: Not able to start instance using srvctl
                        537688
                        Hi All,

                        Issue resolved. The problem was the permissions and ownership of $ORACLE_HOME/log/racg: on one node it was owned by root. I changed it to the oracle user and restarted CRS.

                        Thanks for all your comments.
                        • 9. Re: Not able to start instance using srvctl
                          496244
                          How did you change the owner and permissions, and on which directory? Is it under crs? Please give me the steps.
                          • 10. Re: Not able to start instance using srvctl
                            517299
                            I believe the mysterious user was referring to the ownership of $ORACLE_HOME/log/racg (either the directory itself or the files within it). These were owned by root, and said user changed them to be owned by oracle. Will UserX please verify these steps? Thanks.

                            As root:
                            # cd $ORACLE_HOME
                            # ls
                            --- Is there a log directory? You may need to go into a crs or other named directory before you see log.

                            # cd log
                            # ls -lt | grep racg
                            --- Check the ownership of racg.
                            # ls -lt racg
                            --- Verify the ownership of the files within racg.

                            --- To change ownership (don't just type it because it's written here...you need to make sure you understand the ownership on other nodes.)

                            # chown -R oracle racg

                            --- This recursively changes the owner to the user oracle through all subdirectories of racg.

                            --- Restart crs (stop it first, then start it)
                            # crsctl stop crs
                            # crsctl start crs
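Before changing anything, it may be useful to list exactly which entries have the wrong owner, and to compare that list across nodes. The sketch below assumes the expected owner is the oracle user, as in this thread; the function name `not_owned_by` is hypothetical.

```shell
#!/bin/sh
# not_owned_by USER DIR: list every file and directory under DIR
# (including DIR itself) whose owner is not USER.
not_owned_by() {
    find "$2" ! -user "$1" -print
}
```

For example, `not_owned_by oracle "$ORACLE_HOME/log/racg"` run on each node. Empty output means everything under racg is already owned by oracle; any paths printed are candidates for the chown described above.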
                            • 11. Re: Not able to start instance using srvctl
                              517299
                              removed double post

                              Message was edited by:
                              digitalntburn
                              • 12. Re: Not able to start instance using srvctl
                                624329
                                digitalntburn,

                                Thanks for the very useful post; I was pulling my hair out trying to figure out why my CRS instances wouldn't start and wouldn't log any information anywhere. It turned out that various racg and log directories were owned by root and/or did not have write access by the dba group (we run instances under separate user IDs to 'oracle' which are in the dba group).

                                Sometimes I think CRS was deliberately designed to be nearly impossible to debug!

                                Annihilannic
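For setups like the one described in the last reply, where instances run under user IDs other than oracle and rely on dba group access, a similar check can list directories the group cannot write to. This is a sketch only: the function name `no_group_write` is hypothetical, and the dba group name comes from that reply.

```shell
#!/bin/sh
# no_group_write GROUP DIR: list directories under DIR (including DIR)
# that either do not belong to GROUP or lack the group write bit.
no_group_write() {
    find "$2" -type d \( ! -group "$1" -o ! -perm -020 \) -print
}
```

For example, `no_group_write dba "$ORACLE_HOME/log"`. Which directories actually need group write access depends on the installation, so the output is a list to review rather than a list to fix blindly.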