9 Replies Latest reply: Jul 3, 2013 10:49 AM by Adhika W

    Grid Infrastructure Does Not Start Cluster Resources

    Adhika W

      Hello Gurus,

       

      I configured a 2-node RAC cluster using VirtualBox. It had been running fine all along: each time I started one of the nodes, all of the other Cluster Resources would eventually start.

      However, after I left it untouched for a month (with the VMs stopped), I found that after starting up the machine, only the local resources come ONLINE.

      This is what I get:

       

      [grid@oel6-112-rac1 bin]$ ./crsctl status resource -t
      --------------------------------------------------------------------------------
      NAME           TARGET  STATE        SERVER                   STATE_DETAILS
      --------------------------------------------------------------------------------
      Local Resources
      --------------------------------------------------------------------------------
      ora.CRS.dg
                     ONLINE  ONLINE       oel6-112-rac1
      ora.DATADG.dg
                     ONLINE  ONLINE       oel6-112-rac1
      ora.FRADG.dg
                     ONLINE  ONLINE       oel6-112-rac1
      ora.LISTENER.lsnr
                     OFFLINE OFFLINE      oel6-112-rac1
      ora.asm
                     ONLINE  ONLINE       oel6-112-rac1            Started
      ora.gsd
                     OFFLINE OFFLINE      oel6-112-rac1
      ora.net1.network
                     ONLINE  ONLINE       oel6-112-rac1
      ora.ons
                     ONLINE  ONLINE       oel6-112-rac1
      --------------------------------------------------------------------------------
      Cluster Resources
      --------------------------------------------------------------------------------
      ora.LISTENER_SCAN1.lsnr
            1        OFFLINE OFFLINE
      ora.LISTENER_SCAN2.lsnr
            1        OFFLINE OFFLINE
      ora.LISTENER_SCAN3.lsnr
            1        OFFLINE OFFLINE
      ora.cvu
            1        OFFLINE OFFLINE
      ora.oc4j
            1        OFFLINE OFFLINE
      ora.oel6-112-rac1.vip
            1        OFFLINE OFFLINE
      ora.oel6-112-rac2.vip
            1        OFFLINE OFFLINE
      ora.racdb.db
            1        OFFLINE OFFLINE                               Instance Shutdown
            2        OFFLINE OFFLINE
      ora.scan1.vip
            1        OFFLINE OFFLINE
      ora.scan2.vip
            1        OFFLINE OFFLINE
      ora.scan3.vip
            1        OFFLINE OFFLINE

       

      And these are my other resources:

       

      [grid@oel6-112-rac1 bin]$ ./crsctl status resource -t -init
      --------------------------------------------------------------------------------
      NAME           TARGET  STATE        SERVER                   STATE_DETAILS
      --------------------------------------------------------------------------------
      Cluster Resources
      --------------------------------------------------------------------------------
      ora.asm
            1        ONLINE  ONLINE       oel6-112-rac1            Started
      ora.cluster_interconnect.haip
            1        ONLINE  ONLINE       oel6-112-rac1
      ora.crf
            1        ONLINE  ONLINE       oel6-112-rac1
      ora.crsd
            1        ONLINE  ONLINE       oel6-112-rac1
      ora.cssd
            1        ONLINE  ONLINE       oel6-112-rac1
      ora.cssdmonitor
            1        ONLINE  ONLINE       oel6-112-rac1
      ora.ctssd
            1        ONLINE  ONLINE       oel6-112-rac1            ACTIVE:0
      ora.diskmon
            1        OFFLINE OFFLINE
      ora.evmd
            1        ONLINE  ONLINE       oel6-112-rac1
      ora.gipcd
            1        ONLINE  ONLINE       oel6-112-rac1
      ora.gpnpd
            1        ONLINE  ONLINE       oel6-112-rac1
      ora.mdnsd
            1        ONLINE  ONLINE       oel6-112-rac1

       

      Where am I supposed to look to find out why Cluster Resources such as the SCAN listeners and the database are not running?

      I've been checking the logs, but I haven't figured out what I should be looking for.

      Can somebody help me?


      Thank you in advance,

      Adhika

        • 1. Re: Grid Infrastructure Does Not Start Cluster Resources
          FreddieEssex

          Check your logfiles in $GRID_HOME/log/`hostname`.

           

          There will be a number of logfiles.  Check the alert log and the other log files and you will find the source of your issues.
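
As a starting point for the advice above, here is a minimal sketch of how one might summarize the Clusterware alert log. It assumes the 11.2 layout in which the alert log is $GRID_HOME/log/<hostname>/alert<hostname>.log and that GRID_HOME is set in the environment; the function name crs_code_summary is made up for illustration.

```shell
# Count how often each CRS-nnnn message code appears in an alert log,
# most frequent first. Takes the log file as its only argument.
crs_code_summary() {
    grep -o 'CRS-[0-9][0-9]*' "$1" | sort | uniq -c | sort -rn
}

# Assumed 11.2 alert log location; only used when GRID_HOME is set
# and the file is readable.
if [ -n "${GRID_HOME:-}" ]; then
    host=$(hostname -s)
    alert_log="$GRID_HOME/log/$host/alert$host.log"
    if [ -r "$alert_log" ]; then
        crs_code_summary "$alert_log"
    fi
fi
```

Codes that repeat many times around the startup timestamp are usually a reasonable place to begin reading before drilling down into the agent logs.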

          • 2. Re: Grid Infrastructure Does Not Start Cluster Resources
            Adhika W

            Thanks Freddie,

             

            I did that before I posted the question here.

            The reason I posted here is to ask for guidance on where, or in which file, to look for the issue.

             

            Regards,

            Adhika

            • 3. Re: Grid Infrastructure Does Not Start Cluster Resources
              FreddieEssex

              The issue could be in any of the files. I know you say that you've done that, but I reckon you've missed the errors in the logfile.

              The alert log would be a good place to start; from there, drill down into the other logfiles as well.

              I don't think anyone could answer your question of which specific logfile will contain the errors. It is more likely that more than one logfile will contain errors.

               

              Seek and ye shall find....

              • 4. Re: Grid Infrastructure Does Not Start Cluster Resources
                Adhika W

                Hi Freddie,

                 

                 

                I saw these lines in the Clusterware alert log:

                 

                2013-07-01 22:39:20.084
                [crsd(3338)]CRS-1012:The OCR service started on node oel6-112-rac1.
                2013-07-01 22:39:20.208
                [evmd(3145)]CRS-1401:EVMD started on node oel6-112-rac1.
                2013-07-01 22:39:21.549
                [crsd(3338)]CRS-1201:CRSD started on node oel6-112-rac1.
                2013-07-01 22:39:22.715
                [/u01/app/11203/grid/bin/oraagent.bin(3450)]CRS-5016:Process "/u01/app/11203/grid/bin/lsnrctl" spawned by agent "/u01/app/11203/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11203/grid/log/oel6-112-rac1/agent/crsd/oraagent_grid/oraagent_grid.log"
                2013-07-01 22:39:22.728
                [/u01/app/11203/grid/bin/oraagent.bin(3450)]CRS-5016:Process "/u01/app/11203/grid/bin/lsnrctl" spawned by agent "/u01/app/11203/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11203/grid/log/oel6-112-rac1/agent/crsd/oraagent_grid/oraagent_grid.log"
                2013-07-01 22:39:22.772
                [/u01/app/11203/grid/bin/oraagent.bin(3450)]CRS-5016:Process "/u01/app/11203/grid/bin/lsnrctl" spawned by agent "/u01/app/11203/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11203/grid/log/oel6-112-rac1/agent/crsd/oraagent_grid/oraagent_grid.log"
                2013-07-01 22:39:22.811
                [/u01/app/11203/grid/bin/oraagent.bin(3450)]CRS-5016:Process "/u01/app/11203/grid/bin/lsnrctl" spawned by agent "/u01/app/11203/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11203/grid/log/oel6-112-rac1/agent/crsd/oraagent_grid/oraagent_grid.log"
                2013-07-01 22:39:23.069
                [/u01/app/11203/grid/bin/oraagent.bin(3450)]CRS-5016:Process "/u01/app/11203/grid/opmn/bin/onsctli" spawned by agent "/u01/app/11203/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11203/grid/log/oel6-112-rac1/agent/crsd/oraagent_grid/oraagent_grid.log"
                2013-07-01 22:39:23.567
                [crsd(3338)]CRS-2772:Server 'oel6-112-rac1' has been assigned to pool 'Generic'.
                2013-07-01 22:39:23.568
                [crsd(3338)]CRS-2772:Server 'oel6-112-rac1' has been assigned to pool 'ora.racdb'.

                 

                Then I looked in the /u01/app/11203/grid/log/oel6-112-rac1/agent/crsd/oraagent_grid/oraagent_grid.log file and, at the same time (2013-07-01 22:39:22), found the following lines:

                2013-07-01 22:39:22.433: [ora.CRS.dg][1644160768] {1:13152:2} [check] DgpAgent::getConnxn connection failure 1
                2013-07-01 22:39:22.433: [ora.CRS.dg][1644160768] {1:13152:2} [check] DgpAgent::getConnxn failed CRS-5000: Expected resource ora.asm does not exist in agent process

                2013-07-01 22:39:22.434: [ora.CRS.dg][1644160768] {1:13152:2} [check] DgpAgent::getConnxn try getInstanceInforWhenASMFail
                2013-07-01 22:39:22.434: [ora.CRS.dg][1644160768] {1:13152:2} [check] CrsCmd::ClscrsCmdData::stat entity 1 statflag 33 useFilter 0

                 

                But that did not prevent ASM from starting properly.

                The only local resource that didn't start up automatically was the LISTENER.
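
The drill-down described above, from a CRS-5016 line in the alert log to the agent log it names, can be sketched roughly as follows. The function name crs5016_targets is hypothetical, and the pattern assumes the message wording shown in the excerpt above.

```shell
# Read alert-log text on stdin and print, once per distinct pair, the program
# whose action failed and the agent log file that the CRS-5016 message names.
crs5016_targets() {
    sed -n 's/.*CRS-5016:Process "\([^"]*\)".* in "\([^"]*\)".*/\1 \2/p' | sort -u
}
```

Feeding the alert log to this function on stdin collapses the repeated lsnrctl messages into one line, which makes it easier to see that only two programs (lsnrctl and onsctli) were involved and that both point at the same oraagent_grid.log.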

                 

                The following command shows that the local LISTENER has a hard dependency on ora.cluster_vip_net1.type, which is the type of ora.oel6-112-rac1.vip:

                [grid@oel6-112-rac1 bin]$ ./crsctl status resource ora.LISTENER.lsnr -p | grep -ie dependencies
                START_DEPENDENCIES=hard(type:ora.cluster_vip_net1.type) pullup(type:ora.cluster_vip_net1.type)
                STOP_DEPENDENCIES=hard(intermediate:type:ora.cluster_vip_net1.type)

                NAME=ora.oel6-112-rac1.vip
                TYPE=ora.cluster_vip_net1.type
                START_DEPENDENCIES=hard(ora.net1.network) pullup(ora.net1.network)
                STOP_DEPENDENCIES=hard(ora.net1.network)

                [grid@oel6-112-rac1 bin]$ ./crsctl status resource ora.net1.network -p | grep -ie dependencies
                START_DEPENDENCIES=
                STOP_DEPENDENCIES=

                 

                The ora.net1.network resource started properly, so I don't see how it would prevent ora.oel6-112-rac1.vip from starting up.

                The following lines also show that the ora.asm resource has only a weak dependency on ora.LISTENER.lsnr:

                 

                [grid@oel6-112-rac1 bin]$ ./crsctl status resource ora.racdb.db -p | grep -ie dependencies
                START_DEPENDENCIES=hard(ora.DATADG.dg,ora.FRADG.dg) weak(type:ora.listener.type,global:type:ora.scan_listener.type,uniform:ora.ons,global:ora.gns) pullup(ora.DATADG.dg,ora.FRADG.dg)
                STOP_DEPENDENCIES=hard(intermediate:ora.asm,shutdown:ora.DATADG.dg,shutdown:ora.FRADG.dg)

                [grid@oel6-112-rac1 bin]$ ./crsctl status resource ora.CRS.dg -p | grep -ie dependenci
                START_DEPENDENCIES=hard(ora.asm) pullup(ora.asm)
                STOP_DEPENDENCIES=hard(intermediate:ora.asm)

                [grid@oel6-112-rac1 bin]$ ./crsctl status resource ora.DATADG.dg -p | grep -ie dependencies
                START_DEPENDENCIES=hard(ora.asm) pullup(ora.asm)
                STOP_DEPENDENCIES=hard(intermediate:ora.asm)

                [grid@oel6-112-rac1 bin]$ ./crsctl status resource ora.FRADG.dg -p | grep -ie dependencies
                START_DEPENDENCIES=hard(ora.asm) pullup(ora.asm)
                STOP_DEPENDENCIES=hard(intermediate:ora.asm)

                [grid@oel6-112-rac1 bin]$ ./crsctl status resource ora.asm -p | grep -ie dependencies
                START_DEPENDENCIES=weak(ora.LISTENER.lsnr)
                STOP_DEPENDENCIES=
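
To make listings like the ones above easier to compare, here is a small sketch that pulls only the hard start dependencies out of `crsctl status resource <name> -p` output. The function name hard_start_deps is invented for this example, and it assumes a single hard(...) group per START_DEPENDENCIES line, as in the output shown.

```shell
# Read a resource profile on stdin and print each hard start dependency
# on its own line. Prints nothing if the resource has no hard(...) group.
hard_start_deps() {
    sed -n 's/^START_DEPENDENCIES=.*hard(\([^)]*\)).*/\1/p' | tr ',' '\n'
}
```

For example, piping the ora.racdb.db profile shown above into hard_start_deps would list ora.DATADG.dg and ora.FRADG.dg, while the ora.asm profile would produce no output because its only start dependency is weak.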

                 

                I'm a little lost here.

                A suggestion would be very much appreciated.

                 

                 

                Thank you,

                Adhika

                • 5. Re: Grid Infrastructure Does Not Start Cluster Resources
                  Tushar Thakker

                  Dear Adhika,

                   

                  Can you please post the result of "srvctl status nodeapps"? Also, please let us know whether you are able to manually start the SCAN, the listeners and the DB using SRVCTL. If that produces errors, then we know that Clusterware is also trying to start them automatically and failing. But if you are able to start them manually, then we can configure it accordingly.
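
The manual start being asked about might look like the sketch below. It assumes 11.2-style srvctl syntax, the node name from the listings earlier in the thread, and a database unique name of racdb (inferred from the ora.racdb.db resource name). The helpers run and start_stack_manually are names made up here. By default the script only prints the commands.

```shell
# Print each command instead of executing it unless DRY_RUN=0 is exported.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Start the SCAN VIPs, SCAN listeners, node listener and database in order,
# stopping at the first command that fails.
start_stack_manually() {
    node="$1"   # node name, e.g. oel6-112-rac1 (taken from the thread)
    db="$2"     # database unique name (assumed to be racdb)
    run srvctl start scan &&
    run srvctl start scan_listener &&
    run srvctl start listener -n "$node" &&
    run srvctl start database -d "$db"
}

start_stack_manually oel6-112-rac1 racdb
```

With DRY_RUN=0, an error from any of these commands would suggest the automatic start is hitting the same failure; a clean run would point toward configuration rather than a broken resource.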

                   

                  Regards

                  Tushar

                  • 6. Re: Grid Infrastructure Does Not Start Cluster Resources
                    Adhika W

                    Hello Tushar,

                     

                    This is the result of srvctl status nodeapps:

                    VIP oel6-112-rac1-vip is enabled
                    VIP oel6-112-rac1-vip is not running
                    VIP oel6-112-rac2-vip is enabled
                    VIP oel6-112-rac2-vip is not running
                    Network is enabled
                    Network is running on node: oel6-112-rac1
                    Network is not running on node: oel6-112-rac2
                    GSD is disabled
                    GSD is not running on node: oel6-112-rac1
                    GSD is not running on node: oel6-112-rac2
                    ONS is enabled
                    ONS daemon is running on node: oel6-112-rac1
                    ONS daemon is not running on node: oel6-112-rac2
                    
                    

                     

                    I can also start the rest of the resources successfully.

                    Basically, I never changed anything in the Clusterware configuration after the installation.

                    All of a sudden, resource startup just gets stuck at the ASM level.

                     

                    Thank you very much for helping me,

                    Adhika

                    • 7. Re: Grid Infrastructure Does Not Start Cluster Resources
                      Tushar Thakker

                      Dear Adhika,

                       

                      Since your node 2 services are not starting properly, let us first check with only node 1 running. Start only node 1 and check the status of the services. Since the network is started on node 1 but the VIP is not, I suspect the problem lies there. See whether both VIPs start when you are running only node 1 (with node 2 down). If not, try starting them using "srvctl start vip", "srvctl start scan" and "srvctl start scan_listener". Let me know the result and we will look further.

                       

                      Regards

                      Tushar

                      • 8. Re: Grid Infrastructure Does Not Start Cluster Resources
                        Adhika W

                        Hello Tushar,

                         

                        It was my intention to focus on node 1 first. Node 2 was always down.

                        However, if I start only node 2, the same issue happens there as well.

                        So, when I started only node 1, this is the list of services:

                        [grid@oel6-112-rac1 bin]$ ./crsctl status resource -t
                        --------------------------------------------------------------------------------
                        NAME           TARGET  STATE        SERVER                   STATE_DETAILS
                        --------------------------------------------------------------------------------
                        Local Resources
                        --------------------------------------------------------------------------------
                        ora.CRS.dg
                                       ONLINE  ONLINE       oel6-112-rac1
                        ora.DATADG.dg
                                       ONLINE  ONLINE       oel6-112-rac1
                        ora.FRADG.dg
                                       ONLINE  ONLINE       oel6-112-rac1
                        ora.LISTENER.lsnr
                                       OFFLINE OFFLINE      oel6-112-rac1
                        ora.asm
                                       ONLINE  ONLINE       oel6-112-rac1            Started
                        ora.gsd
                                       OFFLINE OFFLINE      oel6-112-rac1
                        ora.net1.network
                                       ONLINE  ONLINE       oel6-112-rac1
                        ora.ons
                                       ONLINE  ONLINE       oel6-112-rac1
                        --------------------------------------------------------------------------------
                        Cluster Resources
                        --------------------------------------------------------------------------------
                        ora.LISTENER_SCAN1.lsnr
                              1        OFFLINE OFFLINE
                        ora.LISTENER_SCAN2.lsnr
                              1        OFFLINE OFFLINE
                        ora.LISTENER_SCAN3.lsnr
                              1        OFFLINE OFFLINE
                        ora.cvu
                              1        OFFLINE OFFLINE
                        ora.oc4j
                              1        OFFLINE OFFLINE
                        ora.oel6-112-rac1.vip
                              1        OFFLINE OFFLINE
                        ora.oel6-112-rac2.vip
                              1        OFFLINE OFFLINE
                        ora.racdb.db
                              1        OFFLINE OFFLINE                               Instance Shutdown
                              2        OFFLINE OFFLINE
                        ora.scan1.vip
                              1        OFFLINE OFFLINE
                        ora.scan2.vip
                              1        OFFLINE OFFLINE
                        ora.scan3.vip
                              1        OFFLINE OFFLINE



                         

                        [grid@oel6-112-rac1 bin]$ ./crsctl status resource -t -init
                        --------------------------------------------------------------------------------
                        NAME           TARGET  STATE        SERVER                   STATE_DETAILS
                        --------------------------------------------------------------------------------
                        Cluster Resources
                        --------------------------------------------------------------------------------
                        ora.asm
                              1        ONLINE  ONLINE       oel6-112-rac1            Started
                        ora.cluster_interconnect.haip
                              1        ONLINE  ONLINE       oel6-112-rac1
                        ora.crf
                              1        ONLINE  ONLINE       oel6-112-rac1
                        ora.crsd
                              1        ONLINE  ONLINE       oel6-112-rac1
                        ora.cssd
                              1        ONLINE  ONLINE       oel6-112-rac1
                        ora.cssdmonitor
                              1        ONLINE  ONLINE       oel6-112-rac1
                        ora.ctssd
                              1        ONLINE  ONLINE       oel6-112-rac1            ACTIVE:0
                        ora.diskmon
                              1        OFFLINE OFFLINE
                        ora.evmd
                              1        ONLINE  ONLINE       oel6-112-rac1
                        ora.gipcd
                              1        ONLINE  ONLINE       oel6-112-rac1
                        ora.gpnpd
                              1        ONLINE  ONLINE       oel6-112-rac1
                        ora.mdnsd
                              1        ONLINE  ONLINE       oel6-112-rac1

                         

                        That's what I get no matter how long I wait; it won't start the rest of the services.

                        Starting the services manually does not cause any issue at all; they start successfully.

                        There must be somewhere to look for the cause of this.
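
One thing that can be read straight from the listings above is the TARGET column: the resources that stay down show OFFLINE as their target, not only as their state. The sketch below lists such resources from `crsctl status resource -t` output. The function name offline_targets is hypothetical. Whether the target explains the behavior here is an assumption to verify, for example by checking the AUTO_START attribute in `crsctl status resource <name> -p`; with a value of restore, Clusterware is documented to bring a resource back only if its target was ONLINE before the stack went down.

```shell
# Read "crsctl status resource -t" output on stdin and print every resource
# instance whose TARGET column is OFFLINE, together with its STATE.
offline_targets() {
    awk '
        $1 ~ /^ora\./ { name = $1; next }   # resource name line
        name != "" {
            # Detail line: optional instance number, then TARGET and STATE.
            i = ($1 ~ /^[0-9]+$/) ? 2 : 1
            if ($i == "OFFLINE" && ($(i + 1) == "ONLINE" || $(i + 1) == "OFFLINE"))
                print name, "target=" $i, "state=" $(i + 1)
        }
    '
}
```

A typical use would be `./crsctl status resource -t | offline_targets`, which on the output above would list the listener, gsd, and every cluster resource.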

                        Thanks for replying Tushar.

                         

                        Regards,

                        Adhika

                        • 9. Re: Grid Infrastructure Does Not Start Cluster Resources
                          Adhika W

                          In addition to that, I saw something weird in $GRID_HOME/log/<hostname>/agent/crsd/orarootagent_root/orarootagent_root.log:

                           

                          [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] ifname=eth0
                          [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] subnetmask=255.255.255.0
                          [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] subnetnumber=192.168.1.0
                          [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] InterfaceName = eth0
                          [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] HostName oel6-112-rac1-vip translated to 192.168.1.75
                          [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] Interface Name = eth0
                          [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] Ip Address = 192.168.1.75
                          [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] VipAgent::checkIp returned false
                          [    AGFW][67098368] {1:15932:2} ora.oel6-112-rac1.vip 1 1 state changed from: UNKNOWN to: OFFLINE
                          [    AGFW][67098368] {1:15932:2} Agent sending last reply for: RESOURCE_PROBE[ora.oel6-112-rac1.vip 1 1] ID 4097:110
                          [    AGFW][67098368] {1:15932:2} Agent received the message: RESOURCE_DELETE[ora.oel6-112-rac1.vip 1 1] ID 4358:152
                          [    AGFW][67098368] {1:15932:2} Agent sending last reply for: RESOURCE_DELETE[ora.oel6-112-rac1.vip 1 1] ID 4358:152
                          [    AGFW][67098368] {1:15932:2} ora.oel6-112-rac1.vip 1 1 marked as deleted.
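
To see at a glance which VIP resources the agent probed and found not running, the relevant log lines can be filtered as in this sketch. The function name vip_check_failures is made up, and the pattern assumes the log line format shown above.

```shell
# Read orarootagent log text on stdin and print each resource whose check
# action logged "VipAgent::checkIp returned false", once per resource.
vip_check_failures() {
    sed -n 's/^ *\[\([^]]*\)\].*VipAgent::checkIp returned false.*/\1/p' | sort -u
}
```

Comparing the result against `ip addr show eth0` (interface name taken from the log above) would show whether 192.168.1.75 is actually configured on the interface at that moment.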

                           

                          I hope this gives us something to investigate.

                           

                          Thank you,

                          Adhika