This discussion is archived
9 Replies. Latest reply: Jul 3, 2013 8:49 AM by Adhika W

Grid Infrastructure Does Not Start Cluster Resources

Adhika W Newbie

Hello Gurus,

 

I configured a 2-node RAC cluster using VirtualBox. It had been running fine all along, and each time I started one of the nodes, all of the other Cluster Resources would eventually start.

However, after leaving it untouched for a month (the VM was stopped), I found that after starting up the machine, only the local resources come ONLINE.

This is what I get:

 

[grid@oel6-112-rac1 bin]$ ./crsctl status resource -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRS.dg
               ONLINE  ONLINE       oel6-112-rac1
ora.DATADG.dg
               ONLINE  ONLINE       oel6-112-rac1
ora.FRADG.dg
               ONLINE  ONLINE       oel6-112-rac1
ora.LISTENER.lsnr
               OFFLINE OFFLINE      oel6-112-rac1
ora.asm
               ONLINE  ONLINE       oel6-112-rac1            Started
ora.gsd
               OFFLINE OFFLINE      oel6-112-rac1
ora.net1.network
               ONLINE  ONLINE       oel6-112-rac1
ora.ons
               ONLINE  ONLINE       oel6-112-rac1
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        OFFLINE OFFLINE
ora.LISTENER_SCAN2.lsnr
      1        OFFLINE OFFLINE
ora.LISTENER_SCAN3.lsnr
      1        OFFLINE OFFLINE
ora.cvu
      1        OFFLINE OFFLINE
ora.oc4j
      1        OFFLINE OFFLINE
ora.oel6-112-rac1.vip
      1        OFFLINE OFFLINE
ora.oel6-112-rac2.vip
      1        OFFLINE OFFLINE
ora.racdb.db
      1        OFFLINE OFFLINE                               Instance Shutdown
      2        OFFLINE OFFLINE
ora.scan1.vip
      1        OFFLINE OFFLINE
ora.scan2.vip
      1        OFFLINE OFFLINE
ora.scan3.vip
      1        OFFLINE OFFLINE
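
To pull just the resources that are not ONLINE out of a listing like the one above, a small filter can help. This is a sketch only: list_offline is a hypothetical helper name, and it assumes the tabular layout shown here (a resource name line starting with "ora." followed by one TARGET/STATE line per instance). It prints the TARGET column too, since in this listing the offline cluster resources show TARGET OFFLINE as well as STATE OFFLINE.

```shell
#!/bin/sh
# list_offline: read `crsctl status resource -t` output on stdin and print
# "name instance target state" for every instance whose STATE is not ONLINE.
# Assumption: resource names start with "ora." as in the listing above.
list_offline() {
    awk '
        $1 ~ /^ora\./ { name = $1; next }        # resource name line
        name == "" || NF < 2 { next }            # header, blank or separator
        {
            # Cluster resources carry an instance number before TARGET.
            if ($1 ~ /^[0-9]+$/) { inst = $1; target = $2; state = $3 }
            else                 { inst = "-"; target = $1; state = $2 }
            # Only accept real TARGET values, which skips section titles.
            if (target != "ONLINE" && target != "OFFLINE") next
            if (state != "ONLINE") print name, inst, target, state
        }
    '
}
```

It could be used as `./crsctl status resource -t | list_offline`; blank lines in the input are ignored.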

 

and these are my other resources

 

[grid@oel6-112-rac1 bin]$ ./crsctl status resource -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       oel6-112-rac1            Started
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       oel6-112-rac1
ora.crf
      1        ONLINE  ONLINE       oel6-112-rac1
ora.crsd
      1        ONLINE  ONLINE       oel6-112-rac1
ora.cssd
      1        ONLINE  ONLINE       oel6-112-rac1
ora.cssdmonitor
      1        ONLINE  ONLINE       oel6-112-rac1
ora.ctssd
      1        ONLINE  ONLINE       oel6-112-rac1            ACTIVE:0
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  ONLINE       oel6-112-rac1
ora.gipcd
      1        ONLINE  ONLINE       oel6-112-rac1
ora.gpnpd
      1        ONLINE  ONLINE       oel6-112-rac1
ora.mdnsd
      1        ONLINE  ONLINE       oel6-112-rac1

 

Where am I supposed to look to find out why Cluster Resources such as the SCAN listeners, the database, and so on are not running?

I've been checking the logs, but I haven't figured out what I should be looking for.

Can somebody help me?

 

Thank you in advance,

Adhika

  • 1. Re: Grid Infrastructure Does Not Start Cluster Resources
    FreddieEssex Pro

    Check your logfiles in $GRID_HOME/log/`hostname`.

     

    There will be a number of logfiles.  Check the alert log and the other log files and you will find the source of your issues.
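
A minimal sketch of where those logfiles sit, assuming the 11.2 Grid Infrastructure layout under $GRID_HOME/log/<hostname>. The function name crs_log_candidates is made up for illustration, the alert log and crsd.log file names are assumptions based on that layout, and the oraagent_grid directory assumes the Grid software owner is the OS user grid.

```shell
#!/bin/sh
# crs_log_candidates GRID_HOME NODE
# Print the Clusterware log files that are usually worth reading first.
crs_log_candidates() {
    grid_home=$1
    node=$2
    log_root="$grid_home/log/$node"
    # Clusterware alert log (file name assumed from the 11.2 layout).
    printf '%s\n' "$log_root/alert$node.log"
    # CRSD daemon log (assumed location).
    printf '%s\n' "$log_root/crsd/crsd.log"
    # CRSD agent logs: one for grid-owned resources, one for root-owned ones.
    printf '%s\n' "$log_root/agent/crsd/oraagent_grid/oraagent_grid.log"
    printf '%s\n' "$log_root/agent/crsd/orarootagent_root/orarootagent_root.log"
}
```

For example, `crs_log_candidates "$GRID_HOME" "$(hostname -s)"` prints one path per line, which can then be handed to `ls -l` or `tail`.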

  • 2. Re: Grid Infrastructure Does Not Start Cluster Resources
    Adhika W Newbie

    Thanks Freddie,

     

    I've done that before I posted the question here.

    The reason I posted the question here is to ask for guidance on where to look, or which file to check, for the issue.

     

    Regards,

    Adhika

  • 3. Re: Grid Infrastructure Does Not Start Cluster Resources
    FreddieEssex Pro

    The issue could be in any of the files. I know you say that you've done that, but I reckon you've missed the errors in the logfile.

    The alert log would be a good place to start; from there, drill down into the other logfiles as well.

    I don't think anyone could answer your question of which specific logfile will contain the errors. More than likely, more than one logfile will contain errors.

     

    Seek and ye shall find....

  • 4. Re: Grid Infrastructure Does Not Start Cluster Resources
    Adhika W Newbie

    Hi Freddie,

     

     

    I saw these lines in the Clusterware alert log:

     

    2013-07-01 22:39:20.084
    [crsd(3338)]CRS-1012:The OCR service started on node oel6-112-rac1.
    2013-07-01 22:39:20.208
    [evmd(3145)]CRS-1401:EVMD started on node oel6-112-rac1.
    2013-07-01 22:39:21.549
    [crsd(3338)]CRS-1201:CRSD started on node oel6-112-rac1.
    2013-07-01 22:39:22.715
    [/u01/app/11203/grid/bin/oraagent.bin(3450)]CRS-5016:Process "/u01/app/11203/grid/bin/lsnrctl" spawned by agent "/u01/app/11203/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11203/grid/log/oel6-112-rac1/agent/crsd/oraagent_grid/oraagent_grid.log"
    2013-07-01 22:39:22.728
    [/u01/app/11203/grid/bin/oraagent.bin(3450)]CRS-5016:Process "/u01/app/11203/grid/bin/lsnrctl" spawned by agent "/u01/app/11203/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11203/grid/log/oel6-112-rac1/agent/crsd/oraagent_grid/oraagent_grid.log"
    2013-07-01 22:39:22.772
    [/u01/app/11203/grid/bin/oraagent.bin(3450)]CRS-5016:Process "/u01/app/11203/grid/bin/lsnrctl" spawned by agent "/u01/app/11203/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11203/grid/log/oel6-112-rac1/agent/crsd/oraagent_grid/oraagent_grid.log"
    2013-07-01 22:39:22.811
    [/u01/app/11203/grid/bin/oraagent.bin(3450)]CRS-5016:Process "/u01/app/11203/grid/bin/lsnrctl" spawned by agent "/u01/app/11203/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11203/grid/log/oel6-112-rac1/agent/crsd/oraagent_grid/oraagent_grid.log"
    2013-07-01 22:39:23.069
    [/u01/app/11203/grid/bin/oraagent.bin(3450)]CRS-5016:Process "/u01/app/11203/grid/opmn/bin/onsctli" spawned by agent "/u01/app/11203/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11203/grid/log/oel6-112-rac1/agent/crsd/oraagent_grid/oraagent_grid.log"
    2013-07-01 22:39:23.567
    [crsd(3338)]CRS-2772:Server 'oel6-112-rac1' has been assigned to pool 'Generic'.
    2013-07-01 22:39:23.568
    [crsd(3338)]CRS-2772:Server 'oel6-112-rac1' has been assigned to pool 'ora.racdb'.
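
The CRS-5016 entries above repeat the same few failures, and each one names the agent log that holds the details. As a rough sketch (crs5016_summary is a hypothetical helper, and it assumes the message keeps the quoted-field layout seen in this excerpt), the following counts them per failing process and action:

```shell
#!/bin/sh
# crs5016_summary: summarise CRS-5016 lines from a Clusterware alert log.
# Reads the files given as arguments, or stdin when none are given.
# Output: "<count> <failed process> <action> <detail log>".
crs5016_summary() {
    # Splitting on double quotes puts the process in $2, the action in $6
    # and the detail log path in $10.
    awk -F'"' '
        /CRS-5016:/ { seen[$2 " " $6 " " $10]++ }
        END { for (k in seen) print seen[k], k }
    ' "$@" | LC_ALL=C sort -k2
}
```

Run against the alert log, this collapses the excerpt above into one line for lsnrctl and one for onsctli, both pointing at the same oraagent_grid.log.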

     

    Then I started looking in the /u01/app/11203/grid/log/oel6-112-rac1/agent/crsd/oraagent_grid/oraagent_grid.log file and found the following lines at the same time (2013-07-01 22:39:22):

    2013-07-01 22:39:22.433: [ora.CRS.dg][1644160768] {1:13152:2} [check] DgpAgent::getConnxn connection failure 1
    2013-07-01 22:39:22.433: [ora.CRS.dg][1644160768] {1:13152:2} [check] DgpAgent::getConnxn failed CRS-5000: Expected resource ora.asm does not exist in agent process
    2013-07-01 22:39:22.434: [ora.CRS.dg][1644160768] {1:13152:2} [check] DgpAgent::getConnxn try getInstanceInforWhenASMFail
    2013-07-01 22:39:22.434: [ora.CRS.dg][1644160768] {1:13152:2} [check] CrsCmd::ClscrsCmdData::stat entity 1 statflag 33 useFilter 0

     

    But that did not prevent ASM from starting properly.

    The only local resource that didn't start up automatically was the LISTENER.

     

    The following command shows that the local LISTENER has a hard dependency on the ora.cluster_vip_net1.type resource type, which is the type of ora.oel6-112-rac1.vip:

    [grid@oel6-112-rac1 bin]$ ./crsctl status resource ora.LISTENER.lsnr -p | grep -ie dependencies
    START_DEPENDENCIES=hard(type:ora.cluster_vip_net1.type) pullup(type:ora.cluster_vip_net1.type)
    STOP_DEPENDENCIES=hard(intermediate:type:ora.cluster_vip_net1.type)

    NAME=ora.oel6-112-rac1.vip
    TYPE=ora.cluster_vip_net1.type
    START_DEPENDENCIES=hard(ora.net1.network) pullup(ora.net1.network)
    STOP_DEPENDENCIES=hard(ora.net1.network)

    [grid@oel6-112-rac1 bin]$ ./crsctl status resource ora.net1.network -p | grep -ie dependencies
    START_DEPENDENCIES=
    STOP_DEPENDENCIES=

     

    The ora.net1.network resource started properly, and I didn't see anything indicating that it prevented ora.oel6-112-rac1.vip from starting up.

     

    The following lines also show that the ora.asm resource has only a weak dependency on ora.LISTENER.lsnr:

     

    [grid@oel6-112-rac1 bin]$ ./crsctl status resource ora.racdb.db -p | grep -ie dependencies
    START_DEPENDENCIES=hard(ora.DATADG.dg,ora.FRADG.dg) weak(type:ora.listener.type,global:type:ora.scan_listener.type,uniform:ora.ons,global:ora.gns) pullup(ora.DATADG.dg,ora.FRADG.dg)
    STOP_DEPENDENCIES=hard(intermediate:ora.asm,shutdown:ora.DATADG.dg,shutdown:ora.FRADG.dg)

    [grid@oel6-112-rac1 bin]$ ./crsctl status resource ora.CRS.dg -p | grep -ie dependenci
    START_DEPENDENCIES=hard(ora.asm) pullup(ora.asm)
    STOP_DEPENDENCIES=hard(intermediate:ora.asm)

    [grid@oel6-112-rac1 bin]$ ./crsctl status resource ora.DATADG.dg -p | grep -ie dependencies
    START_DEPENDENCIES=hard(ora.asm) pullup(ora.asm)
    STOP_DEPENDENCIES=hard(intermediate:ora.asm)

    [grid@oel6-112-rac1 bin]$ ./crsctl status resource ora.FRADG.dg -p | grep -ie dependencies
    START_DEPENDENCIES=hard(ora.asm) pullup(ora.asm)
    STOP_DEPENDENCIES=hard(intermediate:ora.asm)

    [grid@oel6-112-rac1 bin]$ ./crsctl status resource ora.asm -p | grep -ie dependencies
    START_DEPENDENCIES=weak(ora.LISTENER.lsnr)
    STOP_DEPENDENCIES=
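
Walking the dependency chain by hand, as above, gets tedious. Here is a minimal sketch that pulls only the hard start dependencies out of `crsctl status resource <name> -p` output. hard_start_deps is a hypothetical helper, and it assumes each clause is written without embedded spaces, as in the output shown here.

```shell
#!/bin/sh
# hard_start_deps: read a resource profile on stdin and print the entries of
# the hard(...) clause of START_DEPENDENCIES, one per line.
# Modifiers such as "type:" are kept as part of the entry.
hard_start_deps() {
    # 1. keep only the value of START_DEPENDENCIES
    # 2. put each clause (hard, weak, pullup) on its own line
    # 3. keep the contents of hard(...) clauses
    # 4. split the comma-separated entries
    sed -n 's/^[[:space:]]*START_DEPENDENCIES=//p' |
        tr ' ' '\n' |
        sed -n 's/^hard(\(.*\))$/\1/p' |
        tr ',' '\n'
}
```

For the ora.racdb.db profile above, `./crsctl status resource ora.racdb.db -p | hard_start_deps` would print ora.DATADG.dg and ora.FRADG.dg; for ora.asm it prints nothing, because that profile has only a weak clause.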

     

    I'm a little lost here.

    A suggestion would be very much appreciated.

     

     

    Thank you,

    Adhika

  • 5. Re: Grid Infrastructure Does Not Start Cluster Resources
    TusharThakker Explorer

    Dear Adhika,

     

    Can you please post the result of "srvctl status nodeapps"? Also, please let us know whether you are able to manually start the SCAN, the listeners, and the DB using SRVCTL. If you get errors doing that, then we know the automatic start is attempting the same thing and failing. But if you are able to start them manually, then we can configure it accordingly.

     

    Regards

    Tushar

  • 6. Re: Grid Infrastructure Does Not Start Cluster Resources
    Adhika W Newbie

    Hello Tushar,

     

    This is the result of srvctl status nodeapps:

    VIP oel6-112-rac1-vip is enabled
    VIP oel6-112-rac1-vip is not running
    VIP oel6-112-rac2-vip is enabled
    VIP oel6-112-rac2-vip is not running
    Network is enabled
    Network is running on node: oel6-112-rac1
    Network is not running on node: oel6-112-rac2
    GSD is disabled
    GSD is not running on node: oel6-112-rac1
    GSD is not running on node: oel6-112-rac2
    ONS is enabled
    ONS daemon is running on node: oel6-112-rac1
    ONS daemon is not running on node: oel6-112-rac2
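
To condense output like this into just the components that are down, a small filter can help. This is only a sketch: not_running is a hypothetical helper, and it assumes the English message wording shown above.

```shell
#!/bin/sh
# not_running: read `srvctl status nodeapps` output on stdin and print
# "<component><TAB><node>" for each "is not running" line.
# Lines without a node (the VIP lines above) get "-" as the node.
not_running() {
    awk '
        / is not running/ {
            node = "-"
            if (match($0, /on node: /)) {
                node = substr($0, RSTART + RLENGTH)
                sub(/[ \t\r]+$/, "", node)       # trim trailing whitespace
            }
            sub(/ is not running.*$/, "")        # keep the component name
            sub(/^[ \t]+/, "")                   # drop leading indentation
            print $0 "\t" node
        }
    '
}
```

Usage would be `srvctl status nodeapps | not_running`. Disabled components such as GSD still show up, so the "is disabled" lines are worth reading alongside the result.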
    

     

    I can also start the rest of the resources successfully.

    Basically, I have never changed anything in the clusterware configuration since the installation.

    All of a sudden, resource startup just gets stuck at the ASM level.

     

    Thank you very much for helping me,

    Adhika

  • 7. Re: Grid Infrastructure Does Not Start Cluster Resources
    TusharThakker Explorer

    Dear Adhika,

     

    Since your node 2 services are not starting properly, let us first check with only node 1 running. Start only node 1 and see the status of the services. Since the network is started on node 1 and the VIP is not, I suspect something is wrong there. See whether both VIPs start when you are running only node 1 (with node 2 down). If not, try starting them using "srvctl start vip", "srvctl start scan" and "srvctl start scan_listener". Let me know the result and we will look further.
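
The manual start sequence suggested here could be wrapped as below. This is a sketch, not a tested procedure: start_net_stack is a hypothetical name, the -n option on srvctl start vip is an assumption based on 11.2 syntax, and by default the function only prints what it would run.

```shell
#!/bin/sh
# start_net_stack NODE: start the node VIP, then the SCAN VIPs, then the
# SCAN listeners. Prints the commands unless DRY_RUN=0 is set.
start_net_stack() {
    node=$1
    for cmd in \
        "srvctl start vip -n $node" \
        "srvctl start scan" \
        "srvctl start scan_listener"
    do
        if [ "${DRY_RUN:-1}" = "0" ]; then
            # Word splitting of $cmd is intentional: it holds a full command.
            $cmd || { echo "failed: $cmd" >&2; return 1; }
        else
            echo "would run: $cmd"
        fi
    done
}
```

Calling `start_net_stack oel6-112-rac1` lists the three commands in order; setting DRY_RUN=0 in the environment runs them and stops at the first failure, which is the point where the error message is worth posting.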

     

    Regards

    Tushar

  • 8. Re: Grid Infrastructure Does Not Start Cluster Resources
    Adhika W Newbie

    Hello Tushar,

     

    It was my intention to focus on node 1 first. Node 2 was always down.

    However, if I start only node 2, the same issue happens there.

     

    So, when I started only node 1, this is the list of services:

    [grid@oel6-112-rac1 bin]$ ./crsctl status resource -t
    --------------------------------------------------------------------------------
    NAME           TARGET  STATE        SERVER                   STATE_DETAILS
    --------------------------------------------------------------------------------
    Local Resources
    --------------------------------------------------------------------------------
    ora.CRS.dg
                   ONLINE  ONLINE       oel6-112-rac1
    ora.DATADG.dg
                   ONLINE  ONLINE       oel6-112-rac1
    ora.FRADG.dg
                   ONLINE  ONLINE       oel6-112-rac1
    ora.LISTENER.lsnr
                   OFFLINE OFFLINE      oel6-112-rac1
    ora.asm
                   ONLINE  ONLINE       oel6-112-rac1            Started
    ora.gsd
                   OFFLINE OFFLINE      oel6-112-rac1
    ora.net1.network
                   ONLINE  ONLINE       oel6-112-rac1
    ora.ons
                   ONLINE  ONLINE       oel6-112-rac1
    --------------------------------------------------------------------------------
    Cluster Resources
    --------------------------------------------------------------------------------
    ora.LISTENER_SCAN1.lsnr
          1        OFFLINE OFFLINE
    ora.LISTENER_SCAN2.lsnr
          1        OFFLINE OFFLINE
    ora.LISTENER_SCAN3.lsnr
          1        OFFLINE OFFLINE
    ora.cvu
          1        OFFLINE OFFLINE
    ora.oc4j
          1        OFFLINE OFFLINE
    ora.oel6-112-rac1.vip
          1        OFFLINE OFFLINE
    ora.oel6-112-rac2.vip
          1        OFFLINE OFFLINE
    ora.racdb.db
          1        OFFLINE OFFLINE                               Instance Shutdown
          2        OFFLINE OFFLINE
    ora.scan1.vip
          1        OFFLINE OFFLINE
    ora.scan2.vip
          1        OFFLINE OFFLINE
    ora.scan3.vip
          1        OFFLINE OFFLINE



     

    [grid@oel6-112-rac1 bin]$ ./crsctl status resource -t -init
    --------------------------------------------------------------------------------
    NAME           TARGET  STATE        SERVER                   STATE_DETAILS
    --------------------------------------------------------------------------------
    Cluster Resources
    --------------------------------------------------------------------------------
    ora.asm
          1        ONLINE  ONLINE       oel6-112-rac1            Started
    ora.cluster_interconnect.haip
          1        ONLINE  ONLINE       oel6-112-rac1
    ora.crf
          1        ONLINE  ONLINE       oel6-112-rac1
    ora.crsd
          1        ONLINE  ONLINE       oel6-112-rac1
    ora.cssd
          1        ONLINE  ONLINE       oel6-112-rac1
    ora.cssdmonitor
          1        ONLINE  ONLINE       oel6-112-rac1
    ora.ctssd
          1        ONLINE  ONLINE       oel6-112-rac1            ACTIVE:0
    ora.diskmon
          1        OFFLINE OFFLINE
    ora.evmd
          1        ONLINE  ONLINE       oel6-112-rac1
    ora.gipcd
          1        ONLINE  ONLINE       oel6-112-rac1
    ora.gpnpd
          1        ONLINE  ONLINE       oel6-112-rac1
    ora.mdnsd
          1        ONLINE  ONLINE       oel6-112-rac1

     

    That's what I get no matter how long I wait; it won't start the rest of the services.

    Starting the services manually does not cause any issue at all; they start successfully.

     

    There must be somewhere to look for the cause of this.

    Thanks for replying Tushar.

     

    Regards,

    Adhika

  • 9. Re: Grid Infrastructure Does Not Start Cluster Resources
    Adhika W Newbie

    In addition to that, I saw something weird in $GRID_HOME/log/<hostname>/agent/crsd/orarootagent_root/orarootagent_root.log:

     

    [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] ifname=eth0
    [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] subnetmask=255.255.255.0
    [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] subnetnumber=192.168.1.0
    [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] InterfaceName = eth0
    [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] HostName oel6-112-rac1-vip translated to 192.168.1.75
    [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] Interface Name = eth0
    [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] Ip Address = 192.168.1.75
    [ora.oel6-112-rac1.vip][100656896] {1:15932:2} [check] VipAgent::checkIp returned false
    [    AGFW][67098368] {1:15932:2} ora.oel6-112-rac1.vip 1 1 state changed from: UNKNOWN to: OFFLINE
    [    AGFW][67098368] {1:15932:2} Agent sending last reply for: RESOURCE_PROBE[ora.oel6-112-rac1.vip 1 1] ID 4097:110
    [    AGFW][67098368] {1:15932:2} Agent received the message: RESOURCE_DELETE[ora.oel6-112-rac1.vip 1 1] ID 4358:152
    [    AGFW][67098368] {1:15932:2} Agent sending last reply for: RESOURCE_DELETE[ora.oel6-112-rac1.vip 1 1] ID 4358:152
    [    AGFW][67098368] {1:15932:2} ora.oel6-112-rac1.vip 1 1 marked as deleted.
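
The state transitions are the quickest thing to scan for in an agent log like this one. A minimal sketch, assuming the AGFW message format shown above; state_changes is a hypothetical helper name.

```shell
#!/bin/sh
# state_changes: read an agent log (file arguments or stdin) and print
# "<resource> <old state> -> <new state>" for each AGFW state change.
# Anything before the closing brace of the {n:n:n} tag is ignored, so a
# leading timestamp does not matter.
state_changes() {
    sed -n 's/^.*} \(ora\.[^ ]*\) [0-9]* [0-9]* state changed from: \([A-Z_]*\) to: \([A-Z_]*\).*$/\1 \2 -> \3/p' "$@"
}
```

Run over orarootagent_root.log, this reduces the excerpt above to a single line for ora.oel6-112-rac1.vip going from UNKNOWN to OFFLINE.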

     

    I hope this gives you something to investigate.

     

    Thank you,

    Adhika
