37 Replies. Latest reply: Oct 11, 2013 12:59 PM by hhhyyy

    RAC nodes does not start after server reboot

    727876
      Hi everyone,

      This morning all the switches in our server room rebooted, causing all the RAC servers to restart.
      After this, none of them would start successfully.

      Oracle 11.2.0.1 on RHEL6

      Here is some log info:
      --------
      crsd.log
      --------
      2011-05-20 12:13:10.782: [ CSSCLNT][2146903840]clssscConnect: gipc request failed with 29 (0x16)
      2011-05-20 12:13:10.782: [ CSSCLNT][2146903840]clsssInitNative: connect failed, rc 29
      2011-05-20 12:13:10.783: [  CRSRTI][2146903840] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
      
      
      ----------------
      alertstgrac1.log
      ----------------
      [ohasd(2303)]CRS-2765:Resource 'ora.cssdmonitor' has failed on server 'stgrac1'.
      2011-05-20 12:11:15.452
      [cssd(2661)]CRS-1713:CSSD daemon is started in clustered mode
      2011-05-20 12:11:15.739
      [cssd(2661)]CRS-1603:CSSD on node stgrac1 shutdown by user.
      2011-05-20 12:12:14.033
      [/u01/app/11.2.0/grid/bin/orarootagent.bin(2563)]CRS-5818:Aborted command 'start for resource: ora.diskmon 1 1' for resource 'ora.diskmon'. Details at (:CRSAGF00113:) in /u01/app/11.2.0/grid/log/stgrac1/agent/ohasd/orarootagent_root/orarootagent_root.log.
      2011-05-20 12:12:18.039
      [ohasd(2303)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.diskmon'. Details at (:CRSPE00111:) in /u01/app/11.2.0/grid/log/stgrac1/ohasd/ohasd.log
      
      
      ---------------------
      orarootagent_root.log
      ---------------------
      2011-05-20 12:12:23.162: [ora.diskmon][2684352256] [clean] execCmd ret = 0
      2011-05-20 12:12:23.162: [ora.diskmon][2684352256] [clean] DiskmonAgent::clean } nopipe
      2011-05-20 12:12:23.163: [ora.diskmon][2684352256] [clean] clsn_agent::clean }
      2011-05-20 12:12:23.163: [    AGFW][2684352256] Command: clean for resource: ora.diskmon 1 1 completed with status: SUCCESS
      2011-05-20 12:12:23.163: [    AGFW][2684352256] Executing command: check for resource: ora.diskmon 1 1
      2011-05-20 12:12:23.164: [    AGFW][3066025728] Agent sending reply for: RESOURCE_CLEAN[ora.diskmon 1 1] ID 4100:826
      2011-05-20 12:12:23.164: [ora.diskmon][2684352256] [check] DiskmonAgent::check {
      2011-05-20 12:12:23.164: [ora.diskmon][2684352256] [check] DiskmonAgent::connect {
      2011-05-20 12:12:23.165: [ora.diskmon][2684352256] [check] DiskmonAgent::connect: skgznp_connect failed with error 56815 and the timeout expired
      2011-05-20 12:12:23.165: [ora.diskmon][2684352256] [check] (null) category: 56815, operation: connect, loc: skgznpcon6, OS error: 2, other:
      2011-05-20 12:12:23.165: [ora.diskmon][2684352256] [check] DiskmonAgent::connect } error
      2011-05-20 12:12:23.165: [ora.diskmon][2684352256] [check] DiskmonAgent::check } 2
      2011-05-20 12:12:23.165: [    AGFW][2684352256] check for resource: ora.diskmon 1 1 completed with status: PLANNED_OFFLINE
      2011-05-20 12:12:23.165: [    AGFW][3066025728] ora.diskmon 1 1 state changed from: CLEANING to: PLANNED_OFFLINE
      2011-05-20 12:12:23.166: [    AGFW][3066025728] Agent sending last reply for: RESOURCE_CLEAN[ora.diskmon 1 1] ID 4100:826
      
      
      ---------
      ohasd.log
      ---------
      2011-05-20 12:12:23.167: [    AGFW][2053089024] Agfw Proxy Server sending the reply to PE for message:RESOURCE_CLEAN[ora.diskmon 1 1] ID 4100:825
      2011-05-20 12:12:23.167: [   CRSPE][2042582784] Received reply to action [Clean] message ID: 825
      2011-05-20 12:12:23.168: [    AGFW][2053089024] Received the reply to the message: RESOURCE_CLEAN[ora.diskmon 1 1] ID 4100:826 from the agent /u01/app/11.2.0/grid/bin/orarootagent_root
      2011-05-20 12:12:23.168: [    AGFW][2053089024] Agfw Proxy Server sending the last reply to PE for message:RESOURCE_CLEAN[ora.diskmon 1 1] ID 4100:825
      2011-05-20 12:12:23.169: [   CRSPE][2042582784] Received reply to action [Clean] message ID: 825
      2011-05-20 12:12:23.169: [   CRSPE][2042582784] RI [ora.diskmon 1 1] new external state [OFFLINE] old value: [UNKNOWN] label = []
      2011-05-20 12:12:23.169: [   CRSPE][2042582784] CRS-2681: Clean of 'ora.diskmon' on 'stgrac1' succeeded
      
      2011-05-20 12:12:23.169: [   CRSPE][2042582784] Sequencer for [ora.diskmon 1 1] has completed with error: CRS-0215: Could not start resource 'ora.diskmon'.
      
      
      ./crsctl stat res -t -init
      --------------------------------------------------------------------------------
      NAME           TARGET  STATE        SERVER                   STATE_DETAILS
      --------------------------------------------------------------------------------
      Cluster Resources
      --------------------------------------------------------------------------------
      ora.asm
            1        ONLINE  OFFLINE
      ora.crsd
            1        ONLINE  INTERMEDIATE stgrac1
      ora.cssd
            1        ONLINE  OFFLINE
      ora.cssdmonitor
            1        ONLINE  ONLINE       stgrac1
      ora.ctssd
            1        ONLINE  OFFLINE
      ora.diskmon
            1        ONLINE  OFFLINE
      ora.evmd
            1        ONLINE  ONLINE       stgrac1
      ora.gipcd
            1        ONLINE  ONLINE       stgrac1
      ora.gpnpd
            1        ONLINE  ONLINE       stgrac1
      ora.mdnsd
            1        ONLINE  ONLINE       stgrac1
      These errors look the same on 2 different RAC clusters (2 nodes per cluster).

      Can anybody please give me some ideas on what I can check further?
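The status table above can also be scanned mechanically, which helps when the same check has to be repeated on several nodes. The following is a minimal sketch, assuming the two-line layout shown in the post (a resource name, then an instance line with TARGET, STATE and an optional SERVER). The function names `parse_stat_res` and `not_online` are made up for illustration and are not part of any Oracle tooling.

```python
def parse_stat_res(text):
    """Parse `crsctl stat res -t -init` style output into a list of dicts."""
    rows = []
    name = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("ora."):
            # A resource name line; the instance line follows it.
            name = stripped
            continue
        parts = stripped.split()
        # Instance lines look like: "1  ONLINE  OFFLINE  [server]"
        if name and len(parts) >= 3 and parts[0].isdigit():
            rows.append({
                "name": name,
                "target": parts[1],
                "state": parts[2],
                "server": parts[3] if len(parts) > 3 else None,
            })
    return rows


def not_online(rows):
    """Names of resources whose TARGET is ONLINE but whose STATE is not."""
    return [r["name"] for r in rows
            if r["target"] == "ONLINE" and r["state"] != "ONLINE"]


# Excerpt of the table from the post above.
SAMPLE = """\
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
ora.asm
      1        ONLINE  OFFLINE
ora.crsd
      1        ONLINE  INTERMEDIATE stgrac1
ora.cssd
      1        ONLINE  OFFLINE
ora.cssdmonitor
      1        ONLINE  ONLINE       stgrac1
"""

if __name__ == "__main__":
    print(not_online(parse_stat_res(SAMPLE)))
```

On the excerpt, this flags ora.asm, ora.crsd and ora.cssd, which matches what can be read off the table by eye: everything that depends on CSS is down.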
        • 1. Re: RAC nodes does not start after server reboot
          Sebastian Solbach -Dba Community-Oracle
          Hi,

          If an error occurred that led to a reboot/restart of the clusterware, the clusterware will try several times to start up the resources.
          If this is not successful in a timely manner, it will give up starting them. So depending on the downtime of the switches, it might not have been able to bring up the nodes successfully...

          Can you try restarting one manually (crsctl stop crs -f), followed by a crsctl start crs?

          Regards
          Sebastian
          • 2. Re: RAC nodes does not start after server reboot
            Sebastian Solbach -Dba Community-Oracle
            Hi,

            Can you look at/post diskmon.log and ocssd.log as well, please?

            Regards
            Sebastian
            • 3. Re: RAC nodes does not start after server reboot
              727876
              Hi Sebastian,

              thanks for your reply.

              I have already tried stopping and starting it multiple times and rebooted the servers manually, but without any luck (I still get the same errors).
              When I tried to stop CRS with the command crsctl stop cluster -all, it threw:
              CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'stgrac1'
              CRS-2673: Attempting to stop 'ora.crsd' on 'stgrac1'
              CRS-4548: Unable to connect to CRSD
              CRS-2675: Stop of 'ora.crsd' on 'stgrac1' failed
              CRS-2679: Attempting to clean 'ora.crsd' on 'stgrac1'
              CRS-4548: Unable to connect to CRSD
              CRS-2678: 'ora.crsd' on 'stgrac1' has experienced an unrecoverable failure
              CRS-0267: Human intervention required to resume its availability.
              CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'stgrac1' has failed
              CRS-4687: Shutdown command has completed with error(s).
              CRS-4000: Command Stop failed, or completed with errors.
              I had to kill all the processes. I then ran crsctl start crs, which gives the following:
              ./crsctl start crs
              CRS-4123: Oracle High Availability Services has been started.
              
              ./crsctl stat res -t -init
              --------------------------------------------------------------------------------
              NAME           TARGET  STATE        SERVER                   STATE_DETAILS
              --------------------------------------------------------------------------------
              Cluster Resources
              --------------------------------------------------------------------------------
              ora.asm
                    1        ONLINE  OFFLINE
              ora.crsd
                    1        ONLINE  OFFLINE
              ora.cssd
                    1        ONLINE  OFFLINE
              ora.cssdmonitor
                    1        ONLINE  ONLINE       stgrac1
              ora.ctssd
                    1        ONLINE  OFFLINE
              ora.diskmon
                    1        ONLINE  ONLINE       stgrac1
              ora.evmd
                    1        ONLINE  ONLINE       stgrac1
              ora.gipcd
                    1        ONLINE  ONLINE       stgrac1
              ora.gpnpd
                    1        ONLINE  ONLINE       stgrac1
              ora.mdnsd
                    1        ONLINE  ONLINE       stgrac1
              If I try to stop it:
              ./crsctl stop crs -f
              CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'stgrac1'
              CRS-2673: Attempting to stop 'ora.mdnsd' on 'stgrac1'
              CRS-2677: Stop of 'ora.mdnsd' on 'stgrac1' succeeded
              CRS-2673: Attempting to stop 'ora.gpnpd' on 'stgrac1'
              CRS-2677: Stop of 'ora.gpnpd' on 'stgrac1' succeeded
              CRS-2673: Attempting to stop 'ora.gipcd' on 'stgrac1'
              CRS-2677: Stop of 'ora.gipcd' on 'stgrac1' succeeded
              CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'stgrac1' has failed
              CRS-4687: Shutdown command has completed with error(s).
              CRS-4000: Command Stop failed, or completed with errors.
              -------------
              ohasd.log
              -------------
              2011-05-20 14:41:29.458: [   CRSPE][715097856] CRS-2677: Stop of 'ora.mdnsd' on 'stgrac1' succeeded
              
              2011-05-20 14:41:29.459: [UiServer][710895360] Container [ Name: ORDER
                      MESSAGE:
                      TextMessage[CRS-2677: Stop of 'ora.mdnsd' on 'stgrac1' succeeded]
                      MSGTYPE:
                      TextMessage[3]
                      OBJID:
                      TextMessage[ora.mdnsd]
                      WAIT:
                      TextMessage[0]
              ]
              2011-05-20 14:41:33.593: [  CRSCCL][954181376]RDE-00051: provider "Oracle Apple DNS-SD Provider" error.
                RDE-02001: failed to connect to the mDNS responder.
              2011-05-20 14:41:33.593: [  CRSCCL][954181376]RDE-00051: provider "Oracle Apple DNS-SD Provider" error.
                RDE-02001: failed to connect to the mDNS responder.
                  CLSDNSSD-00001: unknown error has occurred.
              2011-05-20 14:41:33.594: [  CRSCCL][954181376]RDE-00051: provider "Oracle Apple DNS-SD Provider" error.
                RDE-02001: failed to connect to the mDNS responder.
                  CLSDNSSD-00001: unknown error has occurred.
                    RDE-00051: provider "Oracle Apple DNS-SD Provider" error.
                      CLSDNSSD-00001: unknown error has occurred.
              2011-05-20 14:41:33.601: [  CRSCCL][954181376]Restarting RD find queries.
              .
              .
              .
              .
              .
              2011-05-20 14:45:48.974: [    AGFW][725604096] Agfw Proxy Server received the message: CMD_COMPLETED[Proxy] ID 20482:880
              2011-05-20 14:45:48.974: [    AGFW][725604096] Agfw Proxy Server replying to the message: CMD_COMPLETED[Proxy] ID 20482:880
              2011-05-20 14:45:48.974: [   CRSPE][715097856] Shutdown cmd failed: Server Shutdown {stgrac1} : pass=2 : 0x7fe7fc097be0
              2011-05-20 14:45:48.974: [   CRSPE][715097856] Server [stgrac1] has changed state from [LEAVING] to [ONLINE]
              2011-05-20 14:45:48.975: [  CRSOCR][723502848] Multi Write Batch processing...
              2011-05-20 14:45:48.975: [   CRSPE][715097856] UI Command [Server Shutdown {stgrac1} : pass=2 : 0x7fe7fc097be0] is replying to sender.
              2011-05-20 14:45:48.975: [UiServer][710895360] Container [ Name: UI_DATA
                      RESULT:
                      TextMessage[223]
              ]
              2011-05-20 14:45:48.975: [UiServer][710895360] Done for ctx=0x7fe7f4007c60
              2011-05-20 14:45:48.977: [UiServer][708794112] Closed: remote end failed/disc.
              2011-05-20 14:45:49.098: [ CRSCOMM][1413453568][FFAIL] Couldnt clscreceive message, no message: 11
              2011-05-20 14:45:49.098: [ CRSCOMM][1413453568] Client disconnected.
              2011-05-20 14:45:49.098: [ CRSCOMM][1413453568][FFAIL] Listener got clsc error 11 for memNum. 9
              2011-05-20 14:45:49.098: [ CRSCOMM][1413453568] IPC listener connection to member 9 has been removed
              2011-05-20 14:45:49.098: [CLSFRAME][1413453568] Removing IPC Member:{Relative|Node:0|Process:9|Type:3}
              2011-05-20 14:45:49.098: [CLSFRAME][1413453568] Disconnected from AGENT process: {Relative|Node:0|Process:9|Type:3}
              2011-05-20 14:45:49.099: [    AGFW][725604096] Agfw Proxy Server received process disconnected notification, count=1
              2011-05-20 14:45:49.099: [    AGFW][725604096] /u01/app/11.2.0/grid/bin/cssdagent_root disconnected.
              2011-05-20 14:45:49.099: [    AGFW][725604096] Agent /u01/app/11.2.0/grid/bin/cssdagent_root[10276] stopped!
              2011-05-20 14:45:49.099: [ CRSCOMM][725604096] removeConnection: Member 9 does not exist.
              2011-05-20 14:45:49.099: [   CRSPE][715097856] Disconnected from server:
              2011-05-20 14:45:49.273: [  CRSOCR][723502848] Multi Write Batch done.
              Any other ideas?

              Thanks.

              Edited by: user10506095 on May 20, 2011 2:47 PM
              • 4. Re: RAC nodes does not start after server reboot
                Sebastian Solbach -Dba Community-Oracle
                Hi,

                mdnsd is the tool used to provision the GPnP profile to other nodes.
                This happens via multicast over the public interface.

                Could it be that something changed on the public interface after your switches rebooted?
                That could explain this problem...

                Regards
                Sebastian
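One way to check whether multicast still passes through the switches at all is to send a test datagram between two nodes. The following is a minimal sketch of such a probe, not a test of mdnsd itself: the group `239.255.0.1` and port `50000` are arbitrary values chosen for illustration (an assumption, not the addresses the clusterware uses), and the function names are hypothetical.

```python
import ipaddress
import socket

# Hypothetical test group and port, picked for illustration only;
# they are NOT the addresses that mdnsd itself uses.
TEST_GROUP = "239.255.0.1"
TEST_PORT = 50000


def membership_request(group, local_ip="0.0.0.0"):
    """Build the 8-byte ip_mreq value passed to IP_ADD_MEMBERSHIP."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(group + " is not a multicast address")
    return socket.inet_aton(group) + socket.inet_aton(local_ip)


def listen_once(group=TEST_GROUP, port=TEST_PORT, local_ip="0.0.0.0",
                timeout=10.0):
    """Join the group on local_ip and wait for one datagram.

    Returns (data, sender_address), or None if nothing arrives in time.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        membership_request(group, local_ip))
        sock.settimeout(timeout)
        try:
            return sock.recvfrom(1024)
        except socket.timeout:
            return None
    finally:
        sock.close()


def send_once(local_ip, group=TEST_GROUP, port=TEST_PORT,
              payload=b"mcast-test"):
    """Send one datagram to the group via the interface owning local_ip."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        # TTL 1 keeps the probe on the local subnet.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                        socket.inet_aton(local_ip))
        sock.sendto(payload, (group, port))
    finally:
        sock.close()
```

To use it, one would call `listen_once(local_ip=...)` on one node with that node's public IP and, within the timeout, `send_once(...)` on the other node with its own public IP. A `None` result only suggests that multicast did not arrive; a local firewall can produce the same symptom.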
                • 5. Re: RAC nodes does not start after server reboot
                  727876
                  Hi Sebastian,

                  I have checked all the interfaces and everything looks intact.
                  All the IPs are still the same and running on the same devices. I am also able to ping all the hostnames and private IPs.
                  I also asked the SA to check everything network-related, and he said everything is fine.

                  What else can I check on the interfaces?

                  Thanks

                  Edited by: user10506095 on May 20, 2011 3:01 PM
                  The weird thing is that we get exactly the same errors on all nodes, so it must be something external causing this, as it is not specific to one node.
                  • 6. Re: RAC nodes does not start after server reboot
                    Ganadeva
                    Hi,

                    What is the output of the following checks on the cluster components?

                    OCR
                    ocrcheck

                    nodes in the cluster
                    olsnodes -n -p -i

                    voting disk information
                    crsctl query css votedisk

                    and cluster health
                    crsctl check cluster

                    Regards,
                    Ganadeva
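When these checks have to be repeated on every node, it can help to collect them in one pass. The following is a minimal sketch, assuming the commands are on PATH (otherwise they would need the Grid home's bin directory as a prefix). The helper names `run_command` and `collect` are made up for illustration, and `olsnodes` is invoked with `-n -i`, the form actually run in the reply that follows.

```python
import subprocess

# Commands quoted from the thread; assumed to be on PATH.
CHECKS = [
    ("OCR", ["ocrcheck"]),
    ("nodes", ["olsnodes", "-n", "-i"]),
    ("voting disks", ["crsctl", "query", "css", "votedisk"]),
    ("cluster health", ["crsctl", "check", "cluster"]),
]


def run_command(argv):
    """Run one command and return (exit_code, combined stdout/stderr)."""
    try:
        proc = subprocess.run(argv, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT, text=True)
    except FileNotFoundError:
        # Mirror the shell's "command not found" exit code.
        return 127, "command not found: " + argv[0]
    return proc.returncode, proc.stdout


def collect(checks=CHECKS, runner=run_command):
    """Run every check in order and return {label: (exit_code, output)}."""
    return {label: runner(argv) for label, argv in checks}


if __name__ == "__main__":
    for label, (code, output) in collect().items():
        print("== {} (exit {}) ==".format(label, code))
        print(output.rstrip())
```

The `runner` parameter exists so the collection logic can be exercised without a cluster, for example by passing a function that returns canned output.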
                    • 7. Re: RAC nodes does not start after server reboot
                      727876
                      Hi Ganadeva,

                      thanks for your reply.

                      The output is as follows:
                      ./ocrcheck
                      PROT-602: Failed to retrieve data from the cluster registry
                      PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=8, opn=kgfolclcpi1, dep=210, loc=kgfokge
                      AMDU-00210: No disks found in diskgroup CRS
                      AMDU-00210: No disks found in diskgroup CRS
                      
                      ./olsnodes -n -i
                      PRCO-19: Failure retrieving list of nodes in the cluster
                      PRCO-2: Unable to communicate with the clusterware
                      
                      ./crsctl query css votedisk
                      Unable to communicate with the Cluster Synchronization Services daemon.
                      
                      ./crsctl check cluster
                      CRS-4535: Cannot communicate with Cluster Ready Services
                      CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
                      CRS-4533: Event Manager is online
                      Just keep in mind that my OCR/Votedisk is in ASM.

                      Regards.

                      Edited by: user10506095 on May 20, 2011 3:12 PM
                      changed the ./ocrcheck output
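The output above mixes messages from several Oracle facilities (PROT, PROC, AMDU, PRCO, CRS), and tallying the codes shows at a glance which layers are complaining. The following is a minimal sketch, assuming every message code has the form of an uppercase prefix, a hyphen, digits and a colon. The function names are hypothetical.

```python
import re
from collections import Counter

# Uppercase facility prefix, hyphen, number, then a colon.
CODE_RE = re.compile(r"\b([A-Z]{3,})-(\d+):")


def message_codes(text):
    """Count each facility-number code that is followed by a colon."""
    return Counter("{}-{}".format(fac, num)
                   for fac, num in CODE_RE.findall(text))


def facilities(codes):
    """Sorted list of the facility prefixes (the part before the hyphen)."""
    return sorted({code.split("-", 1)[0] for code in codes})


# Lines taken from the output in the post above.
SAMPLE = """\
PROT-602: Failed to retrieve data from the cluster registry
AMDU-00210: No disks found in diskgroup CRS
AMDU-00210: No disks found in diskgroup CRS
PRCO-19: Failure retrieving list of nodes in the cluster
PRCO-2: Unable to communicate with the clusterware
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4533: Event Manager is online
"""

if __name__ == "__main__":
    codes = message_codes(SAMPLE)
    print(facilities(codes))
    print(codes.most_common(1))
```

In this excerpt the only repeated code is AMDU-00210 ("No disks found in diskgroup CRS"), which points at storage discovery rather than at the network.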
                      • 8. Re: RAC nodes does not start after server reboot
                        Ganadeva
                        Hi,

                        Can you share a log of the checks that the SA has carried out, such as ifconfig -a, etc.? Is anything visible in the OS log /var/log/messages?

                        Are you able to ssh between the nodes of the cluster?

                        Also check whether the subnets are the same for the nodes before and after the server reboot.

                        Regards,
                        Ganadeva
                        • 9. Re: RAC nodes does not start after server reboot
                          727876
                          Hi Ganadeva,

                          I checked, and all the IPs and subnets are still correct.
                          There are no errors in /var/log/messages.
                          Node reachability is not a problem; I can ping and ssh.

                          According to one of the SAs, the switches restarted because of a power dip, but they are on a UPS, so I don't see how that is possible.
                          He said that he will reset them to factory defaults (makes you wonder), and then I will try again and see if that made a difference. Will keep you posted.




                          After the switches had been reset, I restarted the server and still get the same error.
                          The alert log shows:
                          [/u01/app/11.2.0/grid/bin/orarootagent.bin(8298)]CRS-5818:Aborted command 'start for resource: ora.diskmon 1 1' for resource 'ora.diskmon'. Details at (:CRSAGF00113:) in /u01/app/11.2.0/grid/log/orarac1/agent/ohasd/orarootagent_root/orarootagent_root.log.
                          2011-05-20 14:15:01.081
                          [ohasd(2430)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.diskmon'. Details at (:CRSPE00111:) in /u01/app/11.2.0/grid/log/orarac1/ohasd/ohasd.log.
                          2011-05-20 14:20:19.317
                          [ohasd(2430)]CRS-2758:Resource 'ora.crsd' is in an unknown state.
                          2011-05-20 14:23:57.772
                          [/u01/app/11.2.0/grid/bin/cssdagent(9280)]CRS-5818:Aborted command 'start for resource: ora.cssd 1 1' for resource 'ora.cssd'. Details at (:CRSAGF00113:) in /u01/app/11.2.0/grid/log/orarac1/agent/ohasd/oracssdagent_root/oracssdagent_root.log.
                          2011-05-20 14:24:03.502
                          [ohasd(2430)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.cssd'. Details at (:CRSPE00111:) in /u01/app/11.2.0/grid/log/orarac1/ohasd/ohasd.log.
                          2011-05-20 15:00:26.063
                          [ohasd(2430)]CRS-2765:Resource 'ora.crsd' has failed on server 'orarac1'.
                          2011-05-20 15:30:38.296
                          [ohasd(2430)]CRS-2765:Resource 'ora.crsd' has failed on server 'orarac1'.
                          2011-05-20 16:00:49.507
                          [ohasd(2430)]CRS-2765:Resource 'ora.crsd' has failed on server 'orarac1'.
                          2011-05-20 16:21:10.329
                          [/u01/app/11.2.0/grid/bin/cssdmonitor(2741)]CRS-5822:Agent '/u01/app/11.2.0/grid/bin/cssdmonitor_root' disconnected from server. Details at (:CRSAGF00117:) in /u01/app/11.2.0/grid/log/orarac1/agent/ohasd/oracssdmonitor_root/oracssdmonitor_root.log.
                          2011-05-20 16:21:10.329
                          [/u01/app/11.2.0/grid/bin/orarootagent.bin(15959)]CRS-5822:Agent '/u01/app/11.2.0/grid/bin/orarootagent_root' disconnected from server. Details at (:CRSAGF00117:) in /u01/app/11.2.0/grid/log/orarac1/agent/ohasd/orarootagent_root/orarootagent_root.log.
                          2011-05-20 16:22:45.051
                          [ohasd(2383)]CRS-2112:The OLR service started on node orarac1.
                          2011-05-20 16:22:45.848
                          [ohasd(2383)]CRS-8017:location: /etc/oracle/lastgasp has 18 reboot advisory log files, 0 were announced and 0 errors occurred
                          2011-05-20 16:22:49.423
                          [ohasd(2383)]CRS-2772:Server 'orarac1' has been assigned to pool 'Free'.
                          2011-05-20 16:22:52.344
                          [ohasd(2383)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
                          2011-05-20 16:22:54.294
                          [cssd(2618)]CRS-1713:CSSD daemon is started in clustered mode
                          2011-05-20 16:22:58.800
                          [cssd(2618)]CRS-1603:CSSD on node orarac1 shutdown by user.
                          2011-05-20 16:22:58.909
                          [ohasd(2383)]CRS-2765:Resource 'ora.cssdmonitor' has failed on server 'orarac1'.
                          2011-05-20 16:23:10.869
                          [cssd(2786)]CRS-1713:CSSD daemon is started in clustered mode
                          2011-05-20 16:23:11.091
                          [cssd(2786)]CRS-1603:CSSD on node orarac1 shutdown by user.
                          2011-05-20 16:31:35.895
                          [ohasd(2383)]CRS-2765:Resource 'ora.diskmon' has failed on server 'orarac1'.
                          2011-05-20 16:31:35.907
                          [ohasd(2383)]CRS-2767:Target resource 'ora.diskmon' is offline, will not recover.
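The alert log excerpt above alternates a timestamp line with a message line of the form `[process(pid)]CRS-nnnn:text`. The following is a minimal sketch that pairs the two and counts which resources were reported as failed via CRS-2765. It assumes exactly that layout, and the function names are made up for illustration.

```python
import re
from collections import Counter

TS_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}$")
MSG_RE = re.compile(
    r"^\[(?P<proc>[^\]]+?)\((?P<pid>\d+)\)\](?P<code>CRS-\d+):(?P<text>.*)$"
)
FAILED_RE = re.compile(r"Resource '([^']+)' has failed on server '([^']+)'")


def parse_alert(text):
    """Pair each message line with the timestamp line that precedes it."""
    entries = []
    stamp = None
    for raw in text.splitlines():
        line = raw.strip()
        if TS_RE.match(line):
            stamp = line
            continue
        m = MSG_RE.match(line)
        if m:
            entries.append({
                "time": stamp,  # None if no timestamp line came first
                "process": m.group("proc"),
                "pid": int(m.group("pid")),
                "code": m.group("code"),
                "text": m.group("text"),
            })
            stamp = None
    return entries


def failed_resources(entries):
    """Count CRS-2765 'has failed' reports per resource name."""
    counts = Counter()
    for entry in entries:
        if entry["code"] == "CRS-2765":
            m = FAILED_RE.search(entry["text"])
            if m:
                counts[m.group(1)] += 1
    return counts


# Lines taken from the alert log in the post above.
SAMPLE = """\
2011-05-20 15:00:26.063
[ohasd(2430)]CRS-2765:Resource 'ora.crsd' has failed on server 'orarac1'.
2011-05-20 15:30:38.296
[ohasd(2430)]CRS-2765:Resource 'ora.crsd' has failed on server 'orarac1'.
2011-05-20 16:22:54.294
[cssd(2618)]CRS-1713:CSSD daemon is started in clustered mode
2011-05-20 16:22:58.800
[cssd(2618)]CRS-1603:CSSD on node orarac1 shutdown by user.
2011-05-20 16:22:58.909
[ohasd(2383)]CRS-2765:Resource 'ora.cssdmonitor' has failed on server 'orarac1'.
"""

if __name__ == "__main__":
    print(dict(failed_resources(parse_alert(SAMPLE))))
```

The same parse also makes it easy to see that cssd is started (CRS-1713) and shut down again (CRS-1603) within a few seconds.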
                          The crsd.log shows:
                          .
                          .
                          .
                          2011-05-20 16:36:59.747: [ CSSCLNT][1700964128]clssscConnect: gipc request failed with 29 (0x16)
                          2011-05-20 16:36:59.747: [ CSSCLNT][1700964128]clsssInitNative: connect failed, rc 29
                          2011-05-20 16:36:59.748: [  CRSRTI][1700964128] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
                          
                          2011-05-20 16:37:00.751: [ CSSCLNT][1700964128]clssscConnect: gipc request failed with 29 (0x16)
                          2011-05-20 16:37:00.751: [ CSSCLNT][1700964128]clsssInitNative: connect failed, rc 29
                          2011-05-20 16:37:00.752: [  CRSRTI][1700964128] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
                          
                          2011-05-20 16:37:01.755: [ CSSCLNT][1700964128]clssscConnect: gipc request failed with 29 (0x16)
                          2011-05-20 16:37:01.755: [ CSSCLNT][1700964128]clsssInitNative: connect failed, rc 29
                          2011-05-20 16:37:01.756: [  CRSRTI][1700964128] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
                          Any suggestions?

                          Edited by: user10506095 on May 20, 2011 4:34 PM
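The crsd.log excerpt above repeats the same three lines about once per second, so collapsing identical messages makes the retry loop obvious. The following is a minimal sketch, assuming the `timestamp: [COMPONENT][thread]message` layout shown in the post. The names `parse_line` and `summarize` are hypothetical.

```python
import re
from datetime import datetime

LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[\s*(\w+)\]\[(\d+)\]\s*(.*)$"
)


def parse_line(line):
    """Split one log line into (timestamp, component, thread_id, message)."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
    return stamp, m.group(2), int(m.group(3)), m.group(4)


def summarize(text):
    """Map each distinct (component, message) to [count, first_seen, last_seen]."""
    summary = {}
    for line in text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue  # blank lines, elision dots, continuation lines
        stamp, component, _thread, message = parsed
        entry = summary.setdefault((component, message), [0, stamp, stamp])
        entry[0] += 1
        entry[2] = stamp
    return summary


# Two retry cycles taken from the crsd.log in the post above.
SAMPLE = """\
2011-05-20 16:36:59.747: [ CSSCLNT][1700964128]clssscConnect: gipc request failed with 29 (0x16)
2011-05-20 16:36:59.747: [ CSSCLNT][1700964128]clsssInitNative: connect failed, rc 29
2011-05-20 16:36:59.748: [  CRSRTI][1700964128] CSS is not ready. Received status 3 from CSS. Waiting for good status ..

2011-05-20 16:37:00.751: [ CSSCLNT][1700964128]clssscConnect: gipc request failed with 29 (0x16)
2011-05-20 16:37:00.751: [ CSSCLNT][1700964128]clsssInitNative: connect failed, rc 29
2011-05-20 16:37:00.752: [  CRSRTI][1700964128] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
"""

if __name__ == "__main__":
    for (component, message), (count, first, last) in summarize(SAMPLE).items():
        print(count, component, message, first.time(), last.time())
```

Read this way, crsd.log adds no new information: CRSD is only waiting for CSS, so the place to look is the CSS side (ocssd.log) rather than CRSD itself.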
                          • 10. Re: RAC nodes does not start after server reboot
                            727876
                            diskmon.log
                            2011-05-20 15:11:17.095: [ DISKMON][11956] dskm main: starting up
                            2011-05-20 15:11:17.096: [ DISKMON][11956:2905310976] dskm_rac_thrd_main: running
                            2011-05-20 15:11:17.097: [ DISKMON][11956:2905310976] dskm_clss_ini1: calling clssscbinit
                            2011-05-20 15:11:17.097: [ DISKMON][11956:2922919680] dskm_rac_thrd_creat2: got the post from the css event handling thread
                            2011-05-20 15:11:17.097: [ DISKMON][11956:2905310976] dskm_clss_ini2: calling clsssinit
                            2011-05-20 15:11:17.097: [ DISKMON][11956:2894821120] dskm_oss_thrd_main: running
                            2011-05-20 15:11:17.097: [ DISKMON][11956:2922919680] dskm_oss_thrd_creat2: got the post from the oss check status thread
                            2011-05-20 15:11:17.097: [ DISKMON][11956:2922919680] dskm main: startup complete
                            2011-05-20 15:11:17.097: [ DISKMON][11956:2922919680]            listening on -> /var/tmp/.oracle/master_diskmon
                            2011-05-20 15:11:17.108: [ DISKMON][11956:2905310976] dskm_clss_ini5: successful clsssinit(), clssvers 2.1
                            2011-05-20 15:11:17.108: [ DISKMON][11956:2905310976] dskm_clss_ini6: calling clssnsqlnum
                            2011-05-20 15:11:17.254: [ CSSCLNT]clsssRecvMsg: got a disconnect from the server while waiting for message type 24
                            2011-05-20 15:11:17.254: [ CSSCLNT]clssnsqlnum: RPC failed rc 3
                            
                            2011-05-20 15:11:17.255: [ DISKMON][11956:2905310976] dskm_clss_ini7: clssnsqclnum failed, clssret 3
                            2011-05-20 15:11:17.255: [ DISKMON][11956:2905310976] dskm_rac_ini1: dskm_clss_ini failed with error 56830 ... exiting
                            2011-05-20 15:11:17.255: [ DISKMON][11956:2905310976] dskm_nfy_kgzf1: notified thread kgzf disabled
                            2011-05-20 15:11:17.255: [ DISKMON][11956:2905310976] dskm_rac_thrd_main3: dskm_rac_ini failed with error 56830 ... exiting
                            2011-05-20 15:11:17.255: [ DISKMON][11956:2905310976] SHUTDOWN ABORT due to error 56830
                            2011-05-20 15:11:17.255: [ DISKMON][11956:2905310976] dskm_rac_thrd_main: exiting
                            2011-05-20 15:11:19.600: [ DISKMON][11956:2922919680] dskm_cleanup_thrds: cleaning up the rac event handling thread tid 2905310976
                            2011-05-20 15:11:19.600: [ DISKMON][11956:2894821120] dskm_oss_thrd_main2: posted
                            2011-05-20 15:11:19.600: [ DISKMON][11956:2894821120] dskm_oss_thrd_main: exiting
                            [ DISKMON][11956]
                                    Process 11956 exiting on 2011-05-20 at 15:11:20.103
                            ocssd.log
                            2011-05-20 15:11:17.108: [    CSSD][3221223168]clssgmEvtInformation: reqtype (11) cmProc (0x7f9ab801ccf0) client ((nil))
                            2011-05-20 15:11:17.108: [    CSSD][3221223168]clssgmEvtInformation: reqtype (11) req (0x7f9ab801d0f0)
                            2011-05-20 15:11:17.108: [    CSSD][3221223168]clssnmQueueNotification: type (11) 0x7f9ab801d0f0
                            2011-05-20 15:11:17.118: [    GPnP][3477141248]clsgpnpwu_walletfopen: [at clsgpnpwu.c:494] Opened SSO wallet: '/u01/app/11.2.0/grid/gpnp/stgrac1/wallets/peer/cwallet.sso'
                            2011-05-20 15:11:17.119: [    GPnP][3477141248]clsgpnp_getCK: [at clsgpnp0.c:1965] Result: (0) CLSGPNP_OK. Get gpnp wallet - provider 1 of 2 (LSKP-FSW(1))
                            2011-05-20 15:11:17.119: [    GPnP][3477141248]clsgpnp_getCK: [at clsgpnp0.c:1982] Got gpnp security keys (wallet).>
                            2011-05-20 15:11:17.119: [    GPnP][3477141248]clsgpnp_Init: [at clsgpnp0.c:837] GPnP client pid=11926, tl=3, f=3
                            2011-05-20 15:11:17.135: [GIPCXCPT][3477141248]gipcShutdownF: skipping shutdown, count 3, from [ clsinet.c : 1732], ret gipcretSuccess (0)
                            2011-05-20 15:11:17.136: [GIPCXCPT][3477141248]gipcShutdownF: skipping shutdown, count 2, from [ clsgpnp0.c : 1021], ret gipcretSuccess (0)
                            2011-05-20 15:11:17.137: [GIPCGMOD][3477141248]gipcmodGipcPassInitializeNetwork: using host information 10.10.1.20
                            2011-05-20 15:11:17.137: [    CSSD][3477141248]clssnmOpenGIPCEndp: listening on gipc://stgrac1:nm_stgrac-cluster#10.10.1.20#55536
                            2011-05-20 15:11:17.137: [    CSSD][3477141248]clssnmInitNMInfo: Initializing uniqueness 0
                            2011-05-20 15:11:17.137: [    CSSD][3477141248]clssnmReadDiscoveryProfile: voting file discovery string()
                            2011-05-20 15:11:17.137: [    CSSD][3477141248]clssnkInit: NK generic layer initializing.
                            2011-05-20 15:11:17.139: [    CSSD][3477141248]clssscGetParameterOLR: OLR fetch for parameter GIPC NM trclvl (12) failed with rc 21
                            2011-05-20 15:11:17.150: [   SKGFD][3210733312]NOTE: No asm libraries found in the system
                            
                            2011-05-20 15:11:17.150: [    CLSF][3210733312]Allocated CLSF context
                            2011-05-20 15:11:17.150: [    CSSD][3210733312]clssnmvDDiscThread: using discovery string  for initial discovery
                            2011-05-20 15:11:17.150: [   SKGFD][3210733312]Discovery with str::
                            
                            2011-05-20 15:11:17.150: [   SKGFD][3210733312]UFS discovery with ::
                            
                            2011-05-20 15:11:17.151: [   SKGFD][3210733312]Fetching UFS disk :/dev/raw/rawctl:
                            
                            2011-05-20 15:11:17.151: [    CLSF][3210733312]Ignoring 0-byte file /dev/raw/rawctl
                            
                            2011-05-20 15:11:17.151: [   SKGFD][3210733312]OSS discovery with ::
                            
                            2011-05-20 15:11:17.152: [    CSSD][3210733312]clssnmvDiskVerify: Successful discovery of 0 disks
                            2011-05-20 15:11:17.152: [    CSSD][3210733312]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
                            2011-05-20 15:11:17.152: [    CSSD][3210733312]clssnmvFindInitialConfigs: No voting files found
                            2011-05-20 15:11:17.152: [    CSSD][3210733312]###################################
                            2011-05-20 15:11:17.152: [    CSSD][3210733312]clssscExit: CSSD signal 11 in thread clssnmvDDiscThread
                            2011-05-20 15:11:17.153: [    CSSD][3210733312]###################################
                            2011-05-20 15:11:17.153: [    CSSD][3210733312]
                            
                            ----- Call Stack Trace -----
                            2011-05-20 15:11:17.153: [    CSSD][3221223168]clssgmClientShutdown: total iocapables 0
                            2011-05-20 15:11:17.153: [    CSSD][3221223168]clssgmClientShutdown: graceful shutdown completed.
                            2011-05-20 15:11:17.153: [    CSSD][3210733312]calling              call     entry                argument values in hex
                            2011-05-20 15:11:17.153: [    CSSD][3210733312]location             type     point                (? means dubious value)
                            2011-05-20 15:11:17.153: [    CSSD][3210733312]-------------------- -------- -------------------- ----------------------------
                            2011-05-20 15:11:17.164: [    CSSD][3210733312]clssscExit()+594     call     kgdsdst()            000000000 ? 000000000 ?
                            2011-05-20 15:11:17.164: [    CSSD][3210733312]                                                   7F9ABF5F8CD8 ? 000000001 ?
                            2011-05-20 15:11:17.164: [    CSSD][3210733312]                                                   7F9ABF5FD1D8 ? 000000000 ?
                            2011-05-20 15:11:17.164: [    CSSD][3210733312]s0clsssc_sighandler  call     clssscExit()         00181A8E0 ? 000000001 ?
                            2011-05-20 15:11:17.164: [    CSSD][3210733312]()+611                                             7F9ABF5F8CD8 ? 000000001 ?
                            2011-05-20 15:11:17.164: [    CSSD][3210733312]                                                   7F9ABF5FD1D8 ? 000000000 ?
                            2011-05-20 15:11:17.164: [    CSSD][3210733312]__restore_rt()       call     s0clsssc_sighandler  00000000B ? 000000001 ?
                            2011-05-20 15:11:17.165: [    CSSD][3210733312]                              ()                   7F9ABF5F8CD8 ? 000000001 ?
                            2011-05-20 15:11:17.165: [    CSSD][3210733312]                                                   7F9ABF5FD1D8 ? 000000000 ?
                            2011-05-20 15:11:17.165: [    CSSD][3210733312]clssnmCompleteInitV  signal   __restore_rt()       000000001 ? 000000000 ?
                            2011-05-20 15:11:17.165: [    CSSD][3210733312]FDiscovery()+185                                   001643F28 ? 3E0C00E1B5 ?
                            2011-05-20 15:11:17.165: [    CSSD][3210733312]                                                   0016453B0 ? 000002EBA ?
                            2011-05-20 15:11:17.165: [    CSSD][3210733312]clssnmvDDiscThread(  call     clssnmCompleteInitV  00181A8E0 ? 001643F10 ?
                            2011-05-20 15:11:17.165: [    CSSD][3210733312])+2062                        FDiscovery()         0017E6E80 ? 000000001 ?
                            2011-05-20 15:11:17.165: [    CSSD][3210733312]                                                   0016453B0 ? 000002EBA ?
                            2011-05-20 15:11:17.165: [    CSSD][3210733312]clssscthrdmain()+20  call     clssnmvDDiscThread(  00181A8E0 ? 001643F10 ?
                            2011-05-20 15:11:17.165: [    CSSD][3210733312]6                             )                    0017E6E80 ? 000000001 ?
                            2011-05-20 15:11:17.165: [    CSSD][3210733312]                                                   0016453B0 ? 000002EBA ?
                            2011-05-20 15:11:17.165: [    CSSD][3210733312]start_thread()+209   call     clssscthrdmain()     00181A8E0 ? 001643F10 ?
                            2011-05-20 15:11:17.166: [    CSSD][3210733312]                                                   001643F10 ? 000000001 ?
                            2011-05-20 15:11:17.166: [    CSSD][3210733312]                                                   0016453B0 ? 000002EBA ?
                            2011-05-20 15:11:17.166: [    CSSD][3210733312]clone()+109          call     start_thread()       7F9ABF5FE700 ? 001643F10 ?
                            2011-05-20 15:11:17.166: [    CSSD][3210733312]                                                   001643F10 ? 000000001 ?
                            2011-05-20 15:11:17.166: [    CSSD][3210733312]                                                   0016453B0 ? 000002EBA ?
                            2011-05-20 15:11:17.166: [    CSSD][3210733312]0000000000000000     call     clone()              7F9ABF5FE700 ? 001643F10 ?
                            2011-05-20 15:11:17.166: [    CSSD][3210733312]                                                   001643F10 ? 000000001 ?
                            2011-05-20 15:11:17.166: [    CSSD][3210733312]                                                   0016453B0 ? 000002EBA ?
                            2011-05-20 15:11:17.166: [    CSSD][3210733312]
                            2011-05-20 15:11:17.166: [    CSSD][3210733312]--------------------- Binary Stack Dump ---------------------
                            2011-05-20 15:11:17.166: [    CSSD][3210733312]
                            2011-05-20 15:11:17.166: [    CSSD][3210733312]========== FRAME [1] (clssscExit()+594 -> kgdsdst()) ==========
                            2011-05-20 15:11:17.166: [    CSSD][3210733312]defined by frame pointers 0x7f9abf5fd610  and 0x7f9abf5fd540
                            2011-05-20 15:11:17.167: [    CSSD][3210733312]CALL TYPE: call   ERROR SIGNALED: no   CALLER: clssscExit
                            2011-05-20 15:11:17.167: [    CSSD][3210733312]RDI 0000000000000000 RSI 0000000000000000 RDX 00007F9ABF5F8CD8
                            2011-05-20 15:11:17.167: [    CSSD][3210733312]RCX 0000000000000001 R8 00007F9ABF5FD1D8 R9 0000000000000000
                            2011-05-20 15:11:17.167: [    CSSD][3210733312]RAX 0000000000000000 RBX 000000000181A910 RBP 00007F9ABF5FD610
                            2011-05-20 15:11:17.167: [    CSSD][3210733312]R10 BF5FD55000000000 R11 0000000000000000 R12 00007FFFAD643E80
                            2011-05-20 15:11:17.167: [    CSSD][3210733312]R13 00007F9ABF5FE9C0 R14 0000000000000004 R15 0000000000000007
                            2011-05-20 15:11:17.167: [    CSSD][3210733312]RSP 00007F9ABF5FD550 RIP 0000000000447102
                            2011-05-20 15:11:17.167: [    CSSD][3210733312]
                            Dump of memory from 0x7f9abf5fd540 to 0x7f9abf5fd610
                            .
                            .
                            .
                            In the ocssd.log it says:
                            2011-05-20 15:11:17.150: [   SKGFD][3210733312]NOTE: No asm libraries found in the system
                            .
                            .
                            .
                            2011-05-20 15:11:17.152: [    CSSD][3210733312]clssnmvDiskVerify: Successful discovery of 0 disks
                            which is strange, because ASMLib appears to be working fine:
                            oracleasm status
                            Checking if ASM is loaded: yes
                            Checking if /dev/oracleasm is mounted: yes
                            
                            oracleasm scandisks
                            Reloading disk partitions: done
                            Cleaning any stale ASM disks...
                            Scanning system for ASM disks...
                            
                            oracleasm listdisks
                            CRSVOL01
                            CRSVOL02
                            CRSVOL03
                            DATAVOL01
                            FRAVOL01
                            Regards
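For anyone triaging a similar ocssd.log, the three lines that matter in the excerpt above (the discovery string in use, the number of disks discovered, and the fatal signal) can be pulled out mechanically. The following is only a minimal sketch, assuming the line layout shown in this thread (`timestamp: [component][thread]message`); the function name `summarize_discovery` and the regular expressions are illustrative and not part of any Oracle tooling.

```python
import re

# Matches lines such as:
# 2011-05-20 15:11:17.152: [    CSSD][3210733312]clssnmvDiskVerify: Successful discovery of 0 disks
LINE_RE = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[\s*(?P<component>\w+)\]\[(?P<thread>\d+)\](?P<message>.*)$"
)


def summarize_discovery(lines):
    """Collect the voting-file discovery facts from ocssd.log lines.

    Hypothetical helper: it only recognises the message texts quoted in
    this thread and ignores every other line.
    """
    summary = {"discovery_string": None, "disks_found": None, "signal": None}
    for raw in lines:
        match = LINE_RE.match(raw)
        if not match:
            continue  # blank lines, stack dumps without a timestamp, etc.
        message = match.group("message")

        found = re.search(r"using discovery string (.*) for initial discovery", message)
        if found:
            # An empty string here means no discovery string was configured.
            summary["discovery_string"] = found.group(1).strip()

        found = re.search(r"Successful discovery of (\d+) disks", message)
        if found:
            summary["disks_found"] = int(found.group(1))

        found = re.search(r"CSSD signal (\d+) in thread (\w+)", message)
        if found:
            summary["signal"] = (int(found.group(1)), found.group(2))
    return summary


SAMPLE = [
    "2011-05-20 15:11:17.150: [    CSSD][3210733312]clssnmvDDiscThread: using discovery string  for initial discovery",
    "2011-05-20 15:11:17.152: [    CSSD][3210733312]clssnmvDiskVerify: Successful discovery of 0 disks",
    "2011-05-20 15:11:17.152: [    CSSD][3210733312]clssscExit: CSSD signal 11 in thread clssnmvDDiscThread",
]

if __name__ == "__main__":
    print(summarize_discovery(SAMPLE))
```

An empty discovery string together with zero discovered disks, as in this log, suggests looking at how CSSD is told where to find the voting files rather than at the `oracleasm` driver itself.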
                            • 11. Re: RAC nodes does not start after server reboot
                              Ganadeva
                              Hi,

                              Check Oracle Support note 1050164.1 and let us know if it helps.

                              Regards,
                              Ganadeva
                              • 12. Re: RAC nodes does not start after server reboot
                                727876
                                Hi Ganadeva,

                                Unfortunately, I don't have access to MOS.

                                Thanks.

                                I found a copy of the document here:
                                http://abiliusta.blogspot.com/2010/03/ora-15186-asmlib-mesg-operation-not.html
                                This looks to be specific to raw devices.

                                Edited by: user10506095 on May 20, 2011 6:15 PM
                                • 13. Re: RAC nodes does not start after server reboot
                                  Sebastian Solbach -Dba Community-Oracle
                                  Hi,

                                  Getting help from Oracle will be difficult anyway, since Oracle DB is not yet certified on RHEL 6 (let alone Oracle Grid Infrastructure).

                                  Regards
                                  Sebastian
                                  • 14. Re: RAC nodes does not start after server reboot
                                    727876
                                    Hi,

                                    I have gone through the notes "How to Troubleshoot Grid Infrastructure Startup Issues [ID 1050908.1]", "How to Validate Network and Name Resolution Setup for the Clusterware and RAC [ID 1054902.1]" and "What to Do if 11gR2 Clusterware is Unhealthy [ID 1068835.1]" without any success.

                                    The cssd process/agent won't start no matter what:
                                    ./crsctl stat res -t -init
                                    --------------------------------------------------------------------------------
                                    NAME           TARGET  STATE        SERVER                   STATE_DETAILS
                                    --------------------------------------------------------------------------------
                                    Cluster Resources
                                    --------------------------------------------------------------------------------
                                    ora.asm
                                          1        ONLINE  OFFLINE
                                    ora.crsd
                                          1        ONLINE  INTERMEDIATE orarac1
                                    ora.cssd
                                          1        ONLINE  OFFLINE
                                    ora.cssdmonitor
                                          1        ONLINE  ONLINE       orarac1
                                    ora.ctssd
                                          1        ONLINE  OFFLINE
                                    ora.diskmon
                                          1        ONLINE  OFFLINE
                                    ora.evmd
                                          1        ONLINE  ONLINE       orarac1
                                    ora.gipcd
                                          1        ONLINE  ONLINE       orarac1
                                    ora.gpnpd
                                          1        ONLINE  ONLINE       orarac1
                                    ora.mdnsd
                                          1        ONLINE  ONLINE       orarac1
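To see at a glance which resources in a listing like the one above are not where they should be, the output can be filtered for entries whose TARGET is ONLINE but whose STATE is anything else. This is a rough sketch, assuming the two-line layout shown here (a resource name starting with `ora.`, followed by an indented line with instance number, TARGET and STATE); `find_unhealthy` is a made-up helper name, not a `crsctl` feature.

```python
def find_unhealthy(output):
    """Return (resource, state) pairs whose TARGET is ONLINE but STATE is not.

    Assumes the 'crsctl stat res -t -init' layout pasted in this thread:
    a name line ('ora.xxx') followed by one or more instance lines.
    """
    unhealthy = []
    current = None
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("ora."):
            current = fields[0]  # remember which resource the next lines describe
        elif current and fields[0].isdigit() and len(fields) >= 3:
            target, state = fields[1], fields[2]
            if target == "ONLINE" and state != "ONLINE":
                unhealthy.append((current, state))
    return unhealthy


SAMPLE_OUTPUT = """\
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
ora.asm
      1        ONLINE  OFFLINE
ora.crsd
      1        ONLINE  INTERMEDIATE orarac1
ora.cssd
      1        ONLINE  OFFLINE
ora.cssdmonitor
      1        ONLINE  ONLINE       orarac1
"""

if __name__ == "__main__":
    for name, state in find_unhealthy(SAMPLE_OUTPUT):
        print(name, state)
```

Applied to the full listing above, this would flag ora.asm, ora.crsd, ora.cssd, ora.ctssd and ora.diskmon, which matches the picture of everything that depends on CSSD being stuck.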
                                    
                                    
                                    ./crsctl start res ora.cssd -init
                                    CRS-2672: Attempting to start 'ora.cssd' on 'orarac1'
                                    CRS-2672: Attempting to start 'ora.diskmon' on 'orarac1'
                                    CRS-2674: Start of 'ora.diskmon' on 'orarac1' failed
                                    CRS-2679: Attempting to clean 'ora.diskmon' on 'orarac1'
                                    CRS-2681: Clean of 'ora.diskmon' on 'orarac1' succeeded
                                    CRS-2674: Start of 'ora.cssd' on 'orarac1' failed
                                    CRS-2679: Attempting to clean 'ora.cssd' on 'orarac1'
                                    CRS-2681: Clean of 'ora.cssd' on 'orarac1' succeeded
                                    CRS-4000: Command Start failed, or completed with errors.
                                    Here are the logs for diskmon and cssd:
                                    ===========
                                    diskmon.log
                                    ===========
                                    2011-05-23 11:18:07.527: [ DISKMON][4899] dskm main: starting up
                                    2011-05-23 11:18:07.528: [ DISKMON][4899:2448107264] dskm_rac_thrd_main: running
                                    2011-05-23 11:18:07.528: [ DISKMON][4899:2465720064] dskm_rac_thrd_creat2: got the post from the css event handling thread
                                    2011-05-23 11:18:07.528: [ DISKMON][4899:2448107264] dskm_clss_ini1: calling clssscbinit
                                    2011-05-23 11:18:07.528: [ DISKMON][4899:2448107264] dskm_clss_ini2: calling clsssinit
                                    2011-05-23 11:18:07.528: [ DISKMON][4899:2437617408] dskm_oss_thrd_main: running
                                    2011-05-23 11:18:07.528: [ DISKMON][4899:2465720064] dskm_oss_thrd_creat2: got the post from the oss check status thread
                                    2011-05-23 11:18:07.529: [ DISKMON][4899:2465720064] dskm main: startup complete
                                    2011-05-23 11:18:07.529: [ DISKMON][4899:2465720064]            listening on -> /var/tmp/.oracle/master_diskmon
                                    2011-05-23 11:18:07.533: [ CSSCLNT]clssscConnect: gipc request failed with 29 (0x16)
                                    2011-05-23 11:18:07.533: [ CSSCLNT]clsssInitNative: connect failed, rc 29
                                    2011-05-23 11:18:08.034: [ DISKMON][4899:2448107264] dskm_clss_ini2: calling clsssinit
                                    2011-05-23 11:18:08.037: [ CSSCLNT]clssscConnect: gipc request failed with 29 (0x16)
                                    2011-05-23 11:18:08.037: [ CSSCLNT]clsssInitNative: connect failed, rc 29
                                    2011-05-23 11:18:08.538: [ DISKMON][4899:2448107264] dskm_clss_ini2: calling clsssinit
                                    2011-05-23 11:18:08.540: [ CSSCLNT]clssscConnect: gipc request failed with 29 (0x16)
                                    2011-05-23 11:18:08.541: [ CSSCLNT]clsssInitNative: connect failed, rc 29
                                    2011-05-23 11:18:09.041: [ DISKMON][4899:2448107264] dskm_clss_ini2: calling clsssinit
                                    2011-05-23 11:18:09.048: [ DISKMON][4899:2448107264] dskm_clss_ini5: successful clsssinit(), clssvers 2.1
                                    2011-05-23 11:18:09.048: [ DISKMON][4899:2448107264] dskm_clss_ini6: calling clssnsqlnum
                                    2011-05-23 11:18:09.149: [ CSSCLNT]clsssRecvMsg: got a disconnect from the server while waiting for message type 24
                                    2011-05-23 11:18:09.149: [ CSSCLNT]clssnsqlnum: RPC failed rc 3
                                    
                                    2011-05-23 11:18:09.149: [ DISKMON][4899:2448107264] dskm_clss_ini7: clssnsqclnum failed, clssret 3
                                    2011-05-23 11:18:09.149: [ DISKMON][4899:2448107264] dskm_rac_ini1: dskm_clss_ini failed with error 56830 ... exiting
                                    2011-05-23 11:18:09.149: [ DISKMON][4899:2448107264] dskm_nfy_kgzf1: notified thread kgzf disabled
                                    2011-05-23 11:18:09.149: [ DISKMON][4899:2448107264] dskm_rac_thrd_main3: dskm_rac_ini failed with error 56830 ... exiting
                                    2011-05-23 11:18:09.149: [ DISKMON][4899:2448107264] SHUTDOWN ABORT due to error 56830
                                    2011-05-23 11:18:09.149: [ DISKMON][4899:2448107264] dskm_rac_thrd_main: exiting
                                    2011-05-23 11:18:10.030: [ DISKMON][4899:2465720064] dskm_cleanup_thrds: cleaning up the rac event handling thread tid 2448107264
                                    2011-05-23 11:18:10.030: [ DISKMON][4899:2437617408] dskm_oss_thrd_main2: posted
                                    2011-05-23 11:18:10.030: [ DISKMON][4899:2437617408] dskm_oss_thrd_main: exiting
                                    [ DISKMON][4899]
                                            Process 4899 exiting on 2011-05-23 at 11:18:10.532
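Because diskmon.log and ocssd.log cover the same few seconds, it can help to interleave them by timestamp to see which daemon failed first. The sketch below is one possible way to do that, assuming every relevant line begins with a `YYYY-MM-DD HH:MM:SS.mmm:` timestamp as in these excerpts; lines without a leading timestamp are simply skipped, and `merge_logs` is a hypothetical name.

```python
import re
from datetime import datetime

# Leading timestamp as it appears in both logs, e.g. "2011-05-23 11:18:09.047: "
TS_RE = re.compile(r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): ")


def merge_logs(named_logs):
    """Interleave lines from several logs in timestamp order.

    named_logs maps a label (e.g. 'diskmon') to a list of log lines.
    Returns a list of (timestamp, label, line) tuples. Lines with equal
    timestamps keep the order in which they were supplied.
    """
    entries = []
    for label, lines in named_logs.items():
        for line in lines:
            match = TS_RE.match(line)
            if not match:
                continue  # no timestamp prefix: ignored in this sketch
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            entries.append((stamp, label, line.strip()))
    entries.sort(key=lambda entry: entry[0])  # stable sort preserves ties
    return entries


if __name__ == "__main__":
    logs = {
        "diskmon": [
            "2011-05-23 11:18:09.048: [ DISKMON][4899:2448107264] dskm_clss_ini5: successful clsssinit(), clssvers 2.1",
            "2011-05-23 11:18:09.149: [ CSSCLNT]clsssRecvMsg: got a disconnect from the server while waiting for message type 24",
        ],
        "ocssd": [
            "2011-05-23 11:18:09.047: [    CSSD][3613386496]clssnmvDiskVerify: Successful discovery of 0 disks",
        ],
    }
    for stamp, label, line in merge_logs(logs):
        print(label, line)
```

Read this way, the diskmon disconnect at 11:18:09.149 comes after CSSD reports zero discovered disks at 11:18:09.047, which is consistent with diskmon failing as a consequence of CSSD going down rather than the other way round.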
                                    
                                    
                                    
                                    
                                    =========
                                    ocssd.log
                                    =========
                                    2011-05-23 11:18:08.820: [    CSSD][3883386624]clssscmain: Starting CSS daemon, version 11.2.0.1.0, in (clustered) mode with uniqueness value 1306135088
                                    2011-05-23 11:18:08.821: [    CSSD][3883386624]clssscmain: Environment is production
                                    2011-05-23 11:18:08.821: [    CSSD][3883386624]clssscmain: Core file size limit extended
                                    2011-05-23 11:18:08.833: [    CSSD][3883386624]clssscGetParameterOLR: OLR fetch for parameter logsize (8) failed with rc 21
                                    2011-05-23 11:18:08.833: [    CSSD][3883386624]clssscSetPrivEnv: IPMI device not installed on this node
                                    2011-05-23 11:18:08.834: [    CSSD][3883386624]clssscGetParameterOLR: OLR fetch for parameter priority (15) failed with rc 21
                                    2011-05-23 11:18:08.863: [    CSSD][3883386624]clssscExtendLimits: The current soft limit for file descriptors is 65536, hard limit is 65536
                                    2011-05-23 11:18:08.863: [    CSSD][3883386624]clssscExtendLimits: The current soft limit for locked memory is 4294967295, hard limit is 4294967295
                                    2011-05-23 11:18:08.864: [    CSSD][3883386624]clssscmain: Running as user grid
                                    2011-05-23 11:18:08.874: [    CSSD][3883386624]clssscGetParameterOLR: OLR fetch for parameter auth rep (9) failed with rc 21
                                    2011-05-23 11:18:08.874: [    CSSD][3883386624]clssscGetParameterOLR: OLR fetch for parameter diagwait (14) failed with rc 21
                                    [  clsdmt][3853297408]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=orarac1DBG_CSSD))
                                    2011-05-23 11:18:08.882: [  clsdmt][3853297408]PID for the Process [4918], connkey 4
                                    2011-05-23 11:18:08.884: [    CSSD][3883386624]clssscmain: initgminfo done
                                    2011-05-23 11:18:08.891: [    CSSD][3623876352]clssgmclientlsnr: Spawned
                                    2011-05-23 11:18:08.892: [    CSSD][3623876352]clssgmEvtInformation: reqtype (13) cmProc ((nil)) client ((nil))
                                    2011-05-23 11:18:08.892: [    CSSD][3623876352]clssgmEvtInformation: reqtype (13) req (0x7f7ed0000920)
                                    2011-05-23 11:18:08.892: [    CSSD][3623876352]clssnmQueueNotification: type (13) 0x7f7ed0000920
                                    2011-05-23 11:18:08.893: [    CSSD][3623876352]clssscGetParameterOLR: OLR fetch for parameter GIPC client trclvl (13) failed with rc 21
                                    2011-05-23 11:18:08.895: [    CSSD][3623876352]clssgmclientlsnr: listening on clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_orarac1_)(GIPCID=01490991-00000000-4918))
                                    2011-05-23 11:18:08.896: [    GPnP][3883386624]clsgpnp_Init: [at clsgpnp0.c:404] gpnp tracelevel 3, component tracelevel 0
                                    2011-05-23 11:18:08.896: [    GPnP][3883386624]clsgpnp_Init: [at clsgpnp0.c:534] '/u01/app/11.2.0/grid' in effect as GPnP home base.
                                    2011-05-23 11:18:08.904: [    GPnP][3883386624]clsgpnp_InitCKProviders: [at clsgpnp0.c:3866] Init gpnp local security key providers (2) fatal if both fail
                                    2011-05-23 11:18:08.904: [    GPnP][3883386624]clsgpnp_InitCKProviders: [at clsgpnp0.c:3869] Init gpnp local security key proveders 1 of 2: file wallet (LSKP-FSW)
                                    2011-05-23 11:18:08.905: [    GPnP][3883386624]clsgpnpkwf_initwfloc: [at clsgpnpkwf.c:398] Using FS Wallet Location : /u01/app/11.2.0/grid/gpnp/orarac1/wallets/peer/
                                    
                                    2011-05-23 11:18:08.905: [    GPnP][3883386624]clsgpnp_InitCKProviders: [at clsgpnp0.c:3891] Init gpnp local security key provider 1 of 2: file wallet (LSKP-FSW) OK
                                    2011-05-23 11:18:08.905: [    GPnP][3883386624]clsgpnp_InitCKProviders: [at clsgpnp0.c:3897] Init gpnp local security key proveders 2 of 2: OLR wallet (LSKP-CLSW-OLR)
                                    [   CLWAL][3883386624]clsw_Initialize: OLR initlevel [70000]
                                    2011-05-23 11:18:08.908: [    GPnP][3883386624]clsgpnp_InitCKProviders: [at clsgpnp0.c:3919] Init gpnp local security key provider 2 of 2: OLR wallet (LSKP-CLSW-OLR) OK
                                    2011-05-23 11:18:08.908: [    GPnP][3883386624]clsgpnp_getCK: [at clsgpnp0.c:1950] <Get gpnp security keys (wallet) for id:1,typ;7. (2 providers - fatal if all fail)
                                    2011-05-23 11:18:08.908: [    GPnP][3883386624]clsgpnpkwf_getWalletPath: [at clsgpnpkwf.c:498] req_id=1 ck_prov_id=1 wallet path: /u01/app/11.2.0/grid/gpnp/orarac1/wallets/peer/
                                    2011-05-23 11:18:08.925: [    GPnP][3883386624]clsgpnpwu_walletfopen: [at clsgpnpwu.c:494] Opened SSO wallet: '/u01/app/11.2.0/grid/gpnp/orarac1/wallets/peer/cwallet.sso'
                                    2011-05-23 11:18:08.925: [    GPnP][3883386624]clsgpnp_getCK: [at clsgpnp0.c:1965] Result: (0) CLSGPNP_OK. Get gpnp wallet - provider 1 of 2 (LSKP-FSW(1))
                                    2011-05-23 11:18:08.925: [    GPnP][3883386624]clsgpnp_getCK: [at clsgpnp0.c:1982] Got gpnp security keys (wallet).>
                                    2011-05-23 11:18:08.931: [    GPnP][3883386624]clsgpnp_getCK: [at clsgpnp0.c:1950] <Get gpnp security keys (wallet) for id:1,typ;4. (2 providers - fatal if all fail)
                                    2011-05-23 11:18:08.931: [    GPnP][3883386624]clsgpnpkwf_getWalletPath: [at clsgpnpkwf.c:498] req_id=1 ck_prov_id=1 wallet path: /u01/app/11.2.0/grid/gpnp/orarac1/wallets/peer/
                                    2011-05-23 11:18:08.933: [    CSSD][3623876352]clssscSelect: cookie accept request 0x252f8a8
                                    2011-05-23 11:18:08.933: [    CSSD][3623876352]clssgmAllocProc: (0x7f7ed0028ee0) allocated
                                    2011-05-23 11:18:08.933: [    CSSD][3623876352]clssgmClientConnectMsg: properties of cmProc 0x7f7ed0028ee0 - 1,2,3,4
                                    2011-05-23 11:18:08.933: [    CSSD][3623876352]clssgmClientConnectMsg: Connect from con(0x69) proc(0x7f7ed0028ee0) pid(2650) version 11:2:1:4, properties: 1,2,3,4
                                    2011-05-23 11:18:08.933: [    CSSD][3623876352]clssgmClientConnectMsg: msg flags 0x0000
                                    2011-05-23 11:18:08.945: [    GPnP][3883386624]clsgpnpwu_walletfopen: [at clsgpnpwu.c:494] Opened SSO wallet: '/u01/app/11.2.0/grid/gpnp/orarac1/wallets/peer/cwallet.sso'
                                    2011-05-23 11:18:08.946: [    GPnP][3883386624]clsgpnp_getCK: [at clsgpnp0.c:1965] Result: (0) CLSGPNP_OK. Get gpnp wallet - provider 1 of 2 (LSKP-FSW(1))
                                    2011-05-23 11:18:08.946: [    GPnP][3883386624]clsgpnp_getCK: [at clsgpnp0.c:1982] Got gpnp security keys (wallet).>
                                    2011-05-23 11:18:08.946: [    GPnP][3883386624]clsgpnp_Init: [at clsgpnp0.c:837] GPnP client pid=4918, tl=3, f=3
                                    2011-05-23 11:18:08.946: [    GPnP][3883386624]clsgpnp_getDaemonLocalName: [at clsgpnp0.c:2576] Result: (0) CLSGPNP_OK. Local gpnp connect: 'ipc://GPNPD_orarac1'
                                    2011-05-23 11:18:08.947: [    GPnP][3883386624]clsgpnpm_exchange: [at clsgpnpm.c:1175] Calling "ipc://GPNPD_orarac1", try 1 of 500...
                                    2011-05-23 11:18:08.948: [    CSSD][3623876352]clssgmEvtInformation: reqtype (11) cmProc (0x7f7ed0028ee0) client ((nil))
                                    2011-05-23 11:18:08.948: [    CSSD][3623876352]clssgmEvtInformation: reqtype (11) req (0x7f7ed0032c10)
                                    2011-05-23 11:18:08.948: [    CSSD][3623876352]clssnmQueueNotification: type (11) 0x7f7ed0032c10
                                    2011-05-23 11:18:08.963: [    GPnP][3883386624]clsgpnp_profileVerifyForCall: [at clsgpnp.c:1867] Result: (87) CLSGPNP_SIG_VALPEER. Profile verified.  prf=0x273f810
                                    2011-05-23 11:18:08.963: [    GPnP][3883386624]clsgpnp_profileGetSequenceRef: [at clsgpnp.c:841] Result: (0) CLSGPNP_OK. seq of p=0x273f810 is '7'=7
                                    2011-05-23 11:18:08.963: [    GPnP][3883386624]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2186] Result: (0) CLSGPNP_OK. Successful get-profile CALL to remote "ipc://GPNPD_orarac1" disco ""
                                    2011-05-23 11:18:08.963: [    GPnP][3883386624]clsgpnp_getProfileEx: [at clsgpnp.c:540] Result: (0) CLSGPNP_OK. got profile 0x273f810
                                    2011-05-23 11:18:08.963: [    CSSD][3883386624]clssscGetParameterProfile: profile fetch failed for parameter ocrid (4) with return code 5
                                    2011-05-23 11:18:08.963: [    CSSD][3883386624]clssscmain: OCRID is 0
                                    2011-05-23 11:18:08.963: [    CSSD][3883386624]clssscmain: Cluster GUID is 6f027a4ae656efafbf3dc309bd5d9beb
                                    2011-05-23 11:18:08.963: [    CSSD][3883386624]clssnmNotifyReq: type (12)
                                    2011-05-23 11:18:08.965: [    CSSD][3883386624]clssscmain: last used node number 1
                                    2011-05-23 11:18:08.965: [    CSSD][3883386624]clssnmOpenGIPCEndp: opening cluster listener on gipc://orarac1:nm_orarac-cluster
                                    2011-05-23 11:18:08.965: [GIPCGMOD][3883386624]gipcmodGipcPassInitializeNetwork: Initializing passthrough GIPC
                                    2011-05-23 11:18:08.968: [    GPnP][3883386624]clsgpnp_Init: [at clsgpnp0.c:404] gpnp tracelevel 3, component tracelevel 0
                                    2011-05-23 11:18:08.968: [    GPnP][3883386624]clsgpnp_Init: [at clsgpnp0.c:534] '/u01/app/11.2.0/grid' in effect as GPnP home base.
                                    2011-05-23 11:18:08.975: [    GPnP][3883386624]clsgpnp_InitCKProviders: [at clsgpnp0.c:3866] Init gpnp local security key providers (2) fatal if both fail
                                    2011-05-23 11:18:08.975: [    GPnP][3883386624]clsgpnp_InitCKProviders: [at clsgpnp0.c:3869] Init gpnp local security key proveders 1 of 2: file wallet (LSKP-FSW)
                                    2011-05-23 11:18:08.975: [    GPnP][3883386624]clsgpnpkwf_initwfloc: [at clsgpnpkwf.c:398] Using FS Wallet Location : /u01/app/11.2.0/grid/gpnp/orarac1/wallets/peer/
                                    
                                    2011-05-23 11:18:08.975: [    GPnP][3883386624]clsgpnp_InitCKProviders: [at clsgpnp0.c:3891] Init gpnp local security key provider 1 of 2: file wallet (LSKP-FSW) OK
                                    2011-05-23 11:18:08.975: [    GPnP][3883386624]clsgpnp_InitCKProviders: [at clsgpnp0.c:3897] Init gpnp local security key proveders 2 of 2: OLR wallet (LSKP-CLSW-OLR)
                                    [   CLWAL][3883386624]clsw_Initialize: OLR initlevel [70000]
                                    2011-05-23 11:18:08.978: [    GPnP][3883386624]clsgpnp_InitCKProviders: [at clsgpnp0.c:3919] Init gpnp local security key provider 2 of 2: OLR wallet (LSKP-CLSW-OLR) OK
                                    2011-05-23 11:18:08.978: [    GPnP][3883386624]clsgpnp_getCK: [at clsgpnp0.c:1950] <Get gpnp security keys (wallet) for id:1,typ;7. (2 providers - fatal if all fail)
                                    2011-05-23 11:18:08.978: [    GPnP][3883386624]clsgpnpkwf_getWalletPath: [at clsgpnpkwf.c:498] req_id=1 ck_prov_id=1 wallet path: /u01/app/11.2.0/grid/gpnp/orarac1/wallets/peer/
                                    2011-05-23 11:18:08.995: [    GPnP][3883386624]clsgpnpwu_walletfopen: [at clsgpnpwu.c:494] Opened SSO wallet: '/u01/app/11.2.0/grid/gpnp/orarac1/wallets/peer/cwallet.sso'
                                    2011-05-23 11:18:08.995: [    GPnP][3883386624]clsgpnp_getCK: [at clsgpnp0.c:1965] Result: (0) CLSGPNP_OK. Get gpnp wallet - provider 1 of 2 (LSKP-FSW(1))
                                    2011-05-23 11:18:08.995: [    GPnP][3883386624]clsgpnp_getCK: [at clsgpnp0.c:1982] Got gpnp security keys (wallet).>
                                    2011-05-23 11:18:09.000: [    GPnP][3883386624]clsgpnp_getCK: [at clsgpnp0.c:1950] <Get gpnp security keys (wallet) for id:1,typ;4. (2 providers - fatal if all fail)
                                    2011-05-23 11:18:09.000: [    GPnP][3883386624]clsgpnpkwf_getWalletPath: [at clsgpnpkwf.c:498] req_id=1 ck_prov_id=1 wallet path: /u01/app/11.2.0/grid/gpnp/orarac1/wallets/peer/
                                    2011-05-23 11:18:09.014: [    GPnP][3883386624]clsgpnpwu_walletfopen: [at clsgpnpwu.c:494] Opened SSO wallet: '/u01/app/11.2.0/grid/gpnp/orarac1/wallets/peer/cwallet.sso'
                                    2011-05-23 11:18:09.015: [    GPnP][3883386624]clsgpnp_getCK: [at clsgpnp0.c:1965] Result: (0) CLSGPNP_OK. Get gpnp wallet - provider 1 of 2 (LSKP-FSW(1))
                                    2011-05-23 11:18:09.015: [    GPnP][3883386624]clsgpnp_getCK: [at clsgpnp0.c:1982] Got gpnp security keys (wallet).>
                                    2011-05-23 11:18:09.015: [    GPnP][3883386624]clsgpnp_Init: [at clsgpnp0.c:837] GPnP client pid=4918, tl=3, f=3
                                    2011-05-23 11:18:09.029: [GIPCXCPT][3883386624]gipcShutdownF: skipping shutdown, count 3, from [ clsinet.c : 1732], ret gipcretSuccess (0)
                                    2011-05-23 11:18:09.031: [GIPCXCPT][3883386624]gipcShutdownF: skipping shutdown, count 2, from [ clsgpnp0.c : 1021], ret gipcretSuccess (0)
                                    2011-05-23 11:18:09.032: [GIPCGMOD][3883386624]gipcmodGipcPassInitializeNetwork: using host information 10.10.1.7
                                    2011-05-23 11:18:09.036: [    CSSD][3883386624]clssnmOpenGIPCEndp: listening on gipc://orarac1:nm_orarac-cluster#10.10.1.7#62946
                                    2011-05-23 11:18:09.036: [    CSSD][3883386624]clssnmInitNMInfo: Initializing uniqueness 0
                                    2011-05-23 11:18:09.037: [    CSSD][3883386624]clssnmReadDiscoveryProfile: voting file discovery string()
                                    2011-05-23 11:18:09.037: [    CSSD][3883386624]clssnkInit: NK generic layer initializing.
                                    2011-05-23 11:18:09.038: [    CSSD][3883386624]clssscGetParameterOLR: OLR fetch for parameter GIPC NM trclvl (12) failed with rc 21
                                    2011-05-23 11:18:09.046: [   SKGFD][3613386496]NOTE: No asm libraries found in the system
                                    
                                    2011-05-23 11:18:09.046: [    CLSF][3613386496]Allocated CLSF context
                                    2011-05-23 11:18:09.046: [    CSSD][3613386496]clssnmvDDiscThread: using discovery string  for initial discovery
                                    2011-05-23 11:18:09.046: [   SKGFD][3613386496]Discovery with str::
                                    
                                    2011-05-23 11:18:09.046: [   SKGFD][3613386496]UFS discovery with ::
                                    
                                    2011-05-23 11:18:09.046: [   SKGFD][3613386496]Fetching UFS disk :/dev/raw/rawctl:
                                    
                                    2011-05-23 11:18:09.046: [    CLSF][3613386496]Ignoring 0-byte file /dev/raw/rawctl
                                    
                                    2011-05-23 11:18:09.046: [   SKGFD][3613386496]OSS discovery with ::
                                    
                                    2011-05-23 11:18:09.047: [    CSSD][3623876352]clssscSelect: cookie accept request 0x252f8a8
                                    2011-05-23 11:18:09.047: [    CSSD][3623876352]clssgmAllocProc: (0x7f7ed002b9d0) allocated
                                    2011-05-23 11:18:09.047: [    CSSD][3623876352]clssgmClientConnectMsg: properties of cmProc 0x7f7ed002b9d0 - 1,2,3,4
                                    2011-05-23 11:18:09.047: [    CSSD][3623876352]clssgmClientConnectMsg: Connect from con(0x16b) proc(0x7f7ed002b9d0) pid(4899) version 11:2:1:4, properties: 1,2,3,4
                                    2011-05-23 11:18:09.047: [    CSSD][3623876352]clssgmClientConnectMsg: msg flags 0x0000
                                    2011-05-23 11:18:09.047: [    CSSD][3613386496]clssnmvDiskVerify: Successful discovery of 0 disks
                                    2011-05-23 11:18:09.048: [    CSSD][3613386496]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
                                    2011-05-23 11:18:09.048: [    CSSD][3613386496]clssnmvFindInitialConfigs: No voting files found
                                    2011-05-23 11:18:09.048: [    CSSD][3613386496]###################################
                                    2011-05-23 11:18:09.048: [    CSSD][3613386496]clssscExit: CSSD signal 11 in thread clssnmvDDiscThread
                                    2011-05-23 11:18:09.048: [    CSSD][3613386496]###################################
                                    2011-05-23 11:18:09.048: [    CSSD][3623876352]clssgmClientShutdown: total iocapables 0
                                    2011-05-23 11:18:09.048: [    CSSD][3623876352]clssgmClientShutdown: graceful shutdown completed.
                                    2011-05-23 11:18:09.048: [    CSSD][3613386496]
                                    
                                    ----- Call Stack Trace -----
                                    2011-05-23 11:18:09.048: [    CSSD][3613386496]calling              call     entry                argument values in hex
                                    2011-05-23 11:18:09.048: [    CSSD][3613386496]location             type     point                (? means dubious value)
                                    2011-05-23 11:18:09.048: [    CSSD][3613386496]-------------------- -------- -------------------- ----------------------------
                                    2011-05-23 11:18:09.060: [    CSSD][3613386496]clssscExit()+594     call     kgdsdst()            000000000 ? 000000000 ?
                                    2011-05-23 11:18:09.060: [    CSSD][3613386496]                                                   7F7ED75F8CD8 ? 000000001 ?
                                    2011-05-23 11:18:09.060: [    CSSD][3613386496]                                                   7F7ED75FD1D8 ? 000000000 ?
                                    2011-05-23 11:18:09.061: [    CSSD][3613386496]s0clsssc_sighandler  call     clssscExit()         002744F40 ? 000000001 ?
                                    2011-05-23 11:18:09.061: [    CSSD][3613386496]()+611                                             7F7ED75F8CD8 ? 000000001 ?
                                    2011-05-23 11:18:09.061: [    CSSD][3613386496]                                                   7F7ED75FD1D8 ? 000000000 ?
                                    2011-05-23 11:18:09.061: [    CSSD][3613386496]__restore_rt()       call     s0clsssc_sighandler  00000000B ? 000000001 ?
                                    2011-05-23 11:18:09.061: [    CSSD][3613386496]                              ()                   7F7ED75F8CD8 ? 000000001 ?
                                    2011-05-23 11:18:09.061: [    CSSD][3613386496]                                                   7F7ED75FD1D8 ? 000000000 ?
                                    2011-05-23 11:18:09.061: [    CSSD][3613386496]clssnmCompleteInitV  signal   __restore_rt()       000000001 ? 000000000 ?
                                    2011-05-23 11:18:09.061: [    CSSD][3613386496]FDiscovery()+185                                   00259DF28 ? 002434B88 ?
                                    2011-05-23 11:18:09.061: [    CSSD][3613386496]                                                   00259F3B0 ? 000001340 ?
                                    2011-05-23 11:18:09.061: [    CSSD][3613386496]clssnmvDDiscThread(  call     clssnmCompleteInitV  002744F40 ? 00259DF10 ?
                                    2011-05-23 11:18:09.061: [    CSSD][3613386496])+2062                        FDiscovery()         002744EE0 ? 000000001 ?
                                    2011-05-23 11:18:09.061: [    CSSD][3613386496]                                                   00259F3B0 ? 000001340 ?
                                    2011-05-23 11:18:09.062: [    CSSD][3613386496]clssscthrdmain()+20  call     clssnmvDDiscThread(  002744F40 ? 00259DF10 ?
                                    2011-05-23 11:18:09.062: [    CSSD][3613386496]6                             )                    002744EE0 ? 000000001 ?
                                    2011-05-23 11:18:09.062: [    CSSD][3613386496]                                                   00259F3B0 ? 000001340 ?
                                    2011-05-23 11:18:09.062: [    CSSD][3613386496]start_thread()+209   call     clssscthrdmain()     002744F40 ? 00259DF10 ?
                                    2011-05-23 11:18:09.062: [    CSSD][3613386496]                                                   00259DF10 ? 000000001 ?
                                    2011-05-23 11:18:09.062: [    CSSD][3613386496]                                                   00259F3B0 ? 000001340 ?
                                    2011-05-23 11:18:09.062: [    CSSD][3613386496]clone()+109          call     start_thread()       7F7ED75FE700 ? 00259DF10 ?
                                    2011-05-23 11:18:09.062: [    CSSD][3613386496]                                                   00259DF10 ? 000000001 ?
                                    2011-05-23 11:18:09.062: [    CSSD][3613386496]                                                   00259F3B0 ? 000001340 ?
                                    2011-05-23 11:18:09.062: [    CSSD][3613386496]0000000000000000     call     clone()              7F7ED75FE700 ? 00259DF10 ?
                                    2011-05-23 11:18:09.062: [    CSSD][3613386496]                                                   00259DF10 ? 000000001 ?
                                    2011-05-23 11:18:09.062: [    CSSD][3613386496]                                                   00259F3B0 ? 000001340 ?
                                    2011-05-23 11:18:09.062: [    CSSD][3613386496]
                                    2011-05-23 11:18:09.063: [    CSSD][3613386496]--------------------- Binary Stack Dump ---------------------
                                    2011-05-23 11:18:09.063: [    CSSD][3613386496]
                                    2011-05-23 11:18:09.063: [    CSSD][3613386496]========== FRAME [1] (clssscExit()+594 -> kgdsdst()) ==========
                                    2011-05-23 11:18:09.063: [    CSSD][3613386496]defined by frame pointers 0x7f7ed75fd610  and 0x7f7ed75fd540
                                    2011-05-23 11:18:09.063: [    CSSD][3613386496]CALL TYPE: call   ERROR SIGNALED: no   CALLER: clssscExit
                                    2011-05-23 11:18:09.063: [    CSSD][3613386496]RDI 0000000000000000 RSI 0000000000000000 RDX 00007F7ED75F8CD8
                                    2011-05-23 11:18:09.063: [    CSSD][3613386496]RCX 0000000000000001 R8 00007F7ED75FD1D8 R9 0000000000000000
                                    2011-05-23 11:18:09.063: [    CSSD][3613386496]RAX 0000000000000000 RBX 0000000002744F70 RBP 00007F7ED75FD610
                                    2011-05-23 11:18:09.063: [    CSSD][3613386496]R10 D75FD55000000000 R11 0000000000000000 R12 00007FFF0CBCCF40
                                    2011-05-23 11:18:09.063: [    CSSD][3613386496]R13 00007F7ED75FE9C0 R14 0000000000000004 R15 0000000000000007
                                    2011-05-23 11:18:09.063: [    CSSD][3613386496]RSP 00007F7ED75FD550 RIP 0000000000447102
                                    2011-05-23 11:18:09.063: [    CSSD][3613386496]
                                    Dump of memory from 0x7f7ed75fd540 to 0x7f7ed75fd610
                                    It definitely looks like CSSD can't find the voting disks, but my ASM disks look fine:
                                    oracleasm status
                                    Checking if ASM is loaded: yes
                                    Checking if /dev/oracleasm is mounted: yes
                                    
                                    oracleasm listdisks
                                    CRSVOL01
                                    CRSVOL02
                                    CRSVOL03
                                    DATAVOL01
                                    DATAVOL02
                                    FRAVOL01
                                    FRAVOL02
                                    
                                    ll /dev/oracleasm/disks/
                                    total 0
                                    brw-rw----. 1 grid asmadmin 8,  49 May 23 10:44 CRSVOL01
                                    brw-rw----. 1 grid asmadmin 8,  50 May 23 10:44 CRSVOL02
                                    brw-rw----. 1 grid asmadmin 8,  51 May 23 10:44 CRSVOL03
                                    brw-rw----. 1 grid asmadmin 8,  81 May 23 10:44 DATAVOL01
                                    brw-rw----. 1 grid asmadmin 8,  33 May 23 10:44 DATAVOL02
                                    brw-rw----. 1 grid asmadmin 8,  97 May 23 10:44 FRAVOL01
                                    brw-rw----. 1 grid asmadmin 8, 113 May 23 10:44 FRAVOL02
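To make the mismatch easier to see, the voting-file discovery lines can be filtered out of the CSSD log on their own. In the excerpt above the discovery string is empty, SKGFD reports no ASM libraries, and CSSD discovers 0 disks, even though oracleasm lists all seven. The following is only a minimal sketch that assumes the log format shown above; the function names css_discovery_lines and css_disks_found are hypothetical helpers, not Oracle tools, and the log path is passed in as an argument.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Oracle Grid Infrastructure).
# Both take the path to a CSSD log file as their only argument.

# Print only the lines that describe voting-file discovery.
css_discovery_lines() {
    grep -E \
        -e 'clssnmReadDiscoveryProfile' \
        -e 'No asm libraries found' \
        -e 'using discovery string' \
        -e 'Fetching UFS disk' \
        -e 'clssnmvDiskVerify' \
        -e 'clssnmvFindInitialConfigs' \
        "$1"
}

# Print the number of disks CSSD says it discovered, one number per attempt.
css_disks_found() {
    sed -n 's/.*clssnmvDiskVerify: Successful discovery of \([0-9][0-9]*\) disks.*/\1/p' "$1"
}
```

Run against the log above, css_disks_found would print 0 for this start attempt, which is the line to compare against the seven disks that oracleasm reports.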
                                    Can someone please give me a clue about what to look at next?