6 Replies Latest reply: Apr 5, 2013 4:10 AM by user9043667

    Could not contact Oracle High Availability Services

    user9043667
      Hi all,

      I set up Oracle 11g R2 RAC on two Solaris 10 x64 machines.

      Issue: After I restarted one node, the ASM instance is down and none of the ASM disks are mounted on Node1.

      Troubleshooting output:
      bash-3.00# ./crsctl check crs
      CRS-4639: Could not contact Oracle High Availability Services
      bash-3.00# ./crsctl stat res -t
      CRS-4535: Cannot communicate with Cluster Ready Services
      CRS-4000: Command Status failed, or completed with errors.
      bash-3.00# ./ocrcheck
      PROT-602: Failed to retrieve data from the cluster registry
      PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=8, opn=kgfolclcpi1, dep=204, loc=kgfokge
      AMDU-00204: Disk N0012 is in currently mounted diskgroup OCR2
      AMDU-00201: Disk N0012: '/dev/rdsk/c4t13d0s4'] [8]

      Logs from GRID_HOME/log/Node1

      2013-04-04 16:25:15.953
      [client(1291)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
      2013-04-04 16:25:50.752
      [client(1291)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /export/home/u01/oracle/app/11.2.0/grid/log/node1/client/ocrconfig_1291.log.
      2013-04-04 16:27:01.315
      [client(1318)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
      2013-04-04 16:27:36.284
      [client(1318)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /export/home/u01/oracle/app/11.2.0/grid/log/node1/client/ocrconfig_1318.log.
      2013-04-04 16:27:39.656
      [client(1318)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
      2013-04-04 16:28:13.822
      [client(1318)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /export/home/u01/oracle/app/11.2.0/grid/log/node1/client/ocrconfig_1318.log.
      2013-04-04 16:30:53.393
      [client(1336)]Invalid msg id [13]
      2013-04-04 16:33:24.125
      [client(1354)]Invalid msg id [13]
      2013-04-04 16:40:42.680
      [client(1463)]Invalid msg id [13]
      2013-04-04 17:04:50.892
      [client(2107)]Invalid msg id [13]
      2013-04-04 17:04:58.935
      [client(2116)]Invalid msg id [13]
      2013-04-04 17:07:04.611
      [client(2126)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /export/home/u01/oracle/app/11.2.0/grid/log/node1/client/ocrconfig_2126.log.
      2013-04-04 17:07:04.625
      [client(2126)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /export/home/u01/oracle/app/11.2.0/grid/log/node1/client/ocrconfig_2126.log.
      2013-04-05 09:57:08.690
      [client(2846)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
      2013-04-05 09:57:43.518
      [client(2846)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /export/home/u01/oracle/app/11.2.0/grid/log/node1/client/ocrcheck_2846.log.
      2013-04-05 09:57:46.902
      [client(2846)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
      2013-04-05 09:58:21.425
      [client(2846)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /export/home/u01/oracle/app/11.2.0/grid/log/node1/client/ocrcheck_2846.log.
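
To see which errors dominate an excerpt like the one above, the CRS message codes can be tallied. This is only a sketch: the function name count_crs_codes and its single file-path argument are made up for illustration, and it uses nothing beyond sed, sort, uniq and awk.

```shell
# Hypothetical helper: tally CRS-nnnn codes in a clusterware alert log.
# $1 is the path to the log file (an assumed argument, not an Oracle tool).
count_crs_codes() {
    # Keep only lines that carry a "CRS-nnnn:" code and print the code itself,
    # then count how often each code appears.
    sed -n 's/.*\(CRS-[0-9][0-9]*\):.*/\1/p' "$1" |
        sort |
        uniq -c |
        awk '{ print $2 "=" $1 }'
}
```

Run against a file holding the excerpt above, this should report CRS-1013=7 and CRS-2302=5, which points at the inaccessible OCR location and the missing GPNPD daemon as the two recurring symptoms.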


      Is there any way to bring up the ASM instance or CRS?

      Regards,
      Bob
        • 1. Re: Could not contact Oracle High Availability Services
          Hemant K Chitale
          Surely you should address the error:
          The OCR location in an ASM disk group is inaccessible


          Apparently the node is disconnected from the storage?




          Hemant K Chitale
          • 2. Re: Could not contact Oracle High Availability Services
            user9043667
            Hi Hemant,

            Thanks for the update.
            But when I run echo | format, I can see my disks.

            bash-3.00# echo | format
            Searching for disks...done


            AVAILABLE DISK SELECTIONS:
            0. c2d0 <DEFAULT cyl 9726 alt 2 hd 255 sec 63>
            /pci@0,0/pci-ide@1f,2/ide@0/cmdk@0,0
            1. c4t10d0 <DEFAULT cyl 1824 alt 2 hd 255 sec 63>
            /iscsi/disk@0000iqn.2008-08.com.starwindsoftware%3A172.31.3.125-dataFFFF,0
            2. c4t11d0 <ROCKET-IMAGEFILE-0001 cyl 1171 alt 2 hd 255 sec 63>
            /iscsi/disk@0000iqn.2008-08.com.starwindsoftware%3A172.31.3.125-archlogFFFF,0
            3. c4t12d0 <DEFAULT cyl 1020 alt 2 hd 64 sec 32>
            /iscsi/disk@0000iqn.2008-08.com.starwindsoftware%3A172.31.3.125-vote3FFFF,0
            4. c4t13d0 <DEFAULT cyl 1020 alt 2 hd 64 sec 32>
            /iscsi/disk@0000iqn.2008-08.com.starwindsoftware%3A172.31.3.125-vote2FFFF,0
            5. c4t14d0 <DEFAULT cyl 1020 alt 2 hd 64 sec 32>
            /iscsi/disk@0000iqn.2008-08.com.starwindsoftware%3A172.31.3.125-vote1FFFF,0
            6. c4t15d0 <DEFAULT cyl 1020 alt 2 hd 64 sec 32>
            /iscsi/disk@0000iqn.2008-08.com.starwindsoftware%3A172.31.3.125-ocr2FFFF,0
            7. c4t16d0 <DEFAULT cyl 1020 alt 2 hd 64 sec 32>
            /iscsi/disk@0000iqn.2008-08.com.starwindsoftware%3A172.31.3.125-ocr1FFFF,0
            Specify disk (enter its number): Specify disk (enter its number):
            bash-3.00#

            Regards,
            Bob
            • 3. Re: Could not contact Oracle High Availability Services
              Hemant K Chitale
              What do the log files show?

              /export/home/u01/oracle/app/11.2.0/grid/log/node1/client/ocrconfig_2126.log
              /export/home/u01/oracle/app/11.2.0/grid/log/node1/client/ocrcheck_2846.log


              Hemant K Chitale
              • 4. Re: Could not contact Oracle High Availability Services
                user9043667
                Hi Hemant,

                Here are the logs.

                bash-3.00# cat /export/home/u01/oracle/app/11.2.0/grid/log/node1/client/ocrconfig_2126.log
                Oracle Database 11g Clusterware Release 11.2.0.1.0 - Production Copyright 1996, 2009 Oracle. All rights reserved.
                2013-04-04 17:07:02.916: [ OCRCONF][1]ocrconfig starts...
                2013-04-04 17:07:02.917: [ OCRCONF][1]Importing OCR data from +OCR1


                bash-3.00# cat /export/home/u01/oracle/app/11.2.0/grid/log/node1/client/ocrcheck_2846.log
                Oracle Database 11g Clusterware Release 11.2.0.1.0 - Production Copyright 1996, 2009 Oracle. All rights reserved.
                2013-04-05 09:57:05.366: [OCRCHECK][1]ocrcheck starts...
                bash-3.00#


                Regards,
                Bob
                • 5. Re: Could not contact Oracle High Availability Services
                  Santysharma-Oracle
                  Hi,

                  Please check whether cssd and ASM are up first:

                  crsctl stat res -t -init

                  Next, check the ASM instance alert log for any issues with the OCR disk group.

                  Regards,
                  Sharma
                  • 6. Re: Could not contact Oracle High Availability Services
                    user9043667
                    Hi Santy,

                    Here is the output.

                    bash-3.00# ./crsctl stat res -t -init
                    CRS-4639: Could not contact Oracle High Availability Services
                    CRS-4000: Command Status failed, or completed with errors.
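
CRS-4639 means crsctl could not reach the OHASD daemon on this node, so a first step is usually to confirm whether an ohasd.bin process exists at all. A minimal sketch, assuming ps -ef output is piped in; the function name has_ohasd is hypothetical:

```shell
# Hypothetical helper: read `ps -ef` output on stdin and succeed
# (exit status 0) only if an ohasd.bin process appears in it.
has_ohasd() {
    # The second grep drops any line that is itself a grep command.
    grep 'ohasd\.bin' | grep -v 'grep' >/dev/null
}

# Example use on a live node (not run here):
#   ps -ef | has_ohasd && echo "ohasd is running" || echo "ohasd is not running"
```

If no process is listed, running crsctl start crs as root is the usual way to start the stack by hand in 11.2. Whether that succeeds here depends on why OHASD did not come up after the reboot.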

                    ASM log output:

                    NOTE: cache began mount (first) of group VOTE_DG number=5 incarn=0x9b70450e
                    NOTE: Assigning number (1,0) to disk (/dev/rdsk/c4t10d0s4)
                    NOTE: Assigning number (2,0) to disk (/dev/rdsk/c4t11d0s4)
                    NOTE: Assigning number (3,0) to disk (/dev/rdsk/c4t12d0s4)
                    NOTE: Assigning number (4,0) to disk (/dev/rdsk/c4t13d0s4)
                    NOTE: Assigning number (5,0) to disk (/dev/rdsk/c4t14d0s4)
                    NOTE: Assigning number (5,1) to disk (/dev/rdsk/c4t15d0s4)
                    NOTE: Assigning number (5,2) to disk (/dev/rdsk/c4t16d0s4)
                    Mon Apr 01 14:33:18 2013
                    NOTE: start heartbeating (grp 1)
                    kfdp_query(DATA): 7
                    kfdp_queryBg(): 7
                    NOTE: cache opening disk 0 of grp 1: DATA_0000 path:/dev/rdsk/c4t10d0s4
                    NOTE: F1X0 found on disk 0 au 2 fcn 0.0
                    NOTE: cache mounting (first) external redundancy group 1/0x9B40450A (DATA)
                    Mon Apr 01 14:33:19 2013
                    * allocate domain 1, invalid = TRUE
                    Mon Apr 01 14:33:19 2013
                    NOTE: attached to recovery domain 1
                    NOTE: starting recovery of thread=1 ckpt=8.216 group=1 (DATA)
                    NOTE: advancing ckpt for thread=1 ckpt=8.223
                    NOTE: cache recovered group 1 to fcn 0.3049
                    Mon Apr 01 14:33:19 2013
                    NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (DATA)
                    NOTE: LGWR found thread 1 closed at ABA 8.222
                    NOTE: LGWR mounted thread 1 for diskgroup 1 (DATA)
                    NOTE: LGWR opening thread 1 at fcn 0.3049 ABA 9.223
                    NOTE: cache mounting group 1/0x9B40450A (DATA) succeeded
                    NOTE: cache ending mount (success) of group DATA number=1 incarn=0x9b40450a
                    NOTE: start heartbeating (grp 2)
                    kfdp_query(FRA): 9
                    kfdp_queryBg(): 9
                    NOTE: cache opening disk 0 of grp 2: FRA_0000 path:/dev/rdsk/c4t11d0s4
                    NOTE: F1X0 found on disk 0 au 2 fcn 0.0
                    NOTE: cache mounting (first) external redundancy group 2/0x9B50450B (FRA)
                    * allocate domain 2, invalid = TRUE
                    NOTE: attached to recovery domain 2
                    NOTE: starting recovery of thread=1 ckpt=7.41 group=2 (FRA)
                    NOTE: advancing ckpt for thread=1 ckpt=7.41
                    NOTE: cache recovered group 2 to fcn 0.722
                    NOTE: LGWR attempting to mount thread 1 for diskgroup 2 (FRA)
                    NOTE: LGWR found thread 1 closed at ABA 7.40
                    NOTE: LGWR mounted thread 1 for diskgroup 2 (FRA)
                    NOTE: LGWR opening thread 1 at fcn 0.722 ABA 8.41
                    NOTE: cache mounting group 2/0x9B50450B (FRA) succeeded
                    NOTE: cache ending mount (success) of group FRA number=2 incarn=0x9b50450b
                    NOTE: start heartbeating (grp 3)
                    kfdp_query(OCR1): 11
                    kfdp_queryBg(): 11
                    NOTE: cache opening disk 0 of grp 3: OCR1_0000 path:/dev/rdsk/c4t12d0s4
                    NOTE: F1X0 found on disk 0 au 2 fcn 0.0
                    NOTE: cache mounting (first) external redundancy group 3/0x9B60450C (OCR1)
                    * allocate domain 3, invalid = TRUE
                    NOTE: attached to recovery domain 3
                    NOTE: starting recovery of thread=1 ckpt=7.43 group=3 (OCR1)
                    NOTE: advancing ckpt for thread=1 ckpt=7.43
                    NOTE: cache recovered group 3 to fcn 0.724
                    NOTE: LGWR attempting to mount thread 1 for diskgroup 3 (OCR1)
                    NOTE: LGWR found thread 1 closed at ABA 7.42
                    NOTE: LGWR mounted thread 1 for diskgroup 3 (OCR1)
                    NOTE: LGWR opening thread 1 at fcn 0.724 ABA 8.43
                    NOTE: cache mounting group 3/0x9B60450C (OCR1) succeeded
                    NOTE: cache ending mount (success) of group OCR1 number=3 incarn=0x9b60450c
                    NOTE: start heartbeating (grp 4)
                    kfdp_query(OCR2): 13
                    kfdp_queryBg(): 13
                    NOTE: cache opening disk 0 of grp 4: OCR2_0000 path:/dev/rdsk/c4t13d0s4
                    NOTE: F1X0 found on disk 0 au 2 fcn 0.0
                    NOTE: cache mounting (first) external redundancy group 4/0x9B60450D (OCR2)
                    * allocate domain 4, invalid = TRUE
                    NOTE: attached to recovery domain 4
                    NOTE: starting recovery of thread=1 ckpt=7.41 group=4 (OCR2)
                    NOTE: advancing ckpt for thread=1 ckpt=7.41
                    NOTE: cache recovered group 4 to fcn 0.717
                    NOTE: LGWR attempting to mount thread 1 for diskgroup 4 (OCR2)
                    NOTE: LGWR found thread 1 closed at ABA 7.40
                    NOTE: LGWR mounted thread 1 for diskgroup 4 (OCR2)
                    NOTE: LGWR opening thread 1 at fcn 0.717 ABA 8.41
                    NOTE: cache mounting group 4/0x9B60450D (OCR2) succeeded
                    NOTE: cache ending mount (success) of group OCR2 number=4 incarn=0x9b60450d
                    NOTE: start heartbeating (grp 5)
                    kfdp_query(VOTE_DG): 15
                    kfdp_queryBg(): 15
                    NOTE: cache opening disk 0 of grp 5: VOTE_DG_0000 path:/dev/rdsk/c4t14d0s4
                    NOTE: F1X0 found on disk 0 au 2 fcn 0.0
                    NOTE: cache opening disk 1 of grp 5: VOTE_DG_0001 path:/dev/rdsk/c4t15d0s4
                    NOTE: F1X0 found on disk 1 au 2 fcn 0.0
                    NOTE: cache opening disk 2 of grp 5: VOTE_DG_0002 path:/dev/rdsk/c4t16d0s4
                    NOTE: F1X0 found on disk 2 au 2 fcn 0.0
                    NOTE: cache mounting (first) normal redundancy group 5/0x9B70450E (VOTE_DG)
                    * allocate domain 5, invalid = TRUE
                    NOTE: attached to recovery domain 5
                    Mon Apr 01 14:33:21 2013
                    NOTE: starting recovery of thread=1 ckpt=4.12 group=5 (VOTE_DG)
                    NOTE: advancing ckpt for thread=1 ckpt=4.12
                    NOTE: cache recovered group 5 to fcn 0.73
                    NOTE: LGWR attempting to mount thread 1 for diskgroup 5 (VOTE_DG)
                    NOTE: LGWR found thread 1 closed at ABA 4.11
                    NOTE: LGWR mounted thread 1 for diskgroup 5 (VOTE_DG)
                    NOTE: LGWR opening thread 1 at fcn 0.73 ABA 5.12
                    NOTE: cache mounting group 5/0x9B70450E (VOTE_DG) succeeded
                    NOTE: cache ending mount (success) of group VOTE_DG number=5 incarn=0x9b70450e
                    Mon Apr 01 14:33:22 2013
                    kfdp_query(DATA): 16
                    kfdp_queryBg(): 16
                    NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 1
                    SUCCESS: diskgroup DATA was mounted
                    kfdp_query(FRA): 17
                    kfdp_queryBg(): 17
                    NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 2
                    SUCCESS: diskgroup FRA was mounted
                    kfdp_query(OCR1): 18
                    kfdp_queryBg(): 18
                    NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 3
                    SUCCESS: diskgroup OCR1 was mounted
                    kfdp_query(OCR2): 19
                    kfdp_queryBg(): 19
                    NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 4
                    SUCCESS: diskgroup OCR2 was mounted
                    kfdp_query(VOTE_DG): 20
                    kfdp_queryBg(): 20
                    NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 5
                    SUCCESS: diskgroup VOTE_DG was mounted
                    SUCCESS: ALTER DISKGROUP ALL MOUNT /* asm agent */
                    SQL> ALTER DISKGROUP ALL ENABLE VOLUME ALL /* asm agent */
                    SUCCESS: ALTER DISKGROUP ALL ENABLE VOLUME ALL /* asm agent */
                    Mon Apr 01 14:33:23 2013
                    WARNING: failed to online diskgroup resource ora.DATA.dg (unable to communicate with CRSD/OHASD)
                    WARNING: failed to online diskgroup resource ora.FRA.dg (unable to communicate with CRSD/OHASD)
                    WARNING: failed to online diskgroup resource ora.OCR1.dg (unable to communicate with CRSD/OHASD)
                    Mon Apr 01 14:33:28 2013
                    Starting background process ASMB
                    Mon Apr 01 14:33:28 2013
                    ASMB started with pid=24, OS id=1105
                    WARNING: failed to online diskgroup resource ora.OCR2.dg (unable to communicate with CRSD/OHASD)
                    WARNING: failed to online diskgroup resource ora.VOTE_DG.dg (unable to communicate with CRSD/OHASD)
                    Mon Apr 01 14:34:23 2013
                    Reconfiguration started (old inc 2, new inc 4)
                    List of instances:
                    1 2 (myinst: 1)
                    Global Resource Directory frozen
                    Communication channels reestablished
                    Master broadcasted resource hash value bitmaps
                    Non-local Process blocks cleaned out
                    Mon Apr 01 14:34:23 2013
                    LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
                    Set master node info
                    Submitted all remote-enqueue requests
                    Dwn-cvts replayed, VALBLKs dubious
                    All grantable enqueues granted
                    Submitted all GCS remote-cache requests
                    Fix write in gcs resources
                    Reconfiguration complete
                    Mon Apr 01 14:34:39 2013
                    ALTER SYSTEM SET local_listener='(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=node1-vip)(PORT=1521))))' SCOPE=MEMORY SID='+ASM1';
                    bash-3.00#
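
In a log this long, the entries most relevant to the question are the SUCCESS and WARNING lines. As a sketch (the function name mounted_diskgroups and its file-path argument are hypothetical), the names of the disk groups reported as mounted can be pulled out with sed:

```shell
# Hypothetical helper: print the name of every disk group that an ASM
# alert log reports as mounted. $1 is the path to the log (assumed).
mounted_diskgroups() {
    sed -n 's/.*SUCCESS: diskgroup \([A-Za-z0-9_]*\) was mounted.*/\1/p' "$1"
}
```

On the excerpt above this prints DATA, FRA, OCR1, OCR2 and VOTE_DG, one per line. Those entries are stamped Mon Apr 01, earlier than the CRS errors of Apr 4 and 5, so they do not show whether the disk groups mounted after the later restart.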

                    Regards,
                    Bob