This discussion is archived
7 Replies Latest reply: Jun 23, 2011 10:03 AM by Levi-Pereira

11gR2 RAC - Voting disk mount_status = closed

Alessandro Zenoni Newbie
Hi all,
I have a two-node configuration with 11.2.0.1 installed.

Node 1: HPUX + Storage EVA C7000
Node 2: HPUX + Storage EVA C3000

For a high-availability test, we tried to disconnect one EVA.

When we unplugged the EVA C3000, the cluster remained online.
When we unplugged the EVA C7000, the cluster sent a shutdown command to both nodes.

I checked dozens of logs, then I checked v$asm_disk.

In particular, I found that some voting disks were in mount_status = CLOSED (they seem to be voting disks because their size was 1024 KB).

So, I have 2 voting disks in the CRS diskgroup that are correctly online (I suppose these are the C7000 disks), and 2 voting disks that are ONLINE, MEMBER but CLOSED, as shown in the query.

MS HS MOS STATE PATH
------- ------------ ------- -------- -------------------
CLOSED MEMBER ONLINE NORMAL /dev/rdisk/disk208
CLOSED MEMBER ONLINE NORMAL /dev/rdisk/disk193

MS HS MOS STATE PATH
------- ------------ ------- -------- -------------------
CACHED MEMBER ONLINE NORMAL /dev/rdisk/disk188
CACHED MEMBER ONLINE NORMAL /dev/rdisk/disk206

How can I bring the CLOSED disks to CACHED as well?
With asmca I can't find any feature to do it.

Thanks in advance.
Best Regards.

Alessandro
  • 1. Re: 11gR2 RAC - Voting disk mount_status = closed
    Levi-Pereira Guru
    Hi,

    In particular, I found that some voting disks were in mount_status = CLOSED (they seem to be voting disks because their size was 1024 KB).
    So, I have 2 voting disks in the CRS diskgroup that are correctly online (I suppose these are the C7000 disks), and 2 voting disks that are ONLINE, MEMBER but CLOSED, as shown in the query.
    If I understand right, you have configured an even number of voting disks... right?

    You should configure an odd number of voting disks.

    If you lose 1/2 or more of all of your voting disks, then nodes get evicted from the cluster, or nodes kick themselves out of the cluster. This does not threaten database corruption. Alternatively, you can use external redundancy, which means you provide redundancy at the storage level using RAID.

    For this reason, when using Oracle for the redundancy of your voting disks, Oracle has recommended using 3 or more voting disks since Oracle RAC 10g Release 2. Note: for best availability, the 3 voting files should be on physically separate disks. An odd number is recommended because 4 disks are no more highly available than 3: half of 3 is 1.5, rounded up to 2, and half of 4 is 2, so once we lose 2 disks the cluster fails whether we have 3 voting disks or 4.
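    The majority arithmetic above can be sketched as a tiny check (a hedged illustration of the rule described here, not Oracle code; the function name is my own):

    ```python
    # Illustration only: a node must see a strict majority of voting
    # disks (more than half) to stay in the cluster.

    def cluster_survives(total_disks, lost_disks):
        """True if the surviving disks still form a strict majority."""
        surviving = total_disks - lost_disks
        return surviving > total_disks / 2

    # 3 voting disks: losing 1 is fine, losing 2 is fatal.
    print(cluster_survives(3, 1))  # True
    print(cluster_survives(3, 2))  # False

    # 4 voting disks tolerate no more failures than 3 do:
    print(cluster_survives(4, 1))  # True
    print(cluster_survives(4, 2))  # False: exactly half left, not a majority
    ```

    This is why 4 voting disks buy you nothing over 3: both configurations die after 2 losses.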


    Oracle Cluster Registry and Voting Files in Oracle ASM Disk Groups
    For Oracle Clusterware files, a minimum of three disk devices or three failure groups is required with a normal redundancy disk group. A QUORUM failure group is not considered when determining redundancy requirements with respect to storing user data.

    http://download.oracle.com/docs/cd/E11882_01/server.112/e16102/asmdiskgrps.htm#OSTMG10204

    Storing Voting Disks on Oracle ASM

    Normal redundancy: a disk group with normal redundancy stores three voting disks.
    If voting disks are stored on Oracle ASM with normal or high redundancy, and the storage hardware in one failure group suffers a failure, then if there is another disk available in a disk group in an unaffected failure group, Oracle ASM recovers the voting disk in the unaffected failure group.

    http://download.oracle.com/docs/cd/E11882_01/rac.112/e16794/votocr.htm#CWADD91889

    Please post the output of:
    $ crsctl query css votedisk
    Regards,
    Levi Pereira
  • 2. Re: 11gR2 RAC - Voting disk mount_status = closed
    585179 Expert
    Hi,


    The status CLOSED means the disk is detected at the OS level but is not being accessed by ASM. You need to make sure that ASM can access those disks, then see if the status changes.


    Cheers
  • 3. Re: 11gR2 RAC - Voting disk mount_status = closed
    Alessandro Zenoni Newbie
    Hi,
    no, I wrote it wrong: I have 3 voting disks, but 2 are on the EVA C7000 and 1 is on the EVA C3000.
    When the C7000 went down, only 1 voting disk remained and the cluster crashed.

    Output of crsctl query css votedisk:

    ## STATE File Universal Id File Name Disk group
    -- ----- ----------------- --------- ---------
    1. ONLINE 1e4262c9e2f04fc6bff5e27c0439fa18 (/dev/rdisk/disk188) [CRS]
    2. ONLINE 2394257a565c4f92bf09cccb1a02a484 (/dev/rdisk/disk206) [CRS]
    3. ONLINE 1fb54d7acc3b4ff9bfc8eb54ae339723 (/dev/rdisk/disk193) [CRS]

    I need to add a fourth voting disk so I'll have 2 voting disks on each storage array.

    Best regards
  • 4. Re: 11gR2 RAC - Voting disk mount_status = closed
    Levi-Pereira Guru
    Hi,
    I have 3 voting disks, but 2 are on the EVA C7000 and 1 is on the EVA C3000.
    When the C7000 went down, only 1 voting disk remained and the cluster crashed.
    I need to add a fourth voting disk so I'll have 2 voting disks on each storage array.
    It will not work.
    Doing this will worsen the situation.

    Look:

    If you lose 1/2 or more of all of your voting disks, then nodes get evicted from the cluster.
    Now you have:
    Voting disks:
    1. /dev/rdisk/disk188 - EVA C3000
    2. /dev/rdisk/disk206 - EVA C7000
    3. /dev/rdisk/disk193 - EVA C7000


    If you lose the EVA C3000, Clusterware remains online because more than 1/2 of all voting disks are still online.
    If you lose the EVA C7000, nodes get evicted from the cluster, because you have lost more than 1/2.

    If you add

    4. /dev/rdisk/disk208 - EVA C3000

    You will have an even number of voting disks; if you lose either of the storage arrays, your nodes will get evicted.

    If you lose the EVA C3000, only 1/2 of all voting disks will remain (on the EVA C7000), so nodes will get evicted from the cluster.
    If you lose the EVA C7000, only 1/2 of all voting disks will remain (on the EVA C3000), so nodes will get evicted from the cluster.


    Solution:

    Add a third voting disk at a third site, hosting the quorum (voting) file at a location separate from the two main storage arrays.
    http://download.oracle.com/docs/cd/B28359_01/server.111/b28282/configbp005.htm#CHDGAHAD


    1. /dev/rdisk/disk188 - EVA C3000
    2. /dev/rdisk/disk206 - EVA C7000
    3. /nfs_mount/voting_disk/vote_node1 - Third Location on NFS ( for example)

    Use this document to accomplish this:
    Section: Adding a 3rd Voting File on NFS to a Cluster using Oracle ASM
    http://www.oracle.com/technetwork/database/clusterware/overview/grid-infra-thirdvoteonnfs-131158.pdf
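
    The two placements discussed in this reply can be checked with a small sketch (my own illustration, not Oracle code; the disk paths follow the thread, the helper name is mine):

    ```python
    # Illustration: given a mapping of voting disks to locations, check
    # whether the cluster keeps a strict majority when one location fails.

    def survives_loss(placement, failed_location):
        """True if disks outside failed_location still form a strict majority."""
        surviving = sum(1 for loc in placement.values() if loc != failed_location)
        return surviving > len(placement) / 2

    # The proposed 2 + 2 layout: losing either array leaves exactly half
    # of the disks, which is not a majority, so the cluster fails either way.
    four_disks = {
        "/dev/rdisk/disk188": "EVA C3000",
        "/dev/rdisk/disk208": "EVA C3000",
        "/dev/rdisk/disk206": "EVA C7000",
        "/dev/rdisk/disk193": "EVA C7000",
    }
    print(survives_loss(four_disks, "EVA C3000"))  # False
    print(survives_loss(four_disks, "EVA C7000"))  # False

    # The 1 + 1 + 1 layout with a third (e.g. NFS) location: losing any
    # single location still leaves 2 of 3 disks, a majority.
    third_site = {
        "/dev/rdisk/disk188": "EVA C3000",
        "/dev/rdisk/disk206": "EVA C7000",
        "/nfs_mount/voting_disk/vote_node1": "NFS",
    }
    print(survives_loss(third_site, "EVA C3000"))  # True
    print(survives_loss(third_site, "EVA C7000"))  # True
    ```

    The sketch makes the asymmetry explicit: only the three-location layout survives the loss of any one site.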

    Regards,
    Levi Pereira
  • 5. Re: 11gR2 RAC - Voting disk mount_status = closed
    870555 Newbie
    Hi Levi,
    thank you very much.
    I'll follow your suggestion.

    Another request:
    I investigated the voting disk problem because the first time the cluster went down and then came back online, I found something in the ocssd.log file that I had never seen before:

    2011-05-19 19:26:43.383: [    CSSD][29]clssnmSendingThread: sending status msg to all nodes
    2011-05-19 19:26:43.383: [    CSSD][29]clssnmSendingThread: sent 5 status msgs to all nodes
    2011-05-19 19:26:43.978: [    CSSD][27]clssgmTagize: version(1), type(3), tagizer(9fffffffff3d26c8)
    2011-05-19 19:26:43.978: [    CSSD][27]clssgmHandleMasterMemberAdd: [s(2) d(1)]
    2011-05-19 19:26:43.978: [    CSSD][27]clssgmGrockOpTagProcess: clssgmCommonAddMember failed, member(-1/CLSN.RLB.AVS[3]) on node(2)
    2011-05-19 19:26:43.978: [    CSSD][27]clssgmGrockOpTagProcess: Operation(3) unsuccessful grock(CLSN.RLB.AVS[3])
    2011-05-19 19:26:43.978: [    CSSD][27]clssgmHandleMasterJoin: clssgmProcessJoinUpdate failed with status(-10)
    2011-05-19 19:26:44.017: [    CSSD][27]clssgmTagize: version(1), type(3), tagizer(9fffffffff3d26c8)
    2011-05-19 19:26:44.017: [    CSSD][27]clssgmHandleMasterMemberAdd: [s(2) d(1)]
    2011-05-19 19:26:44.017: [    CSSD][27]clssgmGrockOpTagProcess: clssgmCommonAddMember failed, member(-1/CLSN.RLB.DIAPPROD[3]) on node(2)
    2011-05-19 19:26:44.017: [    CSSD][27]clssgmGrockOpTagProcess: Operation(3) unsuccessful grock(CLSN.RLB.DIAPPROD[3])
    2011-05-19 19:26:44.017: [    CSSD][27]clssgmHandleMasterJoin: clssgmProcessJoinUpdate failed with status(-10)

    What is a grock? And what does it mean?

    Thanks in advance.

    Best regards.
    Alessandro
  • 6. Re: 11gR2 RAC - Voting disk mount_status = closed
    Alessandro Zenoni Newbie
    Sorry,
    I replied to your message with another account, but it's still me.

    Bye
    Alessandro
  • 7. Re: 11gR2 RAC - Voting disk mount_status = closed
    Levi-Pereira Guru
    Hi,

    Oracle Clusterware evicts a node from the cluster when:
    1. The node is not pinging via the network heartbeat
    2. The node is not pinging the voting disk
    3. The node is hung/busy and is unable to perform either of the earlier tasks
    In most cases when a node is evicted, there is information written to the logs to analyze the cause of the eviction.
    What is a grock? And what does it mean?
    I don't know what a grock is, but I believe it is a function that checks communication between nodes.
    These functions are not documented, so you will need to contact Oracle Support to find out.

    Regards,
    Levi Pereira
