6 Replies Latest reply: Mar 6, 2014 5:27 AM by ASulthan

Voting Disk configuration using two different storage devices

user8916506 Newbie

I have configured a 2-node 11.2.0.4 RAC cluster using two different storage devices. For the OCR and voting files, I created a high-redundancy disk group, +CRS_DATA, with 5 disks: four on storage 1 and one on storage 2. I then removed the network cable from storage 2, and both RAC nodes became inoperative even though a majority of the voting disks was still available to both nodes. Even rebooting the nodes did not solve the problem until I plugged the network cable back into the second storage. From various articles on RAC, I understood that a node remains operational as long as a majority of the voting disks is visible to it, but that is not what my experiment showed.

Please let me know if I made a mistake in this experiment.

  • 1. Re: Voting Disk configuration using two different storage devices
    Tom321 Journeyer

    Hi,

     

    Theoretically, your experiment should have worked. Check how the OCR and voting files are distributed among the disks.

    As the grid owner:

    crsctl query css votedisk
    ocrcheck

     

    Next, get the allocation of disks and their failure groups in SQL*Plus on the ASM instance:

    set lines 300
    select group_number, disk_number, failgroup, substr(path,1,48) as "Path(48 Chars)", mount_status, header_status, mode_status, state, total_mb, free_mb
    from v$asm_disk
    order by group_number, disk_number;

     

    Besides this, check the clusterware logs: what errors do they show during the failed startup of the CRS stack after the node reboot?

     

    Regards

    Thomas

  • 2. Re: Voting Disk configuration using two different storage devices
    BeGin Pro

    Hello,

     

    This looks like an extended cluster configuration, with 2 nodes and 2 storage devices.

    Are you sure the cluster became unavailable because of the voting files? I saw exactly the same problem on this kind of configuration, where failure groups had not been defined for DATA, FRA, and so on. When the link between the storage devices was lost, the instances lost the data on the missing storage and shut down. Both nodes were unavailable, and the only way to get the whole system back was to restore the link.

     

    Do you have any error messages in the clusterware or database logs?

     

    Anyway, in an extended cluster the recommended voting disk configuration is to put the third voting disk on a remote share visible to both nodes. With this configuration, whichever storage you lose, you always have two voting disks online and the majority is preserved.
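The majority argument above can be checked with simple arithmetic. The following is only a sketch: `survives_storage_loss` is a hypothetical helper name, not an Oracle tool, and it models nothing but the voting-file count (it says nothing about whether ASM or the OCR stay available). It takes the number of voting files on each storage device and reports whether a strict majority remains if that device is lost.

```shell
#!/usr/bin/env bash
# Sketch only: survives_storage_loss is a hypothetical helper, not an Oracle tool.
# Arguments: the number of voting files on each storage device.
survives_storage_loss() {
  local total=0 n i=1
  for n in "$@"; do total=$((total + n)); done
  local needed=$((total / 2 + 1))   # strict majority of all voting files
  for n in "$@"; do
    local left=$((total - n))       # voting files still reachable without this device
    if [ "$left" -ge "$needed" ]; then
      echo "lose storage $i: $left of $total remain (need $needed) -> majority kept"
    else
      echo "lose storage $i: $left of $total remain (need $needed) -> majority lost"
    fi
    i=$((i + 1))
  done
}

survives_storage_loss 4 1     # layout from the original post: 4 disks + 1 disk
survives_storage_loss 1 1 1   # one voting file per location, third on a remote share
```

With the 4 + 1 layout, losing storage 1 leaves only 1 of 5 voting files, while the 1 + 1 + 1 layout keeps 2 of 3 whichever single location is lost.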

     

    Regards,

     

    --

    Bertrand

  • 3. Re: Voting Disk configuration using two different storage devices
    user2121 - -Oracle Pro

    Hi,

    Can you check what errors you got in $GRID_HOME/log/<nodename>/alert*

    and

    $GRID_HOME/log/<nodename>/cssd/ocssd.log during the problem timeframe?

    Paste the errors you see there; that will probably give a clearer picture of what happened.
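One way to pull the relevant lines out of a large log is to filter on Oracle message codes. This is a minimal sketch: `scan_cluster_log` is a hypothetical helper name, the code prefixes are simply the ones that appear in this thread, and the filter is mainly useful for the clusterware alert log (the CSSD trace file uses a different message style, so it may need a different pattern).

```shell
#!/usr/bin/env bash
# Sketch only: scan_cluster_log is a hypothetical helper, not an Oracle tool.
# Prints, with line numbers, every line of the given file that carries a
# message code such as CRS-4535, ORA-15077, PROC-26 or PROT-602.
scan_cluster_log() {
  grep -nE '(CRS|ORA|PROC|PROT)-[0-9]+' "$1"
}

# Usage (the path is a placeholder; substitute your own alert log):
#   scan_cluster_log /path/to/clusterware/alert.log
```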

  • 4. Re: Voting Disk configuration using two different storage devices
    user8916506 Newbie

    [root@rac1 ~]# crsctl query css votedisk
    ##  STATE    File Universal Id                File Name Disk group
    --  -----    -----------------                --------- ---------
    1. ONLINE   5207cec7f7e54f87bfadec5f7e11b208 (ORCL:CRS1VOL1) [CRS_DATA]
    2. ONLINE   5b97c328ed0b4f6ebf5845b3cfd766f7 (ORCL:CRS2VOL1) [CRS_DATA]
    3. ONLINE   55974b2dc82c4f6cbf50bba2654c9f81 (ORCL:CRS3VOL1) [CRS_DATA]
    4. ONLINE   18bb92e904844f93bfae074aa35f76c0 (ORCL:CRS4VOL1) [CRS_DATA]
    5. OFFLINE  332856ef94de4faabf26706d74e6d789 (ORCL:CRS1VOL2) [CRS_DATA]
    Located 5 voting disk(s).
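A listing like the one above can be checked mechanically against the majority rule. The sketch below assumes the row layout shown in this output (a numbered row whose second field is the state); `check_vote_majority` is a hypothetical helper name, not part of the Oracle tooling.

```shell
#!/usr/bin/env bash
# Sketch only: check_vote_majority is a hypothetical helper, not an Oracle tool.
# Reads `crsctl query css votedisk` output on stdin and reports whether the
# ONLINE voting files form a strict majority of all listed voting files.
check_vote_majority() {
  awk '
    # Data rows start with a number and a dot, e.g. "1. ONLINE <id> (<disk>) [<group>]"
    $1 ~ /^[0-9]+\.$/ { total++; if ($2 == "ONLINE") online++ }
    END {
      needed = int(total / 2) + 1   # strict majority
      printf "online=%d total=%d needed=%d ", online, total, needed
      if (total > 0 && online >= needed) { print "majority=yes"; exit 0 }
      else { print "majority=no"; exit 1 }
    }'
}

# Usage, as the grid owner:
#   crsctl query css votedisk | check_vote_majority
```

For the output above (4 of 5 voting files ONLINE) the check reports a majority, which suggests looking beyond the voting files for the cause of the outage.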

     

     

    [root@rac1 ~]# ocrcheck
    PROT-602: Failed to retrieve data from the cluster registry
    PROC-26: Error while accessing the physical storage
    ORA-15077: could not locate ASM instance serving a required diskgroup

     

    The last voting disk, which is on storage 2, is offline.

    I am not using a fast recovery area, and for the data disk group I have two failure groups, one on each storage.

    I would also like to mention that when I removed the network cable from storage 1, both RAC nodes rebooted; that was not the case when I removed the network cable from storage 2, which hosts a single voting disk.

    The result of crsctl check cluster -all after removing the cable from storage 1 was:

     

    [oracle@rac2 ~]$ crsctl check cluster -all
    **************************************************************
    rac1:
    CRS-4535: Cannot communicate with Cluster Ready Services
    CRS-4529: Cluster Synchronization Services is online
    CRS-4533: Event Manager is online
    **************************************************************
    rac2:
    CRS-4535: Cannot communicate with Cluster Ready Services
    CRS-4529: Cluster Synchronization Services is online
    CRS-4533: Event Manager is online
    **************************************************************

  • 5. Re: Voting Disk configuration using two different storage devices
    Rahul_gupta Newbie

    Hi,

     

     

    From what I can see, the problem is not with the voting disks. As you say, the nodes are not rebooted when you remove the network cable from the second storage, which is what would happen in the case of a missing disk heartbeat (when more than half of the voting disks are not accessible to a node). That is exactly what happens, however, when you remove the cable from both of the storage devices.

    Secondly, the error you are getting occurs because the OCR is not accessible, and the reason for that is that your ASM instance is down:

    "ORA-15077: could not locate ASM instance serving a required diskgroup"

    So I suggest you check the ASM instance alert log to understand why the instance goes down when you remove the network cable from the second storage.

     

     

    Regards

    Rahul Gupta

  • 6. Re: Voting Disk configuration using two different storage devices
    ASulthan Journeyer

    Hi,

    If the OCR is located in ASM, or has different permissions or ownership, srvctl will fail to start earlier-version databases.

    Restart GI and check the resource status:

     

    $GRID_HOME/bin/crsctl stat res
