4 Replies Latest reply: Mar 28, 2013 11:19 PM by 907779

Modify timeout for voting files

Christian Newbie
Hey, is there a way to modify the default timeout for the voting files? The idea: I have a (soft) grid storage, so I configured external redundancy for the OCR disk group where my voting file is located.

When one of the mirrored storage units is switched off, the system hangs for nearly 2 minutes. The voting timeout is set to 99 seconds, right? That should be the default setting, at least according to the CRS alert log.

Is it possible to modify this value?

At the moment, the database gets shut down after the timeout expires.

Christian
  • 1. Re: Modify timeout for voting files
    Levi-Pereira Guru
    Hi,
    I recommend you read the note *CSS Timeout Computation in Oracle Clusterware [ID 294430.1]* on MOS.

    This note will help you:

    - Define the misscount parameter
    - Define the default calculations for the misscount parameter
    - Describe Cluster Synchronization Service (CSS) heartbeats and their interrelationship
    - Describe the cases where the default calculation may be too sensitive

    CSS Timeout Computation in Oracle Clusterware
    The CSS misscount parameter represents the maximum time, in seconds, that a network heartbeat can be missed before entering into a cluster reconfiguration to evict the node.

    Regards,
    Levi Pereira
  • 2. Re: Modify timeout for voting files
    Christian Newbie
    Hey Levi,

    I need the timeout parameter for the I/O timeouts.

    See my logfile:
    [cssd(11473)]CRS-1615:No I/O has completed after 50% of the maximum interval. Voting file ORCL:VOTE1 will be considered not functional in 99760 milliseconds
    2011-12-20 20:57:04.614
    [cssd(11473)]CRS-1614:No I/O has completed after 75% of the maximum interval. Voting file ORCL:VOTE1 will be considered not functional in 49760 milliseconds
    2011-12-20 20:57:34.610
    [cssd(11473)]CRS-1613:No I/O has completed after 90% of the maximum interval. Voting file ORCL:VOTE1 will be considered not functional in 19760 milliseconds
    2011-12-20 20:57:41.140
    [cssd(11473)]CRS-1649:An I/O error occured for voting file: ORCL:VOTE1; details at (:CSSNM00059:) in /crs/log/host1/cssd/ocssd.log.
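
The countdown values in these messages allow a rough check of the configured maximum interval. Below is a minimal Python sketch, assuming that the percentage in each message is the elapsed share of the interval and that the millisecond figure is the time remaining. The function and variable names are illustrative and not part of any Oracle tool.

```python
import re

# Matches CRS-1615/1614/1613 style warnings: elapsed percentage and remaining milliseconds.
PATTERN = re.compile(
    r"after (\d+)% of the maximum interval\..*?not functional in (\d+) milliseconds"
)

def estimate_max_interval_ms(line):
    """Return the estimated full I/O timeout in ms, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    elapsed_pct = int(match.group(1))
    remaining_ms = int(match.group(2))
    # Assumption: remaining = total * (1 - elapsed share), so total = remaining / (1 - elapsed share).
    return remaining_ms * 100 / (100 - elapsed_pct)

# Message texts copied from the alert log excerpt above.
log_lines = [
    "CRS-1615:No I/O has completed after 50% of the maximum interval. "
    "Voting file ORCL:VOTE1 will be considered not functional in 99760 milliseconds",
    "CRS-1614:No I/O has completed after 75% of the maximum interval. "
    "Voting file ORCL:VOTE1 will be considered not functional in 49760 milliseconds",
    "CRS-1613:No I/O has completed after 90% of the maximum interval. "
    "Voting file ORCL:VOTE1 will be considered not functional in 19760 milliseconds",
]

estimates = [estimate_max_interval_ms(line) for line in log_lines]
for estimate in estimates:
    print(f"estimated maximum interval: {estimate / 1000:.1f} s")
```

Under that assumption all three messages point to a full interval of roughly 200 seconds, which suggests that the 99 seconds in the first warning is the time remaining at the halfway mark rather than the complete voting file I/O timeout.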
  • 3. Re: Modify timeout for voting files
    Levi-Pereira Guru
    Hi,
    This note also covers the I/O timeout, but I believe that is not your problem.

    See:
    The synchronization services component (CSS) of the Oracle Clusterware maintains two heartbeat mechanisms 1.) the disk heartbeat to the voting device and 2.) the network heartbeat across the interconnect which establish and confirm valid node membership in the cluster. Both of these heartbeat mechanisms have an associated timeout value. The disk heartbeat has an internal i/o timeout interval (DTO Disk TimeOut), in seconds, where an i/o to the voting disk must complete. The misscount parameter (MC), as stated above, is the maximum time, in seconds, that a network heartbeat can be missed. The disk heartbeat i/o timeout interval is directly related to the misscount parameter setting.

    Modifying the default value of misscount not only influences the timeout interval for the i/o to the voting disk, but also influences the tolerance for missed network heartbeats across the interconnect.

    Misscount should NOT be modified to work around the issues listed below:
    - QLogic HBA cards with a Link Down Timeout greater than the default misscount
    - Bad cables to the SAN/storage array that affect I/O latencies
    - SAN switch (like Brocade) failover latency greater than the default misscount
    - EMC Clariion array when trespassing the SP to the backup SP takes longer than the default misscount
    - EMC PowerPath path error detection and I/O repost and redirect taking longer than the default misscount
    - Poor SAN network configuration that creates latencies in the I/O path
    "So I configured external redundancy for the OCR disk group where my voting file is located. When one of the mirrored storage units is switched off, the system hangs for nearly 2 minutes."
    As you are using external redundancy, Oracle does not know that there is a mirrored disk behind it.
    Perhaps the OS or the storage is holding I/O when you stop the mirroring because of a misconfiguration. I believe this problem is related to the OS or the storage, not to Oracle Clusterware.
    If you perform this test with a disk group (external redundancy) that stores data, you will get the same result.


    Regards,
    Levi Pereira
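
Before deciding whether any timeout needs attention, it helps to read the values that are currently in effect. Below is a minimal Python sketch that wraps the read-only commands `crsctl get css misscount` and `crsctl get css disktimeout`. The helper names are hypothetical, the sample output line is an assumption about the message wording (which may differ between releases), and the script is assumed to run as a user with the Grid Infrastructure environment set.

```python
import re
import subprocess

def parse_css_value(output, parameter):
    """Extract the integer that follows the parameter name in crsctl output."""
    match = re.search(rf"\b{re.escape(parameter)}\s+(\d+)", output)
    return int(match.group(1)) if match else None

def get_css_value(parameter, crsctl="crsctl"):
    """Run 'crsctl get css <parameter>' (read-only) and return the value in seconds."""
    result = subprocess.run(
        [crsctl, "get", "css", parameter],
        capture_output=True, text=True, check=True,
    )
    return parse_css_value(result.stdout, parameter)

# Assumed sample output; the exact wording may differ between releases.
sample = "CRS-4678: Successful get disktimeout 200 for Cluster Synchronization Services."
print(parse_css_value(sample, "disktimeout"))
```

On a cluster node, `get_css_value("misscount")` and `get_css_value("disktimeout")` would return the network heartbeat and voting disk I/O timeouts respectively. The sketch only reads the settings; it does not change them.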
  • 4. Re: Modify timeout for voting files
    907779 Newbie
    Hi Christian,

    Did you solve your problem? We have the same issue on our 4-node RAC while doing a failover in the SAN virtualisation appliance.

    Greetings from Tyrol
    Stefan
