OCFS2 - Global heartbeat - Timeouts
Hi,
In the context of OVM 3.4.6 / multipath iSCSI / OCFS2
We had storage array problem that lasted 20 minutes. The storage array restarted entirely.
That storage array hosts 2 iSCSI repositories and the iSCSI poolFs file system for 6 hypervisors.
We had the good surprise that all hypervisors waited (for 20 minutes !!!) and restored iSCSI connexion gracefully. Almost all VMs survived that long disk I/O pause.
I had the experience that a 2 minutes outage would make hypervisor to fence and reboot.
Where does such good resilience comes from ?
For what it worths, please find below our modifications done to iSCSI.conf and multipath.conf