I have set up a two-node ASM cluster that provides high-level redundancy for my OVM cluster by exporting ADVM volumes via iSCSI. There are 3 FGs set up, made of 3x3 shared LUNs.
This setup behaves the way I expect it to, but there's one thing that bothers me: asmResilver2.
I tried several failover scenarios, one of them being a node suddenly blanking out. The ADVM volume worked as expected, but when the 2nd node came back online, the ADVM mirror recovery kicked in, although all FGs had been online the whole time. This ADVM volume is 6 TB in size. I'd assume that the ASM nodes would need to get their caches straight, but the resilver has been going on for 32+ hours now, and I really can't imagine why that would be, as it was only the ASM node that got nuked.
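To put the 32+ hours into perspective, here is a rough back-of-the-envelope sketch. It assumes (and I don't know whether this is true) that the resilver has to walk the entire volume, and it treats the 6 TB as 6 TiB. The function name is just something I made up for the calculation.

```python
def full_scan_rate_mib_s(size_tib: float, hours: float) -> float:
    """Average rate in MiB/s needed to scan size_tib TiB in the given hours.

    Assumption: the whole volume is scanned once, with no skipped regions.
    """
    size_mib = size_tib * 1024 * 1024   # TiB -> MiB
    seconds = hours * 3600              # hours -> seconds
    return size_mib / seconds


if __name__ == "__main__":
    rate = full_scan_rate_mib_s(6, 32)
    print(f"{rate:.1f} MiB/s would finish a full 6 TiB scan in 32 hours")
```

Since the resilver is still running after 32 hours, the actual average rate would be below that figure if it really is a full scan.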
Can anybody shed some light on why the resilvering is taking so long, and whether there is any way to monitor what it is currently doing or how far along it is?
So, I have dug a bit deeper into this issue and found that asmResilver starts for the 6 TB ASM volume immediately upon logging in to it. I also have a 2nd, smaller 100G ASM volume that is shared the same way and does not show this issue.
Is there maybe a limit for this somewhere?