When a ZS3-4 cluster takes over/fails over frequently within 1~2 days, what should be checked to fix the problem?
Hello Experts,
My customer has a ZS3-4 ZFSSA (active/active).
This ZS3-4 has had several problems.
1st. The 4x4 SAS HBA on node1 (PCIE-8 slot) went into a fault state.
Error message: The transmitting device sent an invalid request.
node2 took over from node1 at that time.
-> Replaced the HBA and manually marked it as repaired.
2nd. The 4x4 SAS HBA on node2 (PCIE-8 slot) went into a fault state, two months after the node1 SAS HBA fault.
Error message: The transmitting device sent an invalid request.
node1 took over from node2 at that time.
-> Replaced the HBA and manually marked it as repaired.
3rd. node1 panicked, and CPU0/P0 was in a fault state. ILOM detected the fault, but AKD did not detect it as a fault.