Not trying to beat a dead horse, but I have always had difficulty understanding failure groups, despite reading several books about them. I am sure there is something basic that I am missing, and I would appreciate your help.
My question is:
Fault tolerance with 2 failure groups vs. many failure groups
I understand that
- Oracle uses failure groups to choose where to place mirrored extents
- We should align failure groups with failure boundaries, i.e. disks sharing the same dependency/infrastructure belong in the same failure group
I would like to illustrate this with an example.
Suppose I have 4 NAS servers, each NAS server gives me 4 disks, and I want to create a normal redundancy disk group:
NAS 1 - Disk1, Disk2, Disk3, Disk4
NAS 2 - Disk5, Disk6, Disk7, Disk8
NAS 3 - Disk9, Disk10, Disk11, Disk12
NAS 4 - Disk13, Disk14, Disk15, Disk16
Scenario (A) - one failure group per NAS server; I believe this is the ideal scenario
FG 1 - Disk 1/2/3/4
FG 2 - Disk 5/6/7/8
FG 3 - Disk 9/10/11/12
FG 4 - Disk 13/14/15/16
Scenario (B) - I would like to create only 2 failure groups, each containing the disks from 2 NAS servers
FG 1 - Disk 1/2/3/4/5/6/7/8 -- disks from NAS 1 and NAS 2
FG 2 - Disk 9/10/11/12/13/14/15/16 -- disks from NAS 3 and NAS 4
With Scenario A, we can tolerate the loss of 1 failure group, which means the failure of 4 disks in the same failure group (1 NAS server).
With Scenario B, we can tolerate the loss of 1 failure group, which means the failure of 8 disks in the same failure group, so the failure of 2 NAS servers in the same failure group can be tolerated.
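To make the comparison above concrete, here is a minimal Python sketch that enumerates which NAS server failures each layout is guaranteed to survive. It rests on a simplifying assumption: a set of failed servers counts as survivable only if all of them belong to one failure group, since normal redundancy keeps the two copies of an extent in different failure groups. The names `SCENARIO_A`, `SCENARIO_B`, `guaranteed_survivable` and `survivable_sets` are illustrative only and are not part of any Oracle tool.

```python
from itertools import combinations

NAS_SERVERS = ["NAS1", "NAS2", "NAS3", "NAS4"]

# Failure-group layouts from the two scenarios (illustrative names).
SCENARIO_A = {"FG1": {"NAS1"}, "FG2": {"NAS2"}, "FG3": {"NAS3"}, "FG4": {"NAS4"}}
SCENARIO_B = {"FG1": {"NAS1", "NAS2"}, "FG2": {"NAS3", "NAS4"}}


def guaranteed_survivable(layout, failed):
    """Return True if all failed NAS servers sit in one failure group.

    Assumption: with normal redundancy, a mirror copy always lives in a
    different failure group, so losing servers from a single failure
    group never removes both copies of an extent.
    """
    failed = set(failed)
    return any(failed <= members for members in layout.values())


def survivable_sets(layout, k):
    """List every combination of k failed NAS servers that is guaranteed survivable."""
    return [combo for combo in combinations(NAS_SERVERS, k)
            if guaranteed_survivable(layout, combo)]


if __name__ == "__main__":
    for name, layout in (("A", SCENARIO_A), ("B", SCENARIO_B)):
        for k in (1, 2):
            total = len(list(combinations(NAS_SERVERS, k)))
            ok = survivable_sets(layout, k)
            print(f"Scenario {name}: {len(ok)} of {total} failures of {k} server(s) guaranteed survivable")
```

Under this model both layouts survive any single NAS server failure, and Scenario B additionally survives the two double failures that fall inside one failure group (NAS1+NAS2 or NAS3+NAS4). The sketch only looks at the moment of failure; it says nothing about whether redundancy can be restored afterwards, so it is not a complete comparison of the two layouts.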
If my assumption above is correct, then I am not sure why we should go beyond 2 failure groups, since with 2 failure groups we get more cover. I have also checked space availability in case we lose one disk (REQUIRED_MIRROR_FREE_MB / USABLE_FILE_MB), and it all looks good.
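For reference, the space check mentioned above can be sketched in Python. This assumes the documented relationship for V$ASM_DISKGROUP, where USABLE_FILE_MB is FREE_MB minus REQUIRED_MIRROR_FREE_MB, divided by the number of copies (2 for normal redundancy, 3 for high). The function name and the sample figures are hypothetical and are not taken from a real disk group.

```python
def usable_file_mb(free_mb, required_mirror_free_mb, redundancy="normal"):
    """Estimate USABLE_FILE_MB from the V$ASM_DISKGROUP columns.

    Assumption: normal redundancy keeps 2 copies of each extent and
    high redundancy keeps 3, so the remaining raw space is divided by
    the number of copies.
    """
    copies = {"normal": 2, "high": 3}[redundancy]
    return (free_mb - required_mirror_free_mb) / copies


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    print(usable_file_mb(free_mb=800_000, required_mirror_free_mb=200_000))
```

A negative result would mean there is not enough free space to restore redundancy after a failure, so a positive value is what "looks good" refers to here.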
Could you please tell me what the drawback of Scenario B is compared to Scenario A?
Thanks for your reply. I have the same impression, but I get confused when I look at the Exadata implementation. I was trying to understand the concept behind it.
In a full rack, Exadata has 14 storage cells, and each storage cell is placed in its own failure group. Again, I agree with this because of failure boundaries, but that way they can only tolerate the loss of one failure group = one cell.
Why did Oracle not go for 2 failure groups with 7 cells in each? They could then have tolerated the loss of up to 7 cells (in the same failure group).
That is the reason for the whole confusion.
Again, thanks for your reply