You need at least two failure groups to create a normal redundancy disk group. If you do not specify failure groups during disk group creation, each disk is placed by default in its own failure group.
1) What are the advantages of having each disk in its own failure group, compared to having explicitly defined failure groups?
I am not sure there are any defined advantages; it is simply the default behavior. If each disk is capable of storing the entire data, creating many failure groups (each disk in its own) does provide more failure groups for the data to rebalance onto if one failure group is lost.
2) Suppose I have two failure groups (FAL1, FAL2) with two disks each. If one failure group (FAL1) crashes (both disks go offline), do I lose my data?
Data in ASM is mirrored at the extent level: if a primary extent is in failure group FAL1, its mirror will be in FAL2, and if another primary extent is in FAL2, its mirror will be in FAL1. So if your entire FAL1 crashes, you will not lose any data, because the mirror extents in FAL2 are sufficient for access. ASM will try to relocate the extents from FAL1 to another failure group if one exists; if not, the disk group operates with reduced redundancy (effectively unmirrored, like external redundancy) until FAL1 is restored.
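To make the extent-level mirroring concrete, here is a toy Python sketch. It is not how ASM is implemented internally; the function names and the alternating placement are made up purely for illustration of the two-failure-group case:

```python
# Toy model (not ASM internals): extents are mirrored across two
# hypothetical failure groups, FAL1 and FAL2. Losing all of FAL1
# still leaves one readable copy of every extent in FAL2.

def place_extents(n_extents):
    """Alternate primaries between FAL1 and FAL2; mirror on the other group."""
    placement = []
    for i in range(n_extents):
        primary = "FAL1" if i % 2 == 0 else "FAL2"
        mirror = "FAL2" if primary == "FAL1" else "FAL1"
        placement.append((primary, mirror))
    return placement

def readable_after_failure(placement, failed_group):
    """An extent survives if at least one copy is outside the failed group."""
    return all(primary != failed_group or mirror != failed_group
               for primary, mirror in placement)

extents = place_extents(100)
print(readable_after_failure(extents, "FAL1"))  # True: mirrors live in FAL2
```

Since the two copies of any extent are always in different failure groups, losing either entire group still leaves every extent readable.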
3) Also, per my current environment architecture (normal redundancy with every disk in its own failure group): if two disks go offline, do I lose my data, or does ASM stop working?
If you have 4 disks with each disk in its own failure group (say fal1, fal2, fal3, fal4) and two disks (fal1, fal2) are lost, there is a chance of losing data if any primary extent and its mirror happen to be located on fal1 and fal2.
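A toy simulation can show why a simultaneous two-disk loss is dangerous in this layout. This is an illustrative sketch with made-up names, not ASM's actual placement algorithm:

```python
import random

# Toy model (not ASM internals): with 4 one-disk failure groups,
# each extent's two copies land on two distinct disks. If two disks
# fail at once, any extent with both copies on exactly those two
# disks is lost.

def lost_extents(n_extents, disks, failed, seed=0):
    rng = random.Random(seed)
    lost = 0
    for _ in range(n_extents):
        primary, mirror = rng.sample(disks, 2)  # two distinct disks
        if primary in failed and mirror in failed:
            lost += 1
    return lost

disks = ["fal1", "fal2", "fal3", "fal4"]
# With random placement, roughly 1 in 6 extents has both copies on
# any given pair of disks (there are C(4,2) = 6 possible pairs).
print(lost_extents(10_000, disks, {"fal1", "fal2"}))
```

Losing a single disk, by contrast, never loses an extent here, because the two copies are always on distinct disks.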
1. Definition: "Failure groups define ASM disks that share a common potential failure mechanism. An example of a failure group is a set of SCSI disks sharing the same SCSI controller. Failure groups are used to determine which ASM disks to use for storing redundant copies of data." (https://docs.oracle.com/cd/B16351_01/doc/server.102/b14196/asm002.htm ). In the case of normal redundancy, data is distributed as described in https://uhesse.com/2015/01/15/brief-introduction-to-asm-mirroring/ .
2. No, you will not lose your data if your failure groups are defined properly, i.e. FAL1 and FAL2 do not share a device that can fail: controller, switch, etc.
3. As for your architecture: 3 disks, 3 failure groups, normal redundancy, i.e. each extent exists in 2 copies. If 1 disk/failure group fails, data from the failed disk will be redistributed between the 2 surviving disks. This takes some time, and during it you may have only 1 copy of some extents. If the disk containing such an extent also fails, you will lose data. If the second disk fails after the rebalance has completed, no data is lost.
Rebalance means that data from the failed disk is distributed between the 2 other disks. If there is not enough free space on the surviving disks, your ASM will stop working.
If 2 of the 3 disks fail at once, you will lose your data.
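The timing point can be sketched in a small toy model. The placement, the `rebalance` helper, and the disk names are all hypothetical, just to illustrate why a completed rebalance makes a second failure survivable:

```python
# Toy sketch (not ASM internals): 3 one-disk failure groups,
# normal (two-way) redundancy. Each extent is a (primary, mirror) pair.

def copies_per_extent(placement, alive):
    """Count surviving copies of each extent."""
    return [sum(1 for d in pair if d in alive) for pair in placement]

def rebalance(placement, alive):
    """Re-mirror any extent left with one copy onto another surviving disk."""
    fixed = []
    for primary, mirror in placement:
        if primary not in alive:
            primary = next(d for d in alive if d != mirror)
        if mirror not in alive:
            mirror = next(d for d in alive if d != primary)
        fixed.append((primary, mirror))
    return fixed

placement = [("disk1", "disk2"), ("disk2", "disk3"), ("disk3", "disk1")]
alive = {"disk2", "disk3"}                       # disk1 fails first
placement = rebalance(placement, alive)          # redundancy restored
alive = {"disk3"}                                # disk2 fails later
print(min(copies_per_extent(placement, alive)))  # 1: all extents still readable
```

Run the same scenario without the `rebalance` step (two disks gone at once) and the extent mirrored across the two failed disks drops to 0 surviving copies, which is exactly the data-loss case described above.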
Thanks for the reply @Igor @rchem
So keeping separate failure groups is a good choice, instead of each disk in its own failure group, right?
Our current architecture is +DATA with 7 disks of 1 TB each (normal redundancy), and the database size is 2.5 TB. We are now planning to move the database to another host, so we want to change the +DATA architecture. Can anyone please suggest the best approach to designing our +DATA disk group? I was most interested in keeping two separate failure groups (FAL1, FAL2) with 3 disks each, but our client SME is suggesting keeping each disk in its own failure group, i.e. the default normal redundancy.
Our main requirement is that under no circumstances should we lose our data.
Failure groups should be defined based on the disk controller.
How are your disks provided to ASM? Are they physical spindles, or LUNs provided from a SAN (FC, iSCSI, dNFS)? Do you have redundancy (multiple channels with multiple HBAs, switches) in the disk provisioning?
1) If you have 10 disks in a normal redundancy disk group and each disk is in its own failure group, a single disk will never hold both copies of an ASM extent, which avoids a single point of failure. Normally, all disks that have a tendency to fail together are kept in a single failure group.
2) If your disk group has NORMAL redundancy, ASM keeps one copy of each extent in one failure group and the other copy in the other. So if both disks of failure group FAL1 fail, you are safe, because the other copy of each extent is still available in FAL2. But if your disk group is defined with EXTERNAL redundancy, there is only one copy of each ASM extent (failure groups have no significance with EXTERNAL redundancy).
3) It depends. Suppose you have 10 disks with each disk in its own failure group; ASM keeps 2 copies of each extent on 2 different disks (failure groups). If 2 disks fail simultaneously, it is possible that some extents have both copies on the two failed disks, and you are in trouble in that case. The good news is that when one disk fails, a rebalance kicks in immediately, and ASM redistributes extents so that new mirrors are created for the copies lost in the failure, restoring the disk group's full redundancy. After the rebalance, if another disk fails, you are again safe because ASM has already restored redundancy, and after this second disk failure ASM will once again try to restore redundancy.
Our instances are Amazon EC2 instances, so it would probably be Provisioned IOPS SSD.
I have not worked with Amazon, but most likely the disks provisioned are LUNs from a SAN. If they are RAID 10 and provisioned with multipath, I would use external redundancy in such a case; each disk would then be put in its own failure group.
AWS does not give you the details of how the disks are provisioned underneath, but you can treat each EBS disk as a virtual LUN. Make sure you are using an EBS-optimized EC2 instance and at least 2000 PIOPS per EBS disk, more if you can afford it.
Yes, each EBS disk is a virtual LUN with 2000 IOPS, but it was RAID 0.
So here is my final conclusion based on this thread: I am choosing normal redundancy with two failure groups (FAL1, FAL2) of 3 disks each, with a disk size of 1 TB.
I hope this will be a fine architecture with no data loss. In case this is the wrong choice, please guide me.
There will be a performance penalty for mirroring the disks in ASM when they are already mirrored at the storage level.
RAID 0 is striping, so there is no redundancy and a potential loss of data.
I would create ASM with normal redundancy over the 6 disks.
Please find below additional information on striping and mirroring, including a diagram, that I hope can assist in a better understanding of ASM.
Read the full article here: Verifying I/O Activity Balance Across Disks in ASM
The goal of striping is to maximize storage subsystem throughput by balancing the I/O load across several disks, which results in better performance. I/O latency is also reduced, because balancing removes the bottleneck of any one specific disk.
Oracle ASM will stripe the data in small chunks of 128 KB for lower I/O latency for small I/O operations such as writing redo log entries to the redo log files (fine-grained striping). For data files, for example, Oracle ASM will stripe the data in bigger chunks that are equal to the Allocation Unit Size (coarse-grained striping).
Oracle ASM writes in a round-robin fashion across the disks in the disk group—this is why small disks will be filled up faster than large disks, which can cause more frequent rebalance operations. Therefore, a known best practice is to use disks with the same characteristics (such as identical size and performance).
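The point about mixed disk sizes can be illustrated with a toy allocator. This is not ASM's actual allocation algorithm (the capacities and the `round_robin_fill` helper are made up), but it shows why a small disk in a round-robin scheme reaches capacity first:

```python
# Toy illustration (not ASM's real allocator): round-robin allocation
# of equal-size allocation units (AUs) fills a small disk to capacity
# sooner than the large ones.

def round_robin_fill(capacities_au, n_units):
    """Allocate n_units AUs round-robin, skipping disks that are full."""
    used = [0] * len(capacities_au)
    i = 0
    for _ in range(n_units):
        # advance to the next disk that still has free space
        for _ in range(len(used)):
            if used[i] < capacities_au[i]:
                break
            i = (i + 1) % len(used)
        used[i] += 1
        i = (i + 1) % len(used)
    return used

# Hypothetical disk group: one 100-AU disk and three 400-AU disks.
# The small disk hits its 100-AU cap while the others absorb the rest.
print(round_robin_fill([100, 400, 400, 400], 500))
```

Once the small disk is full, further allocations are no longer spread evenly, which is the situation the same-size best practice avoids.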
Oracle ASM mirrors the data based on file extents. Each file extent contains one or more allocation units. Oracle ASM allows us to choose, for each Oracle ASM disk group, the desired redundancy level. We can choose to work with the following:
- Normal redundancy: Each file extent has one mirrored copy, i.e. two copies in total (also called two-way mirroring).
- High redundancy: Each file extent has two mirrored copies, i.e. three copies in total (also called three-way mirroring).
- External redundancy: No mirroring by Oracle ASM.
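The levels above can be condensed into a small lookup sketch (values taken from the list; the function name is made up for illustration):

```python
# Quick reference: total copies of each file extent per ASM redundancy
# level, and how many failure groups can be lost while a readable copy
# of every extent still remains.
COPIES = {"external": 1, "normal": 2, "high": 3}

def tolerable_failgroup_losses(redundancy):
    """Failure groups that can be lost with at least one copy surviving."""
    return COPIES[redundancy] - 1

print(tolerable_failgroup_losses("normal"))  # 1
```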
Moreover, in addition to the ability to configure the redundancy level per disk group, we can set the redundancy level for each specific file using Oracle ASM templates, which is another great advantage of Oracle ASM.
Oracle ASM mirrors copies of file extents in separate failure groups, so that if all the disks in a failure group are lost, our database will continue to function properly because there are copies of the file extents in the other failure group or groups. Obviously, if you choose to work with external redundancy, there will be no failure groups in the Oracle ASM disk group. The external redundancy option is commonly used in environments whose storage solutions already take care of the protection and distribution of data across multiple disks using RAID. The most common RAID options for Oracle Database are RAID 1+0 (also called RAID 10) and RAID 5; both provide data protection and striping across disks. In RAID 10, each separate set of disks (usually pairs) is mirrored individually, and striping occurs on top of the mirrored sets of disks. RAID 5 distributes the data blocks as well as the parity blocks across all disks; the parity block can be used upon a disk failure to calculate the missing data stripe. RAID 10 is considered the best RAID option for Oracle Database in terms of performance.
During disk addition ("ALTER DISKGROUP ... ADD" clause), disk removal ("ALTER DISKGROUP ... DROP" clause), and resize ("ALTER DISKGROUP ... RESIZE" clause) operations, Oracle ASM ensures that the file extents are redistributed equally across the disks in the Oracle ASM disk group, so storage remains balanced across the disk group.
Figure 1 contrasts external redundancy with no Oracle ASM mirroring, normal two-way mirroring provided by Oracle ASM, and high-redundancy three-way mirroring provided by Oracle ASM. Each square within the disks represents an Oracle ASM file extent.