35 Replies Latest reply: Mar 12, 2012 10:02 AM by __OUTSIDER___

    ASM HA

    __OUTSIDER___
      Hi dear experts!

      I have Grid Infrastructure 11gR2 installed as a 2-node RAC with ASM (OEL 5.7 UEK).
      ASM is configured with 3 diskgroups (normal redundancy). The first diskgroup contains
      5 virtual disks, and each disk is assigned to its own failure group. I also have two failure
      groups, each of which has 3 virtual disks (Oracle VM 3.0.3). The virtual disks are connected to the server via NFS.
      Currently I am testing how ASM behaves when one of these failure groups goes down.
      As my ASM disks are virtual and each failure group (3 virtual disks) is located on a separate machine,
      I simulate a disk crash by dropping the network interface. After I kill the connection between
      Linux and the storage (NFS), my DB (the one with 2 failure groups) crashes. When the connection
      becomes available again, Linux and the ASM instance can work with the disks without a reboot, and
      I start the DB manually *(srvctl start database -d test)*.
      My question is how I can configure ASM (if Oracle supports this, of course)
      so that the DB continues to work even after one of the disk groups becomes unavailable.
      Can I set up something like ASM HA?

      Any ideas are welcome.


      ./thanks
        • 1. Re: ASM HA
          Billy~Verreynne
          ASM being available depends on the underlying h/w, storage layer and the redundancy of the ASM diskgroups.

          There is no separate HA option for ASM.

          If the server h/w and o/s are up and running, ASM can run.

          If the underlying storage devices are available, the ASM diskgroups can be mounted.

          If a disk group is redundant, then a disk failure will not impact the availability of the diskgroup. The diskgroup will only be dismounted if you have failures in all failgroups of that diskgroup and ASM no longer has a single mirror set intact for the group.

          Losing a disk in a mirrored ASM diskgroup is pretty much a non-event. The database using that diskgroup will not even be aware of it.

          Redundancy should also exist in the storage layer. For example, 2 storage servers and a 2-way mirrored ASM diskgroup with a failgroup on each of these storage servers. Lose a storage layer/server, and there is still redundancy.
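
          To make that layout concrete, here is a minimal sketch of a 2-way mirrored diskgroup with one failgroup per storage server. The diskgroup name, the failgroup names and the disk paths are assumptions for illustration only, not taken from a real system:

          ```sql
          -- Run in the ASM instance (connected AS SYSASM).
          -- Hypothetical names: diskgroup DATA_MIRR, failgroups STORAGE_A / STORAGE_B,
          -- and the /dev/oracleasm/disks/... paths are placeholders.
          CREATE DISKGROUP data_mirr NORMAL REDUNDANCY
            FAILGROUP storage_a DISK
              '/dev/oracleasm/disks/SRV_A_DISK0',
              '/dev/oracleasm/disks/SRV_A_DISK1',
              '/dev/oracleasm/disks/SRV_A_DISK2'
            FAILGROUP storage_b DISK
              '/dev/oracleasm/disks/SRV_B_DISK0',
              '/dev/oracleasm/disks/SRV_B_DISK1',
              '/dev/oracleasm/disks/SRV_B_DISK2';
          ```

          Every disk served by the same storage server is listed under the same FAILGROUP clause. A disk added without a FAILGROUP clause ends up in a failgroup of its own, named after the disk.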
          • 2. Re: ASM HA
            __OUTSIDER___
            Thanks, Billy, for the reply.
            So in my scenario I have 2 failgroups, and my DB goes down when I lose one of them.
            I want to understand: is this a normal reaction of the Oracle DB?
            • 3. Re: ASM HA
              Billy~Verreynne
              __OUTSIDER___ wrote:

              So in my scenario I have 2 failgroups, and my DB goes down when I lose one of them.
              I want to understand: is this a normal reaction of the Oracle DB?
              The database does not even know about the failure, as the diskgroup is still mounted because a full mirror copy/failgroup remains intact.

              I've lost disks a number of times. And whenever that is limited to a single failgroup in a 2-way mirror, the database is unaffected and keeps on running. In fact, no one will even notice this failure unless you have some kind of monitoring and notification tool to inform you of the disk failure.

              ASM will only dismount that diskgroup if both failgroups are impacted by disk failures, e.g. when you pull the Ethernet connection for Ethernet-based shared storage (e.g. iSCSI or netfiler or Openfiler). Then the database will go down and everyone (the database users) will immediately know that something is wrong.
              • 4. Re: ASM HA
                __OUTSIDER___
                Interesting fact: my ASM instance is operating normally after the diskgroup crash, but the DB instances on both nodes fail.

                Here is the log from ASM:

                Fri Feb 24 09:34:35 2012
                NOTE: ASM client test_2:test disconnected unexpectedly.
                NOTE: check client alert log.
                NOTE: Process state recorded in trace file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_4533.trc
                Errors in file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_ora_29434.trc:
                ORA-17503: ksfdopn:DGOpenFile05 Failed to open file +TEST/test/spfiletest.ora
                ORA-17503: ksfdopn:2 Failed to open file +TEST/test/spfiletest.ora
                ORA-15001: diskgroup "TEST" does not exist or is not mounted
                ORA-15082: ASM failed to communicate with database instance
                ORA-15064: communication failure with ASM instance

                however

                *[grid@rac1 ~]$ srvctl status diskgroup -g test*
                Disk Group test is running on rac1,rac2

                *[grid@rac1 ~]$ srvctl status asm*
                ASM is running on rac1,rac2

                *[grid@rac1 ~]$ srvctl status database -d test*
                Instance test_1 is not running on node rac1


                There is nothing in the DB alert log.
                Maybe I created the diskgroup incorrectly? I believe the problem is in the diskgroup, not in
                the database, because after the crash I can't run the asmcmd or asmca commands.


                Edited by: __OUTSIDER___ on Feb 23, 2012 9:57 PM
                • 5. Re: ASM HA
                  Billy~Verreynne
                  Should not fail. And there's something wrong with the config/setup as the Grid s/w reports the diskgroup mounted and the Oracle instance does not.

                  E.g.
                  - Disk Group test is running on rac1,rac2
                  - ORA-15001: diskgroup "TEST" does not exist or is not mounted

                  So is the diskgroup now mounted or not?

                  Have a closer look at what ASM itself says. Use sqlplus and have a look at the virtual ASM views for the current status.

                  E.g. the following shows the disks in a specific diskgroup and the availability and status of these disks.
                  select 
                    g.name, d.name, d.path, d.mount_status, d.mode_status, d.redundancy, d.failgroup 
                  from v$asm_disk d, v$asm_diskgroup g 
                  where d.group_number = g.group_number and g.name = 'TEST';
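
                  As a variation on the query above, grouping by failgroup makes it easier to see whether the disks are spread over the failgroups the way they were intended to be. This is only a sketch; the column aliases disk_count and online_disks are illustrative names:

                  ```sql
                  -- Run in the ASM instance.
                  -- Counts the disks per failgroup in diskgroup TEST and how many
                  -- of them are currently ONLINE.
                  SELECT d.failgroup,
                         COUNT(*) AS disk_count,
                         SUM(CASE WHEN d.mode_status = 'ONLINE' THEN 1 ELSE 0 END) AS online_disks
                  FROM   v$asm_disk d
                         JOIN v$asm_diskgroup g ON d.group_number = g.group_number
                  WHERE  g.name = 'TEST'
                  GROUP  BY d.failgroup
                  ORDER  BY d.failgroup;
                  ```

                  For a diskgroup that is meant to have two failgroups of three disks each, one would expect exactly two rows, each with a disk count of 3.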
                  • 6. Re: ASM HA
                    __OUTSIDER___
                    Billy  Verreynne  wrote:
                    E.g. the following shows the disks in a specific diskgroup and the availability and status of these disks.
                    select 
                    g.name, d.name, d.path, d.mount_status, d.mode_status, d.redundancy, d.failgroup 
                    from v$asm_disk d, v$asm_diskgroup g 
                    where d.group_number = g.group_number and g.name = 'TEST';
                    First of all, thank you for your help and time.

                    When I unplugged the network storage for one of the failure groups,
                    I couldn't select from the DB because the DB had crashed. After I started the instance, the output is:
                    TEST DB

                    GROUP  DISK      PATH                           MOUNT_STATUS  MODE_STATUS  REDUNDANCY  FAILGROUP
                    TEST   DG1_ASM1  /dev/oracleasm/disks/DG1_ASM1  CACHED        ONLINE       UNKNOWN     CONTROLLER2
                    TEST   DG1_ASM0  /dev/oracleasm/disks/DG1_ASM0  CACHED        ONLINE       UNKNOWN     CONTROLLER2
                    TEST   DG0_ASM2  /dev/oracleasm/disks/DG0_ASM2  CACHED        ONLINE       UNKNOWN     CONTROLLER1
                    TEST   DG0_ASM1  /dev/oracleasm/disks/DG0_ASM1  CACHED        ONLINE       UNKNOWN     DG0_ASM1
                    TEST   DG1_ASM2  /dev/oracleasm/disks/DG1_ASM2  CACHED        ONLINE       UNKNOWN     DG1_ASM2
                    TEST   DG0_ASM0  /dev/oracleasm/disks/DG0_ASM0  CACHED        ONLINE       UNKNOWN     DG0_ASM0
                    When the network storage is unplugged (for 1 failure group):

                    *[grid@rac2 ~]$ srvctl status diskgroup -g test*
                    Disk Group test is running on rac1,rac2
                    *[grid@rac2 ~]$ srvctl status database -d test*
                    Instance test_1 is not running on node rac1
                    Instance test_2 is not running on node rac2

                    I also have 1 more DB besides the test DB (in case this is related):

                    *[grid@rac2 ~]$ srvctl status diskgroup -g odata*
                    Disk Group odata is running on rac1,rac2
                    *[grid@rac2 ~]$ srvctl status database -d cldb*
                    Instance CLDB_1 is running on node rac1
                    Instance CLDB_2 is running on node rac2

                    Yes, you are right, something is wrong here.
                    Another strange behavior:

                    When the network is unplugged, I can't connect to my second DB (CLDB) with sqldeveloper
                    for a short time. Only after one or two minutes can I reach this DB.

                    This is the output of your SQL for the other DB.
                    ODATA DB

                    GROUP  DISK        PATH                           MOUNT_STATUS  MODE_STATUS  REDUNDANCY  FAILGROUP
                    ODATA  ODATA_0004  /dev/oracleasm/disks/ASMDISK4  CACHED        ONLINE       UNKNOWN     ODATA_0004
                    ODATA  ODATA_0003  /dev/oracleasm/disks/ASMDISK3  CACHED        ONLINE       UNKNOWN     ODATA_0003
                    ODATA  ODATA_0002  /dev/oracleasm/disks/ASMDISK2  CACHED        ONLINE       UNKNOWN     ODATA_0002
                    ODATA  ODATA_0001  /dev/oracleasm/disks/ASMDISK1  CACHED        ONLINE       UNKNOWN     ODATA_0001
                    ODATA  ODATA_0000  /dev/oracleasm/disks/ASMDISK0  CACHED        ONLINE       UNKNOWN     ODATA_0000
                    ./thanks
                    • 7. Re: ASM HA
                      916154
                      Do you use multipath?
                      Why don't you use an External Redundancy disk group?
                      • 8. Re: ASM HA
                        __OUTSIDER___
                        Denis wrote:
                        Do you use multipath?
                        Why don't you use an External Redundancy disk group?
                        No multipath. The redundancy level is NORMAL, and I don't understand why
                        the v$asm_disk view shows the redundancy as UNKNOWN.
                        • 9. Re: ASM HA
                          916154
                          Hi!

                          Because you are selecting from the database instance.
                          Try it from the ASM instance -> v$asm_disk
                          • 10. Re: ASM HA
                            __OUTSIDER___
                            Denis wrote:
                            Hi!

                            Because you are selecting from the database instance.
                            Try it from the ASM instance -> v$asm_disk
                            Thanks Denis,

                            please tell me how I can connect to the ASM instance from outside, not with sqlplus.
                            • 11. Re: ASM HA
                              __OUTSIDER___
                              Denis wrote:
                              Hi!

                              Because you are selecting from the database instance.
                              Try it from the ASM instance -> v$asm_disk
                              I also tried .....

                              *[grid@rac1 ~]$ sqlplus / as sysasm*

                              SQL> select NAME,REDUNDANCY from v$asm_disk ;

                              NAME                           REDUNDA
                              ------------------------------ -------
                              DG1_ASM2                       UNKNOWN
                              DG1_ASM1                       UNKNOWN
                              DG1_ASM0                       UNKNOWN
                              DG0_ASM2                       UNKNOWN
                              DG0_ASM1                       UNKNOWN
                              DG0_ASM0                       UNKNOWN
                              ODATA_0004                     UNKNOWN
                              ODATA_0003                     UNKNOWN
                              ODATA_0002                     UNKNOWN
                              ODATA_0001                     UNKNOWN
                              ODATA_0000                     UNKNOWN

                              11 rows selected.
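
                              Note that the REDUNDANCY column of v$asm_disk describes the hardware redundancy of the disk itself, so UNKNOWN there is not unusual. The redundancy level of the diskgroup is reported by v$asm_diskgroup instead. A minimal sketch of a query to check it, using only the standard columns of that view:

                              ```sql
                              -- Run in the ASM instance.
                              -- TYPE is the diskgroup redundancy (EXTERN, NORMAL or HIGH);
                              -- STATE and OFFLINE_DISKS show whether the group is mounted
                              -- and how many of its disks are currently offline.
                              SELECT name, state, type, offline_disks
                              FROM   v$asm_diskgroup;
                              ```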
                              • 12. Re: ASM HA
                                916154
                                :(:(:(:(:(:(:(

                                You must connect to asm instance,
                                not to database instance.

                                export ORACLE_SID=+ASM1
                                export ORACLE_HOME=...
                                [grid@rac1 ~]$ sqlplus / as sysdba
                                • 13. Re: ASM HA
                                  916154
                                  "tell me please how can I connect to ASM instance from outside not with sqlplus" -> that is another topic for the forum; try to resolve this issue first.
                                  • 14. Re: ASM HA
                                    __OUTSIDER___
                                    Denis wrote:
                                    :(:(:(:(:(:(:(

                                    You must connect to asm instance,
                                    not to database instance.

                                    export ORACLE_SID=+ASM1
                                    export ORACLE_HOME=...
                                    [grid@rac1 ~]$ sqlplus / as sysdba
                                    Denis please be considerate...

                                    *[grid@rac2 ~]$ env | grep ORA*
                                    ORACLE_SID=+ASM2
                                    ORACLE_HOSTNAME=rac2
                                    ORACLE_BASE=/u01/app/grid
                                    ORACLE_HOME=/u01/app/11.2.0/grid
                                    [grid@rac2 ~]$

                                    "I also tried .....

                                    *[grid@rac1 ~]$ sqlplus / as sysasm*"

                                    This means that I was connected to the ASM instance.
                                    You cannot connect to a database instance as sysasm...

                                    An example for you:

                                    *[oracle@rac2 ~]$ env | grep ORA*
                                    ORACLE_SID=test_2
                                    ORACLE_HOSTNAME=rac2
                                    ORACLE_BASE=/u01/app/oracle
                                    ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1
                                    *[oracle@rac2 ~]$ sqlplus / as sysasm*
                                    SQL*Plus: Release 11.2.0.1.0 Production on Fri Feb 24 16:19:11 2012
                                    Copyright (c) 1982, 2009, Oracle. All rights reserved.
                                    ERROR:
                                    ORA-01031: insufficient privileges
                                    Enter user-name: