This discussion is archived
35 Replies Latest reply: Mar 12, 2012 8:02 AM by __OUTSIDER___

ASM HA

__OUTSIDER___ Newbie
Hi dear experts!

I have Grid Infrastructure 11gR2 installed as a 2-node RAC with ASM (OEL 5.7 UEK).
ASM is configured with 3 diskgroups (normal redundancy). The first diskgroup contains
5 virtual disks, and each disk is assigned to its own failure group. I also have two failure
groups, each of which has 3 virtual disks (Oracle VM 3.0.3). The virtual disks are connected to the server via NFS.
I am currently testing how ASM behaves when one of these failure groups goes down.
As my ASM disks are virtual and each failure group (3 virtual disks) is located on a separate machine,
I simulate a disk crash by taking down the network interface. After I kill the connection between
Linux and the storage (NFS), my DB (with 2 failure groups) crashes. When the connection
becomes available again, Linux and the ASM instance can work with the disks without a reboot, and
I start the DB manually (srvctl start database -d test).
My question is: how can I configure ASM (if Oracle supports this, of course)
so that the DB continues to work even after one of the disk groups becomes unavailable?
Can I set up something like ASM HA?

Any ideas are welcome.


./thanks
  • 1. Re: ASM HA
    BillyVerreynne Oracle ACE
    ASM being available depends on the underlying h/w, storage layer and the redundancy of the ASM diskgroups.

    There is no separate HA option for ASM.

    If the server h/w and o/s are up and running, ASM can run.

    If the underlying storage devices are available, the ASM diskgroups can be mounted.

    If a disk group is redundant, then a disk failure will not impact the availability of the diskgroup. The diskgroup will only be dismounted if you have failures in all failgroups of that diskgroup and ASM no longer has a single mirror set intact for the group.

    Losing a disk in a mirrored ASM diskgroup is pretty much a non-event. The database using that diskgroup will not even be aware of it.

    Redundancy should also exist in the storage layer. For example, 2 storage servers and a 2-way mirrored ASM diskgroup with a failgroup on each of these storage servers. Lose a storage layer/server, and there is still redundancy.
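
    As an illustration of such a layout, the sketch below prints the DDL for a normal redundancy diskgroup with one failgroup per storage server. The diskgroup name, the failgroup names and the disk paths are all hypothetical and chosen only for the example; the statement would be run as SYSASM on an ASM instance against unused candidate disks, after review.

```shell
#!/bin/sh
# Prints a CREATE DISKGROUP statement with two explicit failgroups,
# one per storage server. All names and paths below are hypothetical
# placeholders, not the configuration discussed in this thread.
print_diskgroup_ddl() {
  cat <<'EOF'
CREATE DISKGROUP demo NORMAL REDUNDANCY
  FAILGROUP storage1 DISK
    '/dev/oracleasm/disks/SRV1_DISK0',
    '/dev/oracleasm/disks/SRV1_DISK1',
    '/dev/oracleasm/disks/SRV1_DISK2'
  FAILGROUP storage2 DISK
    '/dev/oracleasm/disks/SRV2_DISK0',
    '/dev/oracleasm/disks/SRV2_DISK1',
    '/dev/oracleasm/disks/SRV2_DISK2';
EOF
}

# Print the statement so it can be reviewed before it is run in sqlplus.
print_diskgroup_ddl
```

    Every disk listed under a FAILGROUP clause is placed in that failgroup. A disk added without a FAILGROUP clause becomes its own failgroup by default.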
  • 2. Re: ASM HA
    __OUTSIDER___ Newbie
    Thanks, Billy, for the reply.
    So in my scenario I have 2 failgroups and my DB goes down when I lose one of the failgroups.
    I want to understand: is this a normal reaction of the Oracle DB?
  • 3. Re: ASM HA
    BillyVerreynne Oracle ACE
    __OUTSIDER___ wrote:

    So in my scenario I have 2 failgroups and my DB goes down when I lose one of the failgroups.
    I want to understand: is this a normal reaction of the Oracle DB?

    The database does not even know about the failure, as the diskgroup is still mounted: a full mirror copy/failgroup remains intact.

    I've lost disks a number of times. Whenever that is limited to a single failgroup in a 2-way mirror, the database is unaffected and keeps on running. In fact, no one will even notice the failure unless you have some kind of monitoring and notification tool to inform you of it.

    ASM will only dismount that diskgroup if both failgroups are impacted by disk failures, e.g. when you pull the Ethernet connection for Ethernet-based shared storage (e.g. iSCSI, netfiler or Openfiler). Then the database will go down and everyone (database users) will immediately know that something is wrong.
  • 4. Re: ASM HA
    __OUTSIDER___ Newbie
    Interestingly, my ASM instance keeps operating normally after the diskgroup crash, but the DB instances on both nodes fail.

    Here is the log from ASM:

    Fri Feb 24 09:34:35 2012
    NOTE: ASM client test_2:test disconnected unexpectedly.
    NOTE: check client alert log.
    NOTE: Process state recorded in trace file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_4533.trc
    Errors in file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_ora_29434.trc:
    ORA-17503: ksfdopn:DGOpenFile05 Failed to open file +TEST/test/spfiletest.ora
    ORA-17503: ksfdopn:2 Failed to open file +TEST/test/spfiletest.ora
    ORA-15001: diskgroup "TEST" does not exist or is not mounted
    ORA-15082: ASM failed to communicate with database instance
    ORA-15064: communication failure with ASM instance

    However:

    [grid@rac1 ~]$ srvctl status diskgroup -g test
    Disk Group test is running on rac1,rac2

    [grid@rac1 ~]$ srvctl status asm
    ASM is running on rac1,rac2

    [grid@rac1 ~]$ srvctl status database -d test
    Instance test_1 is not running on node rac1


    There is nothing in the DB alert log.
    Maybe I created the diskgroup incorrectly? I believe the problem is in the diskgroup, not in the
    database, because after the crash I can't run asmcmd or asmca.


    Edited by: __OUTSIDER___ on Feb 23, 2012 9:57 PM
  • 5. Re: ASM HA
    BillyVerreynne Oracle ACE
    It should not fail. There is something wrong with the config/setup, as the Grid s/w reports the diskgroup as mounted and the Oracle instance does not.

    E.g.
    - Disk Group test is running on rac1,rac2
    - ORA-15001: diskgroup "TEST" does not exist or is not mounted

    So is the diskgroup now mounted or not?

    Have a closer look at what ASM itself says. Use sqlplus and have a look at the virtual ASM views for the current status.

    E.g. the following shows the disks in a specific diskgroup and the availability and status of these disks.
    select 
      g.name, d.name, d.path, d.mount_status, d.mode_status, d.redundancy, d.failgroup 
    from v$asm_disk d, v$asm_diskgroup g 
    where d.group_number = g.group_number and g.name = 'TEST';
  • 6. Re: ASM HA
    __OUTSIDER___ Newbie
    Billy  Verreynne  wrote:
    E.g. the following shows the disks in a specific diskgroup and the availability and status of these disks.
    select 
    g.name, d.name, d.path, d.mount_status, d.mode_status, d.redundancy, d.failgroup 
    from v$asm_disk d, v$asm_diskgroup g 
    where d.group_number = g.group_number and g.name = 'TEST';
    Firstly, I want to thank you for your help and time.

    When I unplugged the network storage for one of the failure groups,
    I couldn't select from the DB because the DB crashed. After I started the instance, the output is:
    TEST DB

    TEST DG1_ASM1 /dev/oracleasm/disks/DG1_ASM1 CACHED ONLINE UNKNOWN CONTROLLER2
    TEST DG1_ASM0 /dev/oracleasm/disks/DG1_ASM0 CACHED ONLINE UNKNOWN CONTROLLER2
    TEST DG0_ASM2 /dev/oracleasm/disks/DG0_ASM2 CACHED ONLINE UNKNOWN CONTROLLER1
    TEST DG0_ASM1 /dev/oracleasm/disks/DG0_ASM1 CACHED ONLINE UNKNOWN DG0_ASM1
    TEST DG1_ASM2 /dev/oracleasm/disks/DG1_ASM2 CACHED ONLINE UNKNOWN DG1_ASM2
    TEST DG0_ASM0 /dev/oracleasm/disks/DG0_ASM0 CACHED ONLINE UNKNOWN DG0_ASM0
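
    A quick way to read output like this is to count the disks per failgroup (the last column). The following is a minimal sketch, assuming the rows are pasted exactly as shown above and that the failgroup is the last whitespace-separated field; the function name is made up for the example.

```shell
#!/bin/sh
# Count disks per failgroup from rows shaped like the query output above.
# Assumption: the failgroup is the last whitespace-separated field.
count_per_failgroup() {
  awk 'NF { n[$NF]++ } END { for (fg in n) print fg, n[fg] }' | sort
}

# Rows copied from the TEST output above.
count_per_failgroup <<'EOF'
TEST DG1_ASM1 /dev/oracleasm/disks/DG1_ASM1 CACHED ONLINE UNKNOWN CONTROLLER2
TEST DG1_ASM0 /dev/oracleasm/disks/DG1_ASM0 CACHED ONLINE UNKNOWN CONTROLLER2
TEST DG0_ASM2 /dev/oracleasm/disks/DG0_ASM2 CACHED ONLINE UNKNOWN CONTROLLER1
TEST DG0_ASM1 /dev/oracleasm/disks/DG0_ASM1 CACHED ONLINE UNKNOWN DG0_ASM1
TEST DG1_ASM2 /dev/oracleasm/disks/DG1_ASM2 CACHED ONLINE UNKNOWN DG1_ASM2
TEST DG0_ASM0 /dev/oracleasm/disks/DG0_ASM0 CACHED ONLINE UNKNOWN DG0_ASM0
EOF
```

    For these six rows the count shows five distinct failgroups (CONTROLLER2 with two disks; CONTROLLER1, DG0_ASM0, DG0_ASM1 and DG1_ASM2 with one disk each) rather than two failgroups of three disks.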
    When the network storage is unplugged (for 1 failure group):

    [grid@rac2 ~]$ srvctl status diskgroup -g test
    Disk Group test is running on rac1,rac2
    [grid@rac2 ~]$ srvctl status database -d test
    Instance test_1 is not running on node rac1
    Instance test_2 is not running on node rac2

    I also have one more DB besides the test DB (in case this is related):

    [grid@rac2 ~]$ srvctl status diskgroup -g odata
    Disk Group odata is running on rac1,rac2
    [grid@rac2 ~]$ srvctl status database -d cldb
    Instance CLDB_1 is running on node rac1
    Instance CLDB_2 is running on node rac2

    Yes, you are right, something is wrong here.
    Another strange behavior:

    When the network is unplugged, I can't connect to my second DB (CLDB) with SQL Developer
    for a short time. Only after one or two minutes can I reach this DB.

    This is the output of your SQL for the other DB.
    ODATA DB

    ODATA ODATA_0004 /dev/oracleasm/disks/ASMDISK4 CACHED ONLINE UNKNOWN ODATA_0004
    ODATA ODATA_0003 /dev/oracleasm/disks/ASMDISK3 CACHED ONLINE UNKNOWN ODATA_0003
    ODATA ODATA_0002 /dev/oracleasm/disks/ASMDISK2 CACHED ONLINE UNKNOWN ODATA_0002
    ODATA ODATA_0001 /dev/oracleasm/disks/ASMDISK1 CACHED ONLINE UNKNOWN ODATA_0001
    ODATA ODATA_0000 /dev/oracleasm/disks/ASMDISK0 CACHED ONLINE UNKNOWN ODATA_0000
    ./thanks
  • 7. Re: ASM HA
    916154 Newbie
    Do you use multipath?
    Why don't you use an External Redundancy disk group?
  • 8. Re: ASM HA
    __OUTSIDER___ Newbie
    Denis wrote:
    Do you use multipath?
    Why don't you use an External Redundancy disk group?

    No multipath. The redundancy level is NORMAL, and I don't understand why
    the v$asm_disk view shows that redundancy is UNKNOWN.
  • 9. Re: ASM HA
    916154 Newbie
    Hi!

    Because you select from the database.
    Try from ASM -> v$asm_disk
  • 10. Re: ASM HA
    __OUTSIDER___ Newbie
    Denis wrote:
    Hi!

    Because you select from the database.
    Try from ASM -> v$asm_disk

    Thanks Denis,

    please tell me how I can connect to the ASM instance from outside, not with sqlplus.
  • 11. Re: ASM HA
    __OUTSIDER___ Newbie
    Denis wrote:
    Hi!

    Because you select from the database.
    Try from ASM -> v$asm_disk

    I also tried .....

    [grid@rac1 ~]$ sqlplus / as sysasm

    SQL> select NAME,REDUNDANCY from v$asm_disk ;

    NAME REDUNDA
    ------------------------------ -------
    DG1_ASM2 UNKNOWN
    DG1_ASM1 UNKNOWN
    DG1_ASM0 UNKNOWN
    DG0_ASM2 UNKNOWN
    DG0_ASM1 UNKNOWN
    DG0_ASM0 UNKNOWN
    ODATA_0004 UNKNOWN
    ODATA_0003 UNKNOWN
    ODATA_0002 UNKNOWN
    ODATA_0001 UNKNOWN
    ODATA_0000 UNKNOWN

    11 rows selected.
  • 12. Re: ASM HA
    916154 Newbie
    :(:(:(:(:(:(:(

    You must connect to the ASM instance,
    not to the database instance.

    export ORACLE_SID=+ASM1
    export ORACLE_HOME=...
    [grid@rac1 ~]$ sqlplus / as sysdba
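
    Before querying the ASM views, it can help to confirm that the environment points at an ASM instance. A small sketch, assuming the ASM SID follows the usual +ASM<n> naming seen in this thread (a convention, not a guarantee); the function name is hypothetical.

```shell
#!/bin/sh
# Returns success when the given SID looks like an ASM SID (+ASM, +ASM1, ...).
# Assumption: ASM instances use the conventional "+ASM" prefix.
is_asm_sid() {
  case "$1" in
    +ASM*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Report what the current environment points at.
if is_asm_sid "${ORACLE_SID:-}"; then
  echo "ORACLE_SID=${ORACLE_SID} looks like an ASM instance"
else
  echo "ORACLE_SID=${ORACLE_SID:-<unset>} does not look like an ASM instance"
fi
```

    Once connected in SQL*Plus, `show parameter instance_type` is the more reliable check: it should report asm on an ASM instance and rdbms on a database instance.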
  • 13. Re: ASM HA
    916154 Newbie
    "please tell me how I can connect to the ASM instance from outside, not with sqlplus" -> that is a topic for another forum thread; try to resolve this one first.
  • 14. Re: ASM HA
    __OUTSIDER___ Newbie
    Denis wrote:
    :(:(:(:(:(:(:(

    You must connect to the ASM instance,
    not to the database instance.

    export ORACLE_SID=+ASM1
    export ORACLE_HOME=...
    [grid@rac1 ~]$ sqlplus / as sysdba

    Denis, please be considerate...

    [grid@rac2 ~]$ env | grep ORA
    ORACLE_SID=+ASM2
    ORACLE_HOSTNAME=rac2
    ORACLE_BASE=/u01/app/grid
    ORACLE_HOME=/u01/app/11.2.0/grid
    [grid@rac2 ~]$

    "I also tried .....

    [grid@rac1 ~]$ sqlplus / as sysasm"

    This means that I was connected to the ASM instance.
    You cannot connect to a database instance as sysasm......

    An example for you:

    [oracle@rac2 ~]$ env | grep ORA
    ORACLE_SID=test_2
    ORACLE_HOSTNAME=rac2
    ORACLE_BASE=/u01/app/oracle
    ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1
    [oracle@rac2 ~]$ sqlplus / as sysasm
    SQL*Plus: Release 11.2.0.1.0 Production on Fri Feb 24 16:19:11 2012
    Copyright (c) 1982, 2009, Oracle. All rights reserved.
    ERROR:
    ORA-01031: insufficient privileges
    Enter user-name: