10 Replies Latest reply: Jan 15, 2013 12:34 AM by 984880 RSS

    showing swithover_status as UNRESOLVABLE GAP in 11.2.0.2

    Hungry DBA
      Hi,

      Having setup like 2 Node 11.2.0.2 RAC Primary with Single Node DR on Solaris SPARC 64 bit. Preffered Instance is on Node 1 hence all user gets connected to Node1, hence lot of transacations happens on Node 1. (After DR Configuration, Initial 20 days this setup was worked perfectly, and DataGuard was in Sync.)

      Now days,

      In v$archive_processes, I m seeing almost all processes STATE are BUSY and never comes to IDLE State even after killing them. I could see my archives getting transferred on DR Server with same size as on physical. But my Current log Group Sequence is far ahead than these log sequences. (Basically ora_arc process are locking these log sequences while transferring to DR Server, bcoz if I kill those arc processes then archives shifted to DR Server, would get automatically deleted)

      SQL> select * from v$archive_processes;

      PROCESS STATUS LOG_SEQUENCE STAT
      ---------- ---------- ------------ ----
      0 ACTIVE 0 IDLE
      1 ACTIVE 3211 BUSY
      2 ACTIVE 3139 IDLE
      3 ACTIVE 3209 IDLE
      4 ACTIVE 3211 IDLE
      5 ACTIVE 3205 BUSY
      6 ACTIVE 3209 BUSY
      7 ACTIVE 3206 BUSY
      8 ACTIVE 3208 BUSY
      9 ACTIVE 3211 IDLE
      10 ACTIVE 3207 BUSY

      PROCESS STATUS LOG_SEQUENCE STAT
      ---------- ---------- ------------ ----
      11 ACTIVE 3210 BUSY
      12 STOPPED 0 IDLE
      13 STOPPED 0 IDLE


      SQL> archive log list
      Database log mode Archive Mode
      Automatic archival Enabled
      Archive destination /Arch
      Oldest online log sequence 3611
      Next log sequence to archive 3615
      Current log sequence 3615


      SQL> select database_role,switchover_status from v$database;

      DATABASE_ROLE SWITCHOVER_STATUS
      ---------------- --------------------
      PRIMARY UNRESOLVABLE GAP (We used to see SESSIONS ACTIVE or TO STANDBY prior to 11.2)


      SQL> select PROCESS,STATUS,THREAD#,SEQUENCE# from v$managed_standby;

      PROCESS STATUS THREAD# SEQUENCE#
      --------- ------------ ---------- ----------
      ARCH CLOSING 1 3199
      ARCH OPENING 1 3139
      ARCH CONNECTED 0 0
      ARCH CONNECTED 0 0
      ARCH WRITING 1 3205
      ARCH WRITING 1 3209
      ARCH WRITING 1 3206
      ARCH WRITING 1 3208
      ARCH CONNECTED 0 0
      ARCH WRITING 1 3207
      LNS WRITING 1 3215

      PROCESS STATUS THREAD# SEQUENCE#
      --------- ------------ ---------- ----------
      ARCH WRITING 1 3210


      SQL> select distinct error from v$archive_Dest_status;

      ERROR
      -----------------------------------------------------------------


      (This doesn't show any error on both Nodes)


      SQL> show parameter log_archive

      NAME TYPE VALUE
      ------------------------------------ ----------- ------------------------------
      log_archive_dest_2 string SERVICE=TMSDB_DRSCAN LGWR ASYN
      C NOAFFIRM valid_for=(online_l
      ogfiles,primary_role) db_uniqu
      e_name=tmsdbdr

      log_archive_max_processes integer 12

      (Even tried to put log_archive_dest_2 ARCH ASYNC, with recover managed standby database disconnect from session, but wasnt successful)

      On DR Side
      =============
      SQL> alter database recover managed standby database using current logfile disconnect; (doing Recovery using SRL)

      SQL> select distinct recovery_mode from v$archive_Dest_status;

      RECOVERY_MODE
      -----------------------
      IDLE
      MANAGED REAL TIME APPLY


      SQL> select PROTECTION_MODE from v$database;

      PROTECTION_MODE
      --------------------
      MAXIMUM PERFORMANCE


      Am i really missing something, while Configuring DR on 11.2.0.2 Version ???


      Your responses would be appreciated to getting out of this issue...

      Regards,
      Manish
        • 1. Re: showing swithover_status as UNRESOLVABLE GAP in 11.2.0.2
          mseberg
          Hello;

          Oracle Doc E10820-02 has a small note on this but I find it vague at best. Something about a redo gap that cannot be automatically resolved, does this mean your standby is bad? Or does it mean you have to manually fix it (hopefully).

          Does it give an ORA-16724?

          If you are using Data Broker the StatusReport may provide more information.

          show database 'PRIMARY' 'InconsistentProperties';

          I wish you luck.

          Best Regards

          mseberg
          • 2. Re: showing swithover_status as UNRESOLVABLE GAP in 11.2.0.2
            856218
            Hi,

            Here I am not using DataGuard Broker, rather managing through SQL* Plus only. Not getting any ORA-16724 error. The fact is ora_arc processes are not releasing lock on log sequences due to which even transferred archives (showing same byte size as Primary, log_archive_dest_1) at DR giving error while recovering manually, like 'archive in progress'

            The defination given in Oracle Doc E10820-02 for UNRESOLVABLE GAP - Destination has a redo gap that cannot be automatically resolved by fetching the missing redo from this
            database and there are no other destinations from which redo can be fetched.

            The above statement is right, because if i kill those ora_arc processes and manually transfer the archives to DR Server with manual recover (recover standby database) then it works.

            is there any other option I could include in log_archive_dest_2 ??


            Regards,
            Manish
            • 3. Re: showing swithover_status as UNRESOLVABLE GAP in 11.2.0.2
              mseberg
              That helps, can you post your current log_archive_dest_2 setting?

              Best Regards

              mseberg
              • 4. Re: showing swithover_status as UNRESOLVABLE GAP in 11.2.0.2
                856218
                Hi,

                It's already mentioned in the 1st Post....

                Thanks for showing interest....

                Regards,
                Manish
                • 5. Re: showing swithover_status as UNRESOLVABLE GAP in 11.2.0.2
                  mseberg
                  I see it. The line wrap tricked me.

                  log_archive_dest_2 string SERVICE=TMSDB_DRSCAN LGWR ASYNC NOAFFIRM valid_for=(online_logfiles,primary_role) db_unique_name=tmsdbdr

                  log_archive_dest_2='SERVICE=STANDBY LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=STANDBY'

                  I compared it to my working copy and I don't think this is an issue. The missing single quote jumps out, but that is it.

                  I think UNRESOLVABLE GAP is a status new to 11.2.0.2

                  mseberg
                  • 6. Re: showing swithover_status as UNRESOLVABLE GAP in 11.2.0.2
                    Hungry DBA
                    Hi msberg,

                    It's RAC Dataguard with RAC Production. There is no any issues while transferring archives from 2nd node of the Production (i.e IDLE state in v$archive_processes for all log sequences, means ora_arc process is releasing lock quickly, once archives transferred to DR Server). But on 1st Node, whenever I check, every time 10 out of 12 ora_arc process seeing BUSY State for older archives too in v$archive_processes. That means ora_arc processes don't release lock quickly, I don't understandby why this is happening ???.

                    Killing ora_arc processes OR Deferring log_archive_Dest_state_2 brings ora_arc processes state from BUSY to IDLE, but doing this automatically remove the transferred archives at DR Server. (also checked for Network bandwidth which is sufficient)

                    any clue ????


                    Regards,
                    Manish
                    • 7. Re: showing swithover_status as UNRESOLVABLE GAP in 11.2.0.2
                      mseberg
                      Hello;

                      The main clue I have is I believe somebody else had the same issue here (without the UNRESOLVABLE GAP message) in the last 90 days here on OTN. I'm happy to search for it in the morning.

                      mseberg
                      • 8. Re: showing swithover_status as UNRESOLVABLE GAP in 11.2.0.2
                        Dr. Paranoid-Oracle
                        You are running into some kind of hanging at the Primary for those arch processes. Not sure why though. When you kill the ARCH process and the archive log you see on disk at the standby disappears that means that the ARCH was resolving a gap and when you killed the ARCH process at the standby the partner RFS process at the Standby saw the disconnect and deleted the partial archive log at the standby. Are you getting any other messages in the alert logs?

                        Larry
                        • 9. Re: showing swithover_status as UNRESOLVABLE GAP in 11.2.0.2
                          Hungry DBA
                          Thanks larry & mseberg for your support.

                          Issue got resolved now. Client had done some changes in the network setting due to which there was an issue with listener on 1st Node. Successfully restarting of Listener on Node1, solved the problem.

                          I used to check "select distinct error from v$archive_dest_status" for any issue related with archives shipping. Suprisingly this time I didn't get any error on both Nodes for the mentioned command & not in the alert log, which has taken significant amount of time in troubleshooting. Users also never complained about the connectivity as 11gR2 uses SCAN Listener for the Connection.


                          Anyway, thanks for the continous support once again.

                          Regards,
                          Manish
                          • 10. Re: showing swithover_status as UNRESOLVABLE GAP in 11.2.0.2
                            984880
                            We have similar issue where dataguard report "unresolvable gap" whereas there is no gap. If we switch log file on primay database, the broker status gets corrected automatically.
                            This happens when there is no load on database.
                            We are using two node RAC setup and same with standby database. Oracle version 11.2.0.3
                            Appreciate your help.

                            Error message:
                            DGMGRL> show configuration;

                            Configuration - ssc_dg_conf

                            Protection Mode: MaxPerformance
                            Databases:
                            SSC_PRI - Primary database
                            Error: ORA-16724: cannot resolve gap for one or more standby databases

                            SSC_SBY1 - (*) Physical standby database

                            Fast-Start Failover: ENABLED

                            Configuration Status:
                            ERROR

                            DGMGRL> exit

                            Regards,
                            Amit