5 Replies Latest reply on Jun 7, 2011 7:34 AM by CKPT

    archive not applying

    EdStevens
      Setup: - Oracle 10.2.0.4 EE on HP-UX 11.23

      Two weeks ago we got our DG physical standby setup working, initial tests looked good, I could tail both primary and standby alert logs and see results of log switches, everything looked good for about a week. Then ..

      One morning I came in and logs weren't shipping. Checked the standby site and it had lost its mounts to the NAS. So far out setup was more proof of concept and I was getting ready to take a week off. So on the primary I set log arch dest 2 status to DEFER. In my absence our SA fixed the problem with mounting the fs and restarted the standby db (he's not a dba and just did a normal startup). I returned to the office this morning and find dest2 status back to ENABLE. But it appears that archivelogs are not applying.

      Prior to the crash, as redo was received at the standy, its alert would show
      Wed May 25 18:38:11 2011
      Primary database is in MAXIMUM PERFORMANCE mode
      RFS[39]: Successfully opened standby log 5: '/oradata/ora_redo/BOSTON/sbyredo05a.rdo'
      Wed May 25 18:38:17 2011
      Media Recovery Waiting for thread 1 sequence 3123 (in transit)
      Wed May 25 18:38:17 2011
      Recovery of Online Redo Log: Thread 1 Group 5 Seq 3123 Reading mem 0
        Mem# 0: /oradata/ora_redo/BOSTON/sbyredo05a.rdo
        Mem# 1: /archive/ora_redo/BOSTON/sbyredo05b.rdo
      After the standby was restarted, I don't see the above sequence, but instead get
      Fri Jun  3 07:27:09 2011
      Primary database is in MAXIMUM PERFORMANCE mode
      RFS[28]: Successfully opened standby log 4: '/oradata/ora_redo/BOSTON/sbyredo04a.rdo'
      Fri Jun  3 07:28:49 2011
      Primary database is in MAXIMUM PERFORMANCE mode
      RFS[28]: Successfully opened standby log 5: '/oradata/ora_redo/BOSTON/sbyredo05a.rdo'
      Fri Jun  3 07:31:15 2011
      Primary database is in MAXIMUM PERFORMANCE mode
      RFS[28]: Successfully opened standby log 4: '/oradata/ora_redo/BOSTON/sbyredo04a.rdo'
      Fri Jun  3 08:39:47 2011
      Primary database is in MAXIMUM PERFORMANCE mode
      RFS[28]: Successfully opened standby log 5: '/oradata/ora_redo/BOSTON/sbyredo05a.rdo'
      It appears that I have exceeded control_file_record_keep_time. This showed up in the standby alert
      Mon Jun  6 08:25:47 2011
      FAL[client]: Failed to request gap sequence 
       GAP - thread 1 sequence 3126-3224
       DBID 2542058214 branch 737468006
      FAL[client]: All defined FAL servers have been attempted.
      -------------------------------------------------------------
      Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
      parameter is defined to a value that is sufficiently large
      enough to maintain adequate log switch information to resolve
      archivelog gaps.
      -------------------------------------------------------------
      On primary:
      SQL> show parameter CONTROL_FILE_RECORD_KEEP_TIME
      
      NAME                                 TYPE        VALUE
      ------------------------------------ ----------- ------------------------------
      control_file_record_keep_time        integer     7
      SQL>
      I could take the "easy" way out and rebuild the standby from a fresh backup of primary, but before doing that would like to learn what I can from this, as I'm still pretty new to managaing a DB setup.
        • 1. Re: archive not applying
          BPeaslandDBA
          The message about the CONTROLFILE_RECORD_KEEP_TIME parameter is often a red herring. The real culprit is:

          GAP - thread 1 sequence 3126-3224

          You are missing some log files. Find the log files from thread 1 for those sequences and manually transmit them to your standby database. Then register each one with the following:

          ALTER DATABASE REGISTER LOGFILE '/dir/filename';

          Once you have registered all logfiles with the standby, life should resume as normal.

          HTH,
          Brian
          • 2. Re: archive not applying
            EdStevens
            BPeasland wrote:
            The message about the CONTROLFILE_RECORD_KEEP_TIME parameter is often a red herring. The real culprit is:

            GAP - thread 1 sequence 3126-3224

            You are missing some log files. Find the log files from thread 1 for those sequences and manually transmit them to your standby database. Then register each one with the following:

            ALTER DATABASE REGISTER LOGFILE '/dir/filename';

            Once you have registered all logfiles with the standby, life should resume as normal.

            HTH,
            Brian
            Brian,

            This just became a very interesting little project!

            The missing logs are no longer available on the primary, as they have already been backed up then deleted. And the backupset(s) they fell into have since been marked obsolete and deleted. I do have copies of those backupset on on tape, taken from a system level backup. I'm thinking I should be able to restore the backupsets, catalog them back to rman, restore the missing logs, then continue as you suggested. It's been a long day (unrelated issues) and this is not pressing, so I think I'll wait until tomorrow. In the mean time, do you see anything wrong with my plan? (And I will press again for more disk so as to extend my backup retention policy.)
            • 3. Re: archive not applying
              BPeaslandDBA
              That looks like a plan. If for some reason you cannot get the logs to resolve the archive gap, then you will probably have to recreate your standby database.
              Cheers,
              Brian
              • 4. Re: archive not applying
                Hemant K Chitale
                In my absence our SA fixed the problem with mounting the fs and restarted the standby db (he's not a dba and just did a normal startup).
                A "normal startup" would have caused an OPEN READ ONLY and Oracle wouldn't be applying logs at the standby.


                Hemant K Chitale
                • 5. Re: archive not applying
                  CKPT
                  Mon Jun 6 08:25:47 2011
                  FAL[client]: Failed to request gap sequence
                  GAP - thread 1 sequence 3126-3224
                  DBID 2542058214 branch 737468006
                  FAL[client]: All defined FAL servers have been attempted.
                  -------------------------------------------------------------
                  Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
                  parameter is defined to a value that is sufficiently large
                  enough to maintain adequate log switch information to resolve
                  archivelog gaps.
                  -------------------------------------------------------------
                  Actually these information from alert logfile regarding CONTROL_FILE_RECORD_KEEP_TIME it shows any Archive gap found on standby even if it set to *30* days. It is very generic information, Nothing to worry,

                  Fri Jun 3 07:27:09 2011
                  Primary database is in MAXIMUM PERFORMANCE mode
                  <RFS[28]: Successfully opened standby log 4: '/oradata/ora_redo/BOSTON/sbyredo04a.rdo'

                  In above information, Whenever a archive ship from primary to standby it will assigned to logfile on standby. This is also very normal behaviour.

                  I could take the "easy" way out and rebuild the standby from a fresh backup of primary, but before doing that would like to learn what I can from this, as I'm still
                  pretty new to managaing a DB setup.

                  According to your archive log gap GAP - thread 1 sequence 3126-3224 it is not an big, Why you want to rebuild.?

                  Check what is the errors on primary alert log file.

                  Post from PRIMARY:-
                  show parameter dest_state_2
                  select ds.dest_id id
                  , ad.status
                  , ds.database_mode db_mode
                  , ad.archiver type
                  , ds.recovery_mode
                  , ds.protection_mode
                  , ds.standby_logfile_count "SRLs"
                  , ds.standby_logfile_active active
                  , ds.archived_seq#
                  from v$archive_dest_status ds
                  , v$archive_dest ad
                  where ds.dest_id = ad.dest_id
                  and ad.status != 'INACTIVE'
                  order by
                  ds.dest_id
                  /
                  select error_code,timestamp, message from v$dataguard_status where dest_id=2;
                  Thanks.