7 Replies Latest reply: Jun 5, 2013 3:58 PM by Daryl E.

    Backup status monitoring

    906815
      EM-GC 11.1 on Linux
      Database 11.2 on Solaris

      Hello,

      It might be a long shot, but I hope somebody could suggest a solution:

      I have created a "Backups status" rule in EM GC 11.1, based on a UDM (user-defined metric) in the repository.

      (User-Defined String Metric     All Objects (Metric ID;Key)     Critical )

      Backup status     STRING     
           
           MATCH     
           FAILED

      The UDM is created for the EM repository and queries

      mgmt$ha_backup

      as

      select DATABASE_NAME, status from mgmt$ha_backup where status='FAILED';

      ---------------------------------------------------------------------

      For one particular database, we receive an email notification every day saying that the backup has FAILED.

      In reality, the backup status in EM is COMPLETED, and in the repository:

      SQL> select DATABASE_NAME, status from mgmt$ha_backup where status='FAILED';

      no rows selected
      ------

      (It looks to me as though the status in the repository is FAILED at some point during the backup, and that this is what triggers the notification.)
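
      One way to test that suspicion might be to look at the row for the affected database without the status filter, so that whatever status the repository currently holds is visible. This is only a sketch using the two columns from the query above; 'MYDB' is a placeholder, not a real database name:

      ```sql
      -- Run in the EM repository.
      -- 'MYDB' is a hypothetical name; substitute the affected database.
      select DATABASE_NAME, status
      from mgmt$ha_backup
      where DATABASE_NAME = 'MYDB';
      ```

      Running this at intervals while the backup is in progress should show whether the status ever changes to FAILED.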



      I'd appreciate any suggestions on this matter.

      Thx,
        • 1. Re: Backup status monitoring
          EricvdS
          In mgmt$ha_backup you will only see one entry per database. So if you have a backup job that consists of two RMAN jobs, it may be that, in your case, the last one succeeds and the first one fails. Check the backup history to see all entries and find out which one fails.
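
          If SQL access to the target database is available, the backup history can also be checked there. The following is a rough sketch, not the EM report itself: it assumes the standard V$RMAN_BACKUP_JOB_DETAILS view, and the 7-day window is an arbitrary choice.

          ```sql
          -- Run on the target database, not in the EM repository.
          -- Lists backup jobs started in the last 7 days whose status is anything
          -- other than COMPLETED (for example FAILED or COMPLETED WITH WARNINGS).
          select session_key, input_type, status,
                 to_char(start_time, 'YYYY-MM-DD HH24:MI:SS') started,
                 to_char(end_time,   'YYYY-MM-DD HH24:MI:SS') ended
          from V$RMAN_BACKUP_JOB_DETAILS
          where start_time >= sysdate - 7
            and status <> 'COMPLETED'
          order by start_time;
          ```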

          Eric

          Edited by: EricvdS on 14-Mar-2013 21:34
          • 2. Re: Backup status monitoring
            906815
            Eric,
            I was hoping you'd reply.
            Thx,

            ----------------------------------------------
            select operation, status, start_time, end_time from V$RMAN_STATUS
            where start_time >= trunc(sysdate - 1)
            ;
            
            OPERATION                         STATUS                  START_TIM END_TIME
            --------------------------------- ----------------------- --------- ---------
            BACKUP                            COMPLETED               13-MAR-13 13-MAR-13
            BACKUP                            COMPLETED               13-MAR-13 13-MAR-13
            RMAN                              COMPLETED               13-MAR-13 13-MAR-13
            BACKUP                            COMPLETED               13-MAR-13 14-MAR-13
            REPORT                            COMPLETED               13-MAR-13 13-MAR-13
            DELETE                            COMPLETED               13-MAR-13 13-MAR-13
            DELETE                            COMPLETED WITH WARNINGS 13-MAR-13 13-MAR-13
            DELETE                            COMPLETED               13-MAR-13 13-MAR-13
            DELETE                            COMPLETED               14-MAR-13 14-MAR-13
            REPORT                            COMPLETED               13-MAR-13 13-MAR-13
            RMAN                              COMPLETED               14-MAR-13 14-MAR-13
            BACKUP                            COMPLETED               13-MAR-13 13-MAR-13
            DELETE                            COMPLETED WITH WARNINGS 13-MAR-13 13-MAR-13
            DELETE                            COMPLETED WITH WARNINGS 14-MAR-13 14-MAR-13
            REPORT                            COMPLETED               14-MAR-13 14-MAR-13
            DELETE                            COMPLETED WITH WARNINGS 14-MAR-13 14-MAR-13
            REPORT                            COMPLETED               13-MAR-13 13-MAR-13
            REPORT                            COMPLETED               14-MAR-13 14-MAR-13
            BACKUP                            COMPLETED               13-MAR-13 13-MAR-13
            DELETE                            COMPLETED               14-MAR-13 14-MAR-13
            DELETE                            COMPLETED WITH WARNINGS 13-MAR-13 13-MAR-13
            RMAN                              COMPLETED               13-MAR-13 14-MAR-13
            BACKUP                            COMPLETED               14-MAR-13 14-MAR-13
            BACKUP                            COMPLETED               14-MAR-13 14-MAR-13
            REPORT                            COMPLETED               13-MAR-13 13-MAR-13
            REPORT                            COMPLETED               14-MAR-13 14-MAR-13
            Not a trace of failed operations?
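
            The default date format above hides the time of day, and the clean rows make the output hard to scan. A possible variant, offered as a sketch only, prints full timestamps and keeps just the rows that did not end cleanly:

            ```sql
            -- Same time window as the query above, run on the target database.
            -- Rows still in a RUNNING state will also appear, because only the
            -- two clean end states are filtered out.
            select operation, status,
                   to_char(start_time, 'YYYY-MM-DD HH24:MI:SS') started,
                   to_char(end_time,   'YYYY-MM-DD HH24:MI:SS') ended
            from V$RMAN_STATUS
            where start_time >= trunc(sysdate - 1)
              and status not in ('COMPLETED', 'COMPLETED WITH WARNINGS')
            order by start_time;
            ```

            If this returns no rows after the backup window has closed, RMAN on the target did not record a failed operation in that period.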


            Edited by: 903812 on Mar 14, 2013 1:46 PM

            Edited by: 903812 on Mar 14, 2013 1:48 PM
            • 3. Re: Backup status monitoring
              EricvdS
              Strange.
              And if you check the status history using the Database home page, the tab Availability, (Manage) Backup Reports?

              Eric
              • 4. Re: Backup status monitoring
                906815
                In EM, the backup status (as every morning) is:
                COMPLETED
                ---------------------------

                Yesterday I changed this setting in the UDM:

                Consecutive Occurrences Preceding Notification     8 (was 5).

                There were no FAILED notifications last night.

                I followed your hint that something in the middle of the backup gets a FAILED status that triggers the alarm,

                perhaps for fewer than 8 occurrences.

                Will leave it like this for the weekend.

                Thx Eric,
                • 5. Re: Backup status monitoring
                  906815
                  An update:

                  Still receiving "FAILED" backup notifications.

                  Changed the setting:

                  Consecutive Occurrences Preceding Notification 10 (was 8).

                  Changed the job schedule from 30min to 60min.
                  ---------------------------------------------------------------------

                  BTW, is there another way to monitor the backup status and generate alerts?

                  Thx,
                  • 6. Re: Backup status monitoring
                    SAML.
                    I set up an RMAN catalog in the OEM database and run a report from the catalog.

                    SAM L.
                    • 7. Re: Backup status monitoring
                      Daryl E.
                      Thought I would add my UDMs here for others.
                      We use 2 UDMs: one for any backup at all, and one for FULL backups (anything that is not an archivelog backup).
                      This catches the case where backups are not firing at all. There is no point showing that the last backup was COMPLETED if it was weeks ago.

                      select nvl( min(trunc((sysdate-end_time)*24)),999) hrs from V$RMAN_BACKUP_JOB_DETAILS where status = 'COMPLETED' and end_time > sysdate-15

                      select nvl( min(trunc((sysdate-end_time)*24)),999) hrs from V$RMAN_BACKUP_JOB_DETAILS where status = 'COMPLETED' and end_time > sysdate-15 and input_type != 'ARCHIVELOG'

                      Then we trigger on the number of hours, so perhaps > 48 hours without a non-archivelog backup might be a good threshold. It's up to you.
                      In theory you don't need to worry about every full being successful. Of course there are repercussions, but when you have 300 databases you can't jump on every little glitch.
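
                      For choosing thresholds, it may help to see the same hours-since-last-backup figure broken down by backup type. This is a sketch derived from the two queries above, not one of the UDMs themselves:

                      ```sql
                      -- Hours since the most recent COMPLETED backup, per input_type,
                      -- looking back 15 days as in the UDM queries above.
                      -- A type with no completed backup in that window returns no row
                      -- at all, unlike the nvl(..., 999) fallback in the single-value UDMs.
                      select input_type,
                             min(trunc((sysdate - end_time) * 24)) hrs
                      from V$RMAN_BACKUP_JOB_DETAILS
                      where status = 'COMPLETED'
                        and end_time > sysdate - 15
                      group by input_type
                      order by input_type;
                      ```

                      Note that status = 'COMPLETED' does not match jobs that ended as COMPLETED WITH WARNINGS, so whether those should count as good backups is a decision to make per site.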

                      Hope it helps others.

                      Daryl.