Discussions
Categories
- 17.9K All Categories
- 3.4K Industry Applications
- 3.3K Intelligent Advisor
- 62 Insurance
- 536K On-Premises Infrastructure
- 138.2K Analytics Software
- 38.6K Application Development Software
- 5.7K Cloud Platform
- 109.4K Database Software
- 17.5K Enterprise Manager
- 8.8K Hardware
- 71.1K Infrastructure Software
- 105.2K Integration
- 41.5K Security Software
RMAN using archives for recovery when incremental level 1 is available

Dears,
Background:
We are trying to change our backup & restore strategy to use only FORCE LOGGED level 0 & level 1 (purge archive logs at all other times - the window outside of the level 0 & level 1 - which will take all non-logged archives) for DWH system which is hammered with non logged operation at all times. Our operations is aware of this and are OK with data loss (which can be reloaded). Need for such backup is, we want a DB with no non-logged corruption in case of a restore/recovery done.
DB version : 12.1.0.2
PSU level : APR 2017
Platform: Linux x86_64
Issue:
We are trying to certify our procedure to see if oracle is able to use FORCE LOGGED level 0 and level 1 successfully for the restore/recovery operation using a test restore/recovery of the whole database. But we ran into situation where 1 or few datafiles were recovered using archive files (we didnt purge archives yet, since we haven't certified the procedure) compared to incremental level 1.
Below is the command used for recovery (we didnt force 'recover database noredo' to apply incremental level 1 yet).
recover database DELETE ARCHIVELOG MAXSIZE 200G;
Out of 682 dfs, 669 are recovered using incremental level 1 backup. Rest of the 13 had archives to recover them (out of the 13, 6 are new dfs which are created after level 0 backup was kicked off, so we have no concerns for them using archives, we will find a way around for them in our procedure) - but the other 7 had some unique finding, please read the below table - notice the highlighted boxes... (coming from v$backup_datafile) - this is an example data.
Query used: [column heading may be misleading]
select *
from
(
select c.file#
,c.checkpoint_change# current_ckpt#
,c.checkpoint_time current_ckpt_time
,a.ckptscn level_0_ckpt#
,b.incscn level1_ckpt#
from v$datafile_header c
,(select file#,checkpoint_change# ckptscn from v$backup_datafile where incremental_level=0 and completion_time > to_date('27/02/2020 04:05:00')) a
,(select file#,incremental_change# incscn from v$backup_datafile where incremental_level=1 and completion_time > to_date('27/02/2020 04:05:00')) b
where b.file#=a.file#(+)
and c.file#=b.file#
)
where (level_0_ckpt# is NULL OR level_0_ckpt#!=level1_ckpt#)
order by 1;
Example Data (not actual one):
INCREMENTAL LEVEL 0 (only incremental level 0 before level 1) | INCREMENTAL LEVEL 1 (only incremental level 1 post level 0) | |||
DF# | INCREMENTAL SCN | CHECKPOINT CHANGE# | INCREMENTAL CHANGE# | CHECKPOINT CHANGE# |
1 | 0 | 10 | 10 | 20 |
2 | 0 | 11 | 11 | 21 |
3 | 0 | 13 | 15 | 23 |
4 | 0 | 10 | 12 | 24 |
5 | 0 | 11 | 11 | 25 |
Question : Is it possible to have such a change# jump for a df (same) which stayed across level 0 and level 1?
We observed this jump on all those 7 dfs, the incremental change# of level 1 > checkpoint change# of level 0!!!!
Read: https://docs.oracle.com/database/121/BRADV/rcmcncpt.htm#BRADV89498 ( About the Incremental Backup Algorithm)
Oracle in the doc says the below under Incremental start SCN
The incremental start SCN is most often the checkpoint SCN of the parent of the level 1 backup.
***most often - not 100% of the time!!!!!***
then in the next passage it says
If the backup is cumulative, then the incremental start SCN is the checkpoint SCN of the most recent level 0 backup.
Our case is cumulative for now, since we didnt have any level 1 between our level 0 or level 1. One level 0 and one level 1.
We assume at this stage, oracle is properly picking the archives as its way to move forward since there are chance we may miss few blocks in case it uses the incremental level 1 to recover the db.
So the reason for this discussion thread is
1) Is our understanding valid at this stage on
a) the way oracle's incremental backup is performed?
b) why oracle picked the archives over incremental level 1?
2) If our understanding is valid, why oracle had a jump between the lvl0.checkpoint_change# and lvl1.incremental_change#?
We need the above questions clarified to move forward with our desired backup/restore strategy.
Thank you!
Answers
-
Well, the simple answer is that, according to your given RECOVER statement, you don't specify a recovery point in time or scn. So of course the recovery will try to roll forward as far as possible. That will necessarily include all archivelogs that were created after the last available backup, and even into the current online redo logs.
-
If you backup an open database, you will always require incremental + archivelogs to perform a consistent recovery. You cannot use incremental level backups only to recover a database to a consistent state unless the backup was performed while the database was shutdown properly. RMAN will choose a recovery strategy depending on what data it has available and appears to be the most efficient for the desired recovery task.
What are you trying to accomplish, resp. what are you hoping to gain with your strange backup strategy?
-
Hi Dude,
The aim of the backup strategy is to offer a non-logged block corruption free db to the end users.
We aim to use incremental level 0 + incremental level 1 + archive generated during the incremental level 1 alone (we have set until time in our run script and the time is just the end of the incremental level 1).
The restore or recover command just works fine and it will allow db to be opened in resetlogs fine no issues (but there will non-logged block corruption).
But the concern is coming from usage of archivelogs when incremental level 1 is available, this means it is recovering the dfs using the archives generated between incremental level 0 and level 1 which we shouldnt, when we have an incremental level 1 available. Hope this clarifies.
In understanding that behavior we found what we explained in the issue description (relationship between incremental_change# of level 1 backup and checkpoint_change# of level 0 backup).
Thank you!
-
Hi Dude,
Forgot to address the gain you asked for...
We dont want to waste backup space on archives (assume we will have several TBs of archive per day generated) which will result in corrupted DB.
Hence we chose FORCE LOGGED level 0 and level 1 sticking to the archives needed only in this window (we have a backup script which handles this properly).
So you generate a lot of archive with force logging (5X more than no force logging in our case) during the level 0 and level 1 and purge the archives generated in rest all window (this compensates the space we need to store the force logged archives.
Hope it is clear now.
Please let us know in case you ever happened to notice the scenario we have explained. To us it looks incremental level 1 isnt backing up all modified blocks, if the situation explained holds true - it is a far more risky than anything else!!!!!!
Thanks
-
Dear EdStevens,
We didnt post the whole script here for recovery here. But we indeed included the time, you can see below..
run
{
allocate sbt channels ....
set until time "to_date('02-mar-2020 03:53:00','dd-mon-yyyy hh24:mi:ss')";
recover database DELETE ARCHIVELOG MAXSIZE 200G;
release channels...
}
Hope it is clear now.
Please let us know in case you ever happened to notice the scenario we have explained. To us it looks incremental level 1 isnt backing up all modified blocks, if the situation explained holds true - it is a far more risky than anything else!!!!!!
Thank you!
-
Sorry I find you last response confusing:
We dont want to waste backup space on archives (assume we will have several TBs of archive per day generated) which will result in corrupted DB.Hence we chose FORCE LOGGED level 0 and level 1 sticking to the archives needed only in this window (we have a backup script which handles this properly).
Why would TB's of archivelogs corrupt the database? Force logging mode is a requirement that belongs to Oracle DataGuard and ensures that a standby database remains consistent with the primary database. What does this have to do with RMAN?
The aim of the backup strategy is to offer a non-logged block corruption free db to the end users.We aim to use incremental level 0 + incremental level 1 + archive generated during the incremental level 1 alone (we have set until time in our run script and the time is just the end of the incremental level 1).
Are you backing up and recovering a standby database?
RMAN has no crystal ball to know what kind of recovery you desire. You can, however, delete all archivelogs prior to performing the backup, in which case you can limit your recovery strategy to the way you desire. RMAN will not backup data that doesn't exist.
Usually, if you need to save disk space, you would backup the "database plus archivelog delete input", which means archivelogs are automatically purged after backup. You can also only backup achivelogs that have not been backed up.
-
Hi Dude,
Thanks for your kind response!
We dont want to keep any archives which doesnt promise us an corruption free database, all these TB worth of archives were generated from non-logging operation (which is outside our backup window). This is the context of the statement. We secure a L0 and L1 with force logging enabled in the DB and we backup all the archives which were generated in the backup window and this works just fine - no issues. But the issue is the jump in incremental_change# for level 1 compared to checkpoint_change from level 0 - this is what we are trying to understand why and how?
We dont have any standby databases. Anyway we are looking at a corruption free database. We do a lot of sqlldr operation with nologging.
This is clear.
We are trying to save space on recovery area (disk) & backup system (tape) as well. Dont want to keep something which isnt useful.
Hope this clarifies. Intention of this post is to see why oracle had a jump between the lvl0.checkpoint_change# and lvl1.incremental_change#?
Thank you!
-
The following is from Oracle documentation:
During an incremental backup, RMAN reads the SCN of each data block in the input file and compares it to the checkpoint SCN of the parent incremental backup. If the SCN in the input data block is greater than or equal to the checkpoint SCN of the parent, then RMAN copies the block.
There are SCN and Checkpoing SCN. The SCN of a data block, from what I understand, is not necessarily in a sequential order and reflects the "current" SCN similar to the current database clock.
The RMAN recovery process will use incremental level 1 backups and archivelogs necessary to perform media recovery. This process will also rebuild the undo tablespace in order to rollback uncommitted transactions. As far as I'm aware, starting an RMAN backup will automatically archive the current redo log, which automatically performs a log switch and hence a checkpoint, provided you include the archivelogs in the backup.
If you use SQL Loader or import data without generating logs, then you can not rely on redo to recover your database. What happens if this occurs during a backup? I would have to give this some more thoughts, but right now I think your best option to guarantee a corruption free recovery might be to create the backup while the database is mounted and not open. You can use an RMAN incremental backup strategy, but you have to make sure no "nologging" happens during the backup.
Have you seen the following:
Best Practice 6: Plan for NOLOGGING Operations in Your Backup/Recovery Strategy
https://docs.oracle.com/database/121/VLDBG/GUID-42825ED1-C4C5-449B-870F-D2C8627CBF86.htm#VLDBG1578
-
Hi Dude,
I agree to the comment, this was one of the documents we referenced before devising this backup/recovery strategy.
It was difficult for us to find a backup window (outage) of 24 or so with our current backup capability (10g pipe - SBT) - best possible. There are other alternate options to perform backup (like storage level snapshot) to reduce the window of backup - but they are all expensive I believe and arent explored yet for regular backup purpose yet.
This discussions are going to happen in future in case we dont find the cause of the observation.
Sticking to the original post question to see if we can find out the cause:
Why oracle had a jump between the lvl0.checkpoint_change# and lvl1.incremental_change#?
I havent scrutinized the recovery this far until this backup strategy was arrived at and it ended up in corrupted datafiles when we opened the DB.
Thank you!
-
Are you assuming that the block SCN should be sequential and the gap is the reason for the corruption?
As I understand SCN is not necessarily sequential - it's an internal time stamp. I also don't see how L0 and L1 should be affected by nologging since it follows compares changes at the block level, unless you use block change tracking.
You cannot instruct RMAN to use L1 or archivelog for recovery. RAM will determine based on own decision what is available and what is the most economic strategy.