Recovery terminates on standby database after a PITR of a PDB on Primary Database in Oracle 12c

Posted by unknown-1040115 on Mar 3 2016, edited Jun 23 2016

by Shivananda Rao

This article is for DBAs who run Oracle Database 12c with a Pluggable Database (PDB) plugged into a Container Database (CDB). It shows how to recover from a scenario where managed recovery (MRP) terminates on the physical standby database after a Point-In-Time Recovery (PITR) of a PDB is performed on the primary database. This is the approach I typically use to overcome the issue.

As of Oracle 12c, Point In Time Recovery (PITR) is possible at Container Database (CDB) level and at Pluggable Database (PDB) level making things simpler. This article discusses how the physical standby database is affected after a PITR of a Pluggable Database is performed on the Primary database. A PITR is an incomplete recovery process; since the recovery is performed until a specific time at the Primary, there is data loss.

Oracle ASM has some great capabilities and advantages such as ease of management, high performance, and low overhead. It also solves many storage challenges, because it allows removing/adding disks online and it will automatically rebalance the data across the disks to avoid a bottleneck on a specific disk; thus, it provides the best performance.

The PITR requires opening the Primary database with the RESETLOGS option and starting a new incarnation of the database.

In 12c, the same concept applies to a PITR of a single PDB as well as to a PITR of the whole CDB. But suppose we have a physical standby database configured for the primary database, with the same PDBs plugged into it. What exactly happens to the physical standby database when a PITR is performed on a PDB of the primary database?

In pre-12c versions, before the introduction of the Multitenant architecture, a PITR of the primary affected the entire database. The standby then had to be flashed back to match the primary (if possible), or re-created if flashback was not enabled and the standby had already applied redo past the SCN of the PITR.

In 12c, though, the standby database is built at the CDB level, meaning that redo generation at the primary and redo apply at the standby cover all PDBs in the CDB. As a result, a PITR of a PDB on the primary database terminates the Managed Recovery Process (MRP) on the physical standby database, and the errors ORA-39874 and ORA-39873 are reported in the alert log of the standby database.

So, how do you overcome such a scenario? I've outlined the steps to get the physical standby in sync with its primary database after a PITR was performed.

Environment

The environment comprises a 2-node RAC primary database (SRPRIM) with PDB1 plugged into it and a corresponding 2-node RAC physical standby database (srpstb) with the same PDB1 plugged in. Data Guard Broker is not configured for this setup, although the same procedure works when the Broker is configured. A Point-In-Time Recovery (PITR) has already been performed on PDB1 at the primary site; that procedure is not covered in this article. Refer to the Database Backup and Recovery User's Guide for details.

From the alert log file of the standby database (the instance on which managed recovery process was running), the following error message is reported indicating the termination of MRP and the recovery of the problematic PDB with the corresponding checkpoint SCN.

On the Primary, a PITR was performed on PDB1 at SCN 1957233 and the same is indicated in the alert log of the standby database.

Let's get the incarnation details of the PDB1 on the primary database.

It can be seen that the INCARNATION SCN for the CURRENT incarnation of PDB1 is 1957233, which means that a PITR was performed on this pluggable database to SCN 1957233 and that the PDB was then opened with resetlogs at SCN 1996783. To restart recovery on the standby database and bring it back in sync with the primary, follow the steps below.
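The incarnation details referred to above can be retrieved with a query along these lines. This is a minimal sketch, assuming a privileged session in the CDB root of the primary and assuming that joining V$PDB_INCARNATION to V$PDBS on CON_ID is an acceptable way to pick out PDB1; the column list is only a selection.

```sql
-- Run in the CDB root on the primary (assumption: connected AS SYSDBA).
-- Lists every incarnation recorded for PDB1. The row with STATUS = 'CURRENT'
-- carries the SCN the PDB was recovered to and its resetlogs SCN range.
SELECT i.db_incarnation#,
       i.pdb_incarnation#,
       i.status,
       i.incarnation_scn,
       i.begin_resetlogs_scn,
       i.end_resetlogs_scn
FROM   v$pdb_incarnation i
       JOIN v$pdbs p ON p.con_id = i.con_id
WHERE  p.name = 'PDB1'
ORDER  BY i.pdb_incarnation#;
```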

Step 1: Mount the standby database and have the problematic pluggable database in a closed state.
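A minimal sketch of this step from SQL*Plus, assuming a SYSDBA session on the standby instance that will run managed recovery. In a RAC setup the instances would normally be started with srvctl, which is not shown here.

```sql
-- STARTUP MOUNT applies only if this standby instance is currently shut down.
STARTUP MOUNT

-- Confirm the database is a mounted physical standby.
SELECT database_role, open_mode FROM v$database;

-- Confirm PDB1 is not open; on a mounted standby it should show MOUNTED.
SELECT name, open_mode FROM v$pdbs WHERE name = 'PDB1';
```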

Step 2: Identify the backup taken at or before SCN 1957233.

We know the SCN to which the PDB was recovered on the primary, so identify the backup whose checkpoint SCN is less than or equal to 1957233. If the backups are taken from the standby, the standby will automatically identify the corresponding backup pieces. If not, identify the backup pieces on the primary and transfer them to the standby site. In this example, the backup is taken from the primary database, so the required backup piece (/u02/bkp/SRPRIM_inc0_20160103_0eqqekl6_1_1.bak) is identified on the primary site and shipped to the standby site.
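One way to find candidate backup pieces is to query the control file views on the database where the backup was taken. The query below is a hedged sketch, not the method used in the original article: it assumes a session in the CDB root of the primary and maps datafiles to PDB1 through V$DATAFILE and V$PDBS.

```sql
-- Run in the CDB root on the primary, where the backups were taken.
-- Lists backup pieces holding PDB1 datafiles whose checkpoint SCN is
-- at or before the PITR SCN (1957233), newest first.
SELECT bp.handle,
       bdf.file#,
       bdf.incremental_level,
       bdf.checkpoint_change#,
       bdf.completion_time
FROM   v$backup_datafile bdf
       JOIN v$backup_piece bp
         ON  bp.set_stamp = bdf.set_stamp
         AND bp.set_count = bdf.set_count
       JOIN v$datafile df ON df.file# = bdf.file#
       JOIN v$pdbs p      ON p.con_id = df.con_id
WHERE  p.name = 'PDB1'
AND    bdf.checkpoint_change# <= 1957233
ORDER  BY bdf.checkpoint_change# DESC;
```

The same information is available from RMAN's LIST BACKUP command, which many DBAs will find more convenient.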

Step 3: Restore the pluggable database PDB1 on the standby site until the SCN (1957233) at which the PITR was performed for this PDB on the primary.

This SCN can be obtained from the standby alert log entry where the error ORA-39873 was reported. If the backup piece identified in "Step 2" is copied to a different location on the standby site, it needs to be cataloged with the standby database before restoring.

In my case, the file was copied to the same location on the standby site (/u02/bkp/), so I'm not cataloging the backup piece. Moving on with the restore, the pluggable database is restored until SCN 1957233 on the standby database.
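The catalog and restore described above can be sketched as the following RMAN commands. This is an illustration under assumptions: RMAN is connected to the standby as target (for example with `rman target /`), and the CATALOG command is needed only when RMAN does not already know about the piece.

```rman
# Optional: register the copied piece if it sits in a location that the
# standby control file has no record of.
CATALOG BACKUPPIECE '/u02/bkp/SRPRIM_inc0_20160103_0eqqekl6_1_1.bak';

# Restore the PDB1 datafiles from backups usable for recovery up to the
# SCN at which the PITR was performed on the primary.
RUN {
  SET UNTIL SCN 1957233;
  RESTORE PLUGGABLE DATABASE pdb1;
}
```

Only the restore is done here. Recovery is left to the managed recovery process in the next step.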

Step 4: After restoring is completed, let's start the managed recovery process (MRP) on the standby database.

Data Guard will continue recovery of this PDB as well as all other PDBs in the Standby CDB as normal.

     RMAN> alter database recover managed standby database disconnect;

     Statement processed

Check the recovery status on the standby database:
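A query along these lines shows the state of MRP. It is a sketch that assumes the GV$ view is preferred because the standby is a 2-node RAC and MRP runs on only one instance. V$MANAGED_STANDBY is the 12.1-era view; later releases deprecate it in favor of V$DATAGUARD_PROCESS.

```sql
-- Run on the standby. A status such as APPLYING_LOG or WAIT_FOR_LOG
-- indicates that managed recovery is running.
SELECT inst_id,
       process,
       status,
       thread#,
       sequence#,
       block#
FROM   gv$managed_standby
WHERE  process LIKE 'MRP%';
```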

It's clear that MRP is now running and waiting for the next log sequence from the primary database. If an Active Data Guard license is available, stop the recovery, open the standby and the corresponding PDBs in READ ONLY mode, and then restart the recovery.
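The Active Data Guard sequence can be sketched as follows, assuming a SYSDBA session on the standby instance where MRP runs. In a RAC setup the other standby instance would need to be opened as well, which is not shown.

```sql
-- Requires an Active Data Guard license.
-- 1. Stop managed recovery.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

-- 2. Open the standby CDB and the PDB read-only.
ALTER DATABASE OPEN READ ONLY;
ALTER PLUGGABLE DATABASE pdb1 OPEN READ ONLY;

-- 3. Restart managed recovery in the background.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;
```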

Step 5: Let's check the incarnation details of PDB1 on the standby database. They should be the same as on the primary.

As shown above, the standby PDB is now on the same incarnation as the primary PDB. Alternatively, a quicker option is to flash back the standby database to the SCN reported in the alert log of the standby instance, provided FLASHBACK was enabled on the standby. This removes the need to identify the right backups, copy them from the primary site to the standby site, and restore them there.

Here are the steps to follow if FLASHBACK is enabled for the standby database.

 1. Identify the SCN at which the PDB PITR was performed on the primary. As noted earlier, it can be obtained from the standby alert log entry where ORA-39873 was reported.

 2. Flash back the standby CDB to that SCN.

SQL> flashback database to SCN 1957233;

 3. Start the recovery on the standby and monitor the progress.
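The flashback alternative can be sketched end to end as below. The two preliminary checks are additions for illustration and assume a SYSDBA session on the mounted standby; the SCN is the one from this article's example.

```sql
-- Confirm Flashback Database is enabled on the standby (expect YES).
SELECT flashback_on FROM v$database;

-- Confirm the flashback logs reach back far enough: this value must be
-- less than or equal to the target SCN.
SELECT oldest_flashback_scn FROM v$flashback_database_log;

-- Flash back the standby CDB to the SCN reported with ORA-39873.
FLASHBACK DATABASE TO SCN 1957233;

-- Restart managed recovery and check that MRP is applying redo.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;

SELECT process, status, thread#, sequence#
FROM   v$managed_standby
WHERE  process LIKE 'MRP%';
```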

Conclusion:

A PITR is always performed on a primary database and not on the standby database. A useful feature in 12c is that a PITR of one PDB does not impact the CDB it is plugged into or the other PDBs in the same CDB. However, recovery on the standby CDB terminates completely, and the standby falls out of sync until it recovers the PDB through the resetlogs using one of the two methods discussed in this article.

About the Author

Shivananda Rao is an Oracle ACE Associate working as a Senior Oracle DBA. He has good knowledge of Oracle technologies, specifically High Availability, Disaster Recovery, Upgrades and RMAN. He actively participates in the OTN forum and maintains an Oracle technical blog (www.shivanandarao-oracle.com) with more than 50 articles. He has expertise in Data Guard and RMAN issues, with a focus on High Availability topics.

GregV

Hi,
Does it fail exactly at the DBMS_DATAPUMP.open line, or are you getting the error from an exception block?
Usually the "job does not exist" error is a consequence of terminating a job that was never created because of another error. Can you post the code?

User_H3J7U

Data Pump creates the master table based on the job name. The table format may differ across database versions.

User_TK218

@gregv This is how the code looks. It doesn't reach the line DBMS_OUTPUT.put_line('Job created'); it fails on open and jumps to the exception handler.

nFlashbackSCN := timestamp_to_scn(SYSDATE);
DBMS_OUTPUT.put_line('Starting Backup process');
hdpBackupJob := DBMS_DataPump.OPEN(operation => 'EXPORT', job_mode => 'SCHEMA', remote_link => NULL, job_name => 'J$AIM_BACKUP', VERSION => 'LATEST');
DBMS_OUTPUT.put_line('Job created');
DBMS_DataPump.add_file(handle    => hdpBackupJob,
                       filename  => sDumpFileName,
                       DIRECTORY => sBackupDirectory,
                       filesize  => sFileSize,
                       reusefile => nReuseFile);
DBMS_DataPump.add_file(handle    => hdpBackupJob,
                       filename  => sLogFileName,
                       DIRECTORY => sBackupDirectory,
                       filetype  => DBMS_DataPump.KU$_FILE_TYPE_LOG_FILE,
                       reusefile => nReuseFile);
-- consistent export
DBMS_DataPump.set_parameter(handle => hdpBackupJob, NAME => 'FLASHBACK_SCN', VALUE => nFlashbackSCN);
IF sJobMode = 'SCHEMA'
   AND sSchemaName IS NOT NULL
THEN
  DBMS_DataPump.metadata_filter(handle => hdpBackupJob, NAME => 'SCHEMA_EXPR', VALUE => 'IN (' || sSchemaName || ')');
END IF;

So this is how the code looks and

GregV

Can you show the exception's handler code as well?

User_TK218

EXCEPTION
  WHEN OTHERS THEN
    DBMS_OUTPUT.put_line('Error raised: ' || DBMS_UTILITY.FORMAT_ERROR_BACKTRACE || ' - ' || sqlerrm);
    AIM_DBS.P$Rollback;
    BEGIN
      DBMS_DataPump.stop_job(handle => hdpBackupJob, IMMEDIATE => 1, keep_master => 0, DELAY => 0);
      AIM_DBS.P$Commit;
    END;

GregV

Thanks. That's what I was trying to tell you. Here it is the stop_job call that produces the error, because the job does not in fact exist. So to get the real error, comment out the DBMS_DataPump.stop_job line.
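Instead of commenting the line out, the handler can skip stop_job when no job handle was obtained and then re-raise, so the original error is not masked. The anonymous block below is a minimal sketch of that idea, reusing the variable and job names from the posted code. Everything else, including the best-effort cleanup, is an assumption for illustration.

```sql
SET SERVEROUTPUT ON
DECLARE
  hdpBackupJob NUMBER;  -- stays NULL if DBMS_DATAPUMP.OPEN fails
BEGIN
  hdpBackupJob := DBMS_DATAPUMP.OPEN(operation => 'EXPORT',
                                     job_mode  => 'SCHEMA',
                                     job_name  => 'J$AIM_BACKUP');
  DBMS_OUTPUT.put_line('Job created, handle ' || hdpBackupJob);
  -- The add_file / set_parameter / metadata_filter / start_job calls would
  -- follow here. Run as-is, the job is left in the defining state.
EXCEPTION
  WHEN OTHERS THEN
    -- Print the full error stack plus the backtrace before any cleanup.
    DBMS_OUTPUT.put_line('Error raised: ' || DBMS_UTILITY.FORMAT_ERROR_STACK
                         || DBMS_UTILITY.FORMAT_ERROR_BACKTRACE);
    -- Stop the job only if OPEN actually returned a handle.
    IF hdpBackupJob IS NOT NULL THEN
      BEGIN
        DBMS_DATAPUMP.STOP_JOB(handle      => hdpBackupJob,
                               immediate   => 1,
                               keep_master => 0,
                               delay       => 0);
      EXCEPTION
        WHEN OTHERS THEN
          NULL;  -- cleanup is best-effort; the original error matters more
      END;
    END IF;
    RAISE;  -- propagate the original exception to the caller
END;
/
```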

User_TK218

@gregv Sorry for the delay. I tried commenting out the line
DBMS_DataPump.stop_job(handle => hdpBackupJob, IMMEDIATE => 1, keep_master => 0, DELAY => 0);
and still getting the same error
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 79
ORA-06512: at "SYS.DBMS_DATAPUMP", line 1852
ORA-06512: at "SYS.DBMS_DATAPUMP", line 6833
ORA-06512: at "AIMIM.AIM_BACKUP", line 1034

ORA-31626: job does not exist

Now the difference from the past is the line
ORA-06512: at "AIMIM.AIM_BACKUP", line 1034
exactly points to the open function.

Solomon Yakobson

The error is very misleading. DBMS_DATAPUMP.OPEN creates the master table, and ORA-31626: job does not exist is raised when Oracle can't create it:

SQL> revoke create table from u1;

Revoke succeeded.

SQL> connect u1@pdb1sol122
Enter password:
Connected.
SQL> exec DBMS_OUTPUT.PUT_LINE(DBMS_DataPump.OPEN(operation => 'EXPORT', job_mode => 'SCHEMA', remote_link => NULL, job_name => 'J$AIM_BACKUP', VERSION => 'LATEST'));
BEGIN DBMS_OUTPUT.PUT_LINE(DBMS_DataPump.OPEN(operation => 'EXPORT', job_mode => 'SCHEMA', remote_link => NULL, job_name => 'J$AIM_BACKUP', VERSION => 'LATEST')); END;

*
ERROR at line 1:
ORA-31626: job does not exist
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 79
ORA-06512: at "SYS.DBMS_DATAPUMP", line 1852
ORA-06512: at "SYS.DBMS_DATAPUMP", line 6833
ORA-06512: at line 1

SQL> connect scott@pdb1sol122
Enter password:
Connected.
SQL> grant create table to u1;

Grant succeeded.

SQL> connect u1@pdb1sol122
Enter password:
Connected.
SQL> exec DBMS_OUTPUT.PUT_LINE(DBMS_DataPump.OPEN(operation => 'EXPORT', job_mode => 'SCHEMA', remote_link => NULL, job_name => 'J$AIM_BACKUP', VERSION => 'LATEST'));

PL/SQL procedure successfully completed.

SQL>

SY,

User_TK218

@solomon-yakobson Yes, it worked for a while from SQL*Plus at the command prompt as you described, but later it stopped working there too. Also, whenever I run the procedure containing this code as the same user from SQL Developer, I get the same error. Is it something related to the pluggable database? Or is it because I am running the procedure in a scheduled job?

Solomon Yakobson

What is the streams pool size?
SY.

User_TK218

@solomon-yakobson I checked streams_pool_size using SQL*Plus:
SQL> show parameter STREAMS_POOL_SIZE;
NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
streams_pool_size                    big integer 0

Solomon Yakobson

And what about SGA_TARGET? Most likely it is also set to zero. If so, you need to set the streams pool size to, say, 40M.
SY.

User_TK218

@solomon-yakobson Yes, as you said, SGA_TARGET is also zero (output below). I will try setting STREAMS_POOL_SIZE=40M and will let you know.
NAME TYPE VALUE
---------- ----------- -----
sga_target big integer 0

User_TK218

@solomon-yakobson Sorry for the late reply. Altering the streams_pool_size parameter didn't work for me either. I also granted privileges like
grant dba to AIMIM;
grant all privileges to AIMIM;
grant sysdba to AIMIM;
grant sysbackup to AIMIM;
commit;
But none of them helped me in creating the job.

Solomon Yakobson

Double-check streams pool size:

SELECT VALUE FROM GV$PARAMETER WHERE NAME = 'streams_pool_size'
/

I assume

DBMS_DataPump.OPEN(operation => 'EXPORT', job_mode => 'SCHEMA', remote_link => NULL, job_name => 'J$AIM_BACKUP', VERSION => 'LATEST');

is in a stored procedure, so keep in mind that roles are ignored (unless the stored procedure is AUTHID CURRENT_USER). Therefore double-check that the stored procedure owner (not the caller) is granted CREATE TABLE directly (not via a role).
Double-check that the stored procedure owner (not the caller) has quota on its default tablespace.
Restart the database. It could be that the background process QMNC is dead for some reason (e.g. a cancelled shutdown).
SY.
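The privilege and quota checks suggested above can be run from the data dictionary. The queries below are a sketch, assuming a DBA session in the same PDB, and assuming the owner is AIMIM and the object is AIM_BACKUP as they appear in the error stack earlier in the thread.

```sql
-- 1. System privileges granted directly to the owner. CREATE TABLE must be
--    listed here; a grant that arrives only through a role does not count
--    inside a definer's-rights stored procedure.
SELECT privilege
FROM   dba_sys_privs
WHERE  grantee = 'AIMIM'
AND    privilege IN ('CREATE TABLE', 'UNLIMITED TABLESPACE');

-- 2. Default tablespace of the owner and its quota there.
--    MAX_BYTES = -1 means unlimited; no quota row and no UNLIMITED TABLESPACE
--    privilege means the master table cannot be created.
SELECT u.default_tablespace,
       q.bytes,
       q.max_bytes
FROM   dba_users u
       LEFT JOIN dba_ts_quotas q
         ON  q.username        = u.username
         AND q.tablespace_name = u.default_tablespace
WHERE  u.username = 'AIMIM';

-- 3. Whether the code runs with definer's rights (DEFINER) or
--    invoker's rights (CURRENT_USER).
SELECT DISTINCT object_name, object_type, authid
FROM   dba_procedures
WHERE  owner = 'AIMIM'
AND    object_name = 'AIM_BACKUP';
```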

Added on Mar 3 2016
