By Michael Rainey
The release of Oracle Business Intelligence Applications 11g included a major change in components, with Oracle Data Integrator 11g (ODI) replacing Informatica as the ETL technology. The next logical step was to integrate Oracle’s data replication tool, Oracle GoldenGate 11g (OGG), for a real-time load of source system data to the data warehouse. Using GoldenGate replication rather than a conventional ODI extract process, contention on the source is essentially eliminated and the source OLTP data is stored locally on the data warehouse, removing network bottlenecks and allowing ETL performance to increase. In this article, we'll walk through the architecture and setup for using GoldenGate 11g with Oracle BI Applications 126.96.36.199.1. Note: This article will also apply to Oracle Business Intelligence Applications versions 188.8.131.52.1 and 184.108.40.206.1.
GoldenGate and the Source Dependent Data Store
If you are not familiar with Oracle GoldenGate (OGG), it is the standard Oracle product for data replication, providing log-based change data capture, distribution, and delivery in real-time.
GoldenGate captures transactional data changes from the source database redo log and loads the changes into its own log file, called a Trail File, using a platform-independent universal data format. The Extract process understands the schemas and tables from which to capture changes based on the configuration set in the Extract parameter file. The data is then read from the Source Trail File and moved across the network to the Target Trail File using a process called a Data Pump, also driven by a parameter file. Finally, the transactions are loaded into the target database tables using the Replicat parameter file configuration, which maps source tables and columns to their target. The entire process occurs with sub-second latency and minimal impact to the source and target systems.
In Oracle BI Applications, GoldenGate 11g is used for pure replication from the source database to the target data warehouse, into what is known as the Source Dependent Data Store (SDS) schema.
The SDS is setup to look exactly like the source schema, allowing the Oracle Data Integrator pre-built Interfaces to change which source they are using from within the Loading Knowledge Module (LKM) by evaluating a variable (IS_SDS_DEPLOYED) at various points throughout the LKM. Using this approach, the GoldenGate integration can be easily enabled at any point, even after initial configuration. The Oracle BI Applications team did a great job of utilizing the features of ODI that allow the logical layer to be abstracted from the physical layer and data source connection.
User and Schema Setup
The first step is to manually create the GoldenGate user on the source and target databases. These users, like the Source Dependent Data Store schema, are not created by the Oracle BI Applications installer. This will be a dedicated user for OGG, and will have privileges specific to the needs of the extract process on the source and the replicat process on the target.
Create Source GoldenGate User
Beginning with the source, create the user and grant the initial privileges. Be sure the tablespace has already been created.
-- Create OGG User on the source
CREATE USER ogg_user IDENTIFIED BY Password01
DEFAULT TABLESPACE ggs_data QUOTA UNLIMITED ON ggs_data;
GRANT CREATE SESSION TO ogg_user;
GRANT ALTER SESSION TO ogg_user;
GRANT SELECT ANY DICTIONARY TO ogg_user;
GRANT FLASHBACK ANY TABLE TO ogg_user;
The specific table grants will be made later using a script generated by an ODI Procedure. The user will temporarily need ALTER ANY TABLE in order to set up supplemental logging for individual tables. Afterwards, this privilege can be revoked.
GRANT ALTER ANY TABLE TO ogg_user;
Finally, set up supplemental logging at the database level, ensuring the necessary information is logged for each transaction.
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
Create Target GoldenGate User
Next, create the GoldenGate user on the target database with target-specific privileges. GoldenGate performs DML on the target tables based on the change made in the source database, so the user will need to be granted privileges to INSERT, UPDATE, DELETE. Again, rather than grant INSERT ANY TABLE, etc., the specific table grants will be generated as a script via an ODI Procedure.
-- Create OGG User CREATE USER ogg_target IDENTIFIED BY Password01
DEFAULT TABLESPACE USERS QUOTA UNLIMITED ON USERS;
GRANT CREATE SESSION TO ogg_target;
GRANT ALTER SESSION TO ogg_target;
GRANT SELECT ANY DICTIONARY TO ogg_target;
The checkpoint table will be created from within GoldenGate, so this user will temporarily need to be granted the CREATE TABLE privilege. The checkpoint table will keep track of the latest position in the target trail file, allowing a clean recovery should the target database go offline or some other failure occur.
GRANT CREATE TABLE TO ogg_target;
Create SDS User
A separate SDS schema must be created for each OLTP source application, with the SDS schema essentially acting as the source schema. We'll follow the recommended naming conventions for the schema: <BIAPPS>SDS<Model Code>_<DSN>. BIAPPS is the user-defined code signifying this is a BI Applications schema. To keep it simple, we'll use BIAPPS. The Model Code is the unique code assigned to the data source and the DSN is the data source number for that source application.
In this example using Peoplesoft Campus Solutions, the SDS schema name is BIAPPS_SDS_PSFT_90_CS_20.
-- Create tablespace.
CREATE TABLESPACE BIAPPS_SDS_PSFT_90_CS_20_TS
DATAFILE '/u01/app/oracle/oradata/orcldata/BIAPPS_SDS_PSFT_90_CS_20..dbf' SIZE 100M AUTOEXTEND ON NEXT 10M
DEFAULT COMPRESS FOR OLTP;
-- Create SDS User
CREATE USER BIAPPS_SDS_PSFT_90_CS_20 IDENTIFIED BY Password01
DEFAULT TABLESPACE BIAPPS_SDS_PSFT_90_CS_20_TS QUOTA UNLIMITED ON BIAPPS_SDS_PSFT_90_CS_20_TS;
-- Required Grants
GRANT CREATE SESSION TO BIAPPS_SDS_PSFT_90_CS_20;
GRANT CREATE TABLE TO BIAPPS_SDS_PSFT_90_CS_20;
Finally, the GoldenGate target user must be granted access to use the SDS tablespace for inserts/updates.
-- OGG user must be granted Quota to insert and update data
ALTER USER ogg_target QUOTA UNLIMITED ON BIAPPS_SDS_PSFT_90_CS_20_TS;
Install and Configure GoldenGate 11g
The schemas are in place, so the next part of the setup is to install and configure the GoldenGate application on both the source and target servers. The GoldenGate 11g installation process is well documented, so we don't need to go into much detail here. The Oracle BI Applications documentation also has some example scripts, which walk through the setup of the extract, data pump, and replicat group processes.
The naming standards for the parameter files are fairly straightforward, with DSN being the same data source number we used in the SDS schema name.
- Extract: EXT_<DSN>
- Data Pump: DP_<DSN>
- Replicat: REP_<DSN>
Following the Oracle BI Applications documentation examples, we end up with each process group setup and the checkpoint table created in the target GoldenGate schema. To simplify the setup process, create an obey (.oby) file for both the source and target setup scripts, similar to the following example.
--stop manager on target
db dblogin userid ogg_target, password Password01
--stop gg processes
DELETE CHECKPOINTTABLE ogg_target.OGGCKPT
--delete previous trail files
SHELL rm ./dirdat/*
--start manager on target db
--create CHECKPOINTTABLE in target db
dblogin userid ogg_target, password Password01
ADD CHECKPOINTTABLE ogg_target.OGGCKPT
add replicat rep_20, exttrail ./dirdat/tr, CHECKPOINTTABLE ogg_target.OGGCKPT
Using an obey script allows the process to be easily rerun should there be any sort of issue or failure. It also provides a template that can be used for additional sources and SDS targets. The result should be GoldenGate process groups setup and ready to go once the parameter files are in place.
Remember to revoke the CREATE TABLE privilege from the target GoldenGate user once the checkpoint table has been created.
REVOKE CREATE TABLE FROM ogg_target;
Enable the SDS
Set the SDS Parameter
The first step in configuration is to enable the SDS in the Oracle BI Applications Configuration Manager. This indicator, set for each source instance, will be used throughout the Oracle Data Integrator ETL processing to determine whether to connect directly to the source or to the SDS schema.
Choose "Manage Data Load Parameters" from the Tasks list. In the Manage Data Load Parameters window, select the appropriate Source Instance (in this example we are using PeopleSoft Campus Solutions - so PS_CS is chosen) and search for the Parameter name IS_SDS_DEPLOYED.
Select the parameter and click the Edit button. There will be a warning that this parameter will be set globally for this Source Instance. Since we plan to pull all of our data from Campus Solutions using OGG, click Yes to continue (otherwise, you could choose a specific fact group on which to set the parameter). In the Edit Dialog, change the parameter value to YES.
Create the SDS Physical Schema
The Physical Schema for the SDS schema must be manually created in the Oracle Data Integrator Topology metadata. Open ODI Studio and select the Topology Navigator. Under Physical Architecture, drill-down into Oracle and find the Data Server that corresponds to the data warehouse target for Oracle BI Applications. Right-click and choose "New Physical Schema". Set both the Schema and Work Schema values to the SDS database schema that was created previously.
Next, click on the FlexFields tab. We need to set the DATASOURCE_NUM_ID FlexField to match that of the source instance for which the SDS schema is a target.
Generate and Execute SDS DDL Scripts
Oracle BI Applications is now configured to use the SDS schema as the source for the SDE processes, rather than the actual source, and GoldenGate is configured to keep the SDS schema up-to-date with the latest source transactions in real-time. Next, we will execute an ODI Procedure to generate the DDL scripts for the SDS schema tables and indexes, as well as the grants on the source tables for the GoldenGate user.
To execute the Scenario, open ODI Studio, go to the Designer Navigator, and in the Projects browse to BI Apps Project > Components > SDS > Oracle > Generate SDS DDL. Expand Packages > Generate SDS DDL > Scenarios and you will find the Scenario "GENERATE_SDS_DDL Version 001".
Right-click the Scenario and select Execute. After selecting the Context and Agent, you will be presented with a list of ODI Variables used in the Scenario. These will provide input into the script generation Procedure and are described in detail in section 5.4.4 Setup Step: Generate, Deploy, and Populate the Source Dependent Schema Tables on Target Database of the Oracle BI Applications documentation.
When generating the DDL script for the first time, choose "FULL" for the REFRESH_MODE variable. This will attempt to perform a drop and recreate of all objects. Also, if you would like to run the scripts manually, set the CREATE_SCRIPT_FILE equal to "Y" and RUN_DDL to "N". If preferred, the Scenario execution can run the script automatically by setting RUN_DDL equal to "Y”. After setting all variables appropriately, click OK to execute the Scenario. Browse to the temp directory specified to review the different scripts:
- BIA_SDS_Schema_Table_DDL_<session_no>.sql and BIA_SDS_Schema_Index_DDL_<session_no>.sql - When executed against the SDS schema will create the SDS tables and apply indexes
- BIA_SDS_Schema_Source_Grants_DDL_<session_no>.sql - When executed against the source schema will apply table specific grants to the source GoldenGate user
Generate GoldenGate Parameter Files and Perform Configuration
Generate Parameter Files
Now it’s time to generate the parameter files for the GoldenGate extract, pump, and replicat process groups that were setup earlier. Here is where we might expect to see the "JKM Oracle to Oracle Consistent (OGG)" Journalizing Knowledge Module and the ODI CDC Framework put to use...but, that is not the case. For starters, the CDC Framework is not a part of the Oracle BI Applications implementation for change data capture. Oracle BI Applications product developers have implemented CDC by comparing the "Last Extract Date" to the "Last Update Date" for new and changed records, always pulling from the fully replicated table in the SDS. Next, they created an ODI Procedure to generate the parameter files rather than using a JKM.
Just as we did with the Generate DDL Script Scenario, browse to BI Apps Project > Components > SDS > Generate SDS OGG Param Files. Expand Packages > Generate SDS OGG Param Files > Scenarios and find the Scenario "GENERATE_SDS_OGG_PARAM_FILES Version 001". Right-click and select Execute. We are again provided with a list of Variables that need the appropriate values added prior to starting the process. Many of them should look familiar if you have worked with the GoldenGate JKM - but with much more intuitive names.
Set the appropriate values for the variables, described in detail in section “5.4.5 Setup Step: Generate and Deploy Oracle GoldenGate Parameter Files to Source and Target Machines” in the Oracle BI Applications documentation, and click OK to execute the Scenario. Once the Scenario has completed, browse to the temporary directory and you should find a folder DSN_<DATASOURCE_NUM_ID>. Within this directory is a source and target folder and the generated parameter files within each. Copy the .prm files from the source folder to the source GoldenGate dirprm directory and the ADD_TRANDATA.txt file to the GoldenGate home directory. From the target folder, copy the parameter file to the target server dirprm directory.
To begin the GoldenGate configuration, first edit the extract, pump, and replicat parameter files to add any options or commands necessary, such as trail file encryption. This can be done using the local system text editor or via GGSCI (the GoldenGate Software Command Interpreter). If there are no special options to add to the parameter files, they will work just fine without any modification.
Next, add the table-level supplemental logging on the source by running the ADD_TRANDATA.txt script file. This script file is just a list of GoldenGate commands that can be executed as a batch by calling the obey command. Log-in to GGSCI and run the following:
Once completed, remember to revoke the ALTER TABLE privilege from the source GoldenGate user in the source database.
REVOKE ALTER ANY TABLE FROM ogg_user;
The final piece of configuration is to generate the source definitions file. This file provides the GoldenGate replicat process the table and column definitions from the source database, allowing the process to interpret the table and column mappings in the replicat parameter file. The reason it must be used for replication to the SDS is because there are 3 additional columns mapped to each target table that do not exist in the source tables and cannot be mapped to the target table by name.
CDC$_RPL_LAST_UPDATE_DATE = @DATENOW (),
CDC$_SRC_LAST_UPDATE_DATE = @GETENV ("GGHEADER", "COMMITTIMESTAMP"),
CDC$_DML_CODE = "I" --"U" or "D"
The replicat last update date (CDC$_RPL_LAST_UPDATE_DATE) will be the timestamp when the record is updated in the target table. The source last update date (CDC$_SRC_LAST_UPDATE_DATE) is the commit timestamp for when the change occurred in the source. And finally the DML code (CDC$_DML_CODE) represents the type of change; insert, update, or delete.
To generate the source definitions file, exit GGSCI and run the following command:
./defgen paramfile ./dirprm/DEFGEN.prm
Once completed, copy the source definitions file from the source server to the target server dirdef directory.
There are two different options for performing the initial load of the target SDS schema from the source database. The first approach, using the process as delivered straight out-of-the-box, requires a scheduled downtime for the source application to ensure transactions that occur during the processing of data from source to SDS target are not skipped. The second, implemented with a slight customization to the Oracle BI Applications-delivered scripts, will ensure the initial load and replication startup provides a contiguous flow of transactions to the SDS schema and will require zero downtime for the source system.
Oracle BI Applications Scripts
As with the other set-up processes for the SDS schema and GoldenGate parameter files, there is an ODI Scenario available to execute that will generate the initial load scripts. In ODI Studio, browse to BI Apps Project > Components > SDS > Oracle > Copy OLTP to SDS. Expand Packages > Copy OLTP to SDS > Scenarios and you will find the Scenario “COPY_OLTP_TO_SDS Version 001″.
The Scenario calls an ODI Procedure named "Copy SDS Data". When executed, it will generate a script with an insert statement for each target SDS table using a select over a database link to the OLTP source. The link must be manually created and specifically named DW_TO_OLTP, as the ODI Procedure has the dblink name hard-coded.
--Drop and recreate the database link.
drop database link DW_TO_OLTP
--create database link DW_TO_OLTP
connect to SYSADM identified by SYSADM using 'PSDEMO';
--test the link.
select * from dual@DW_TO_OLTP;
The first approach to the initial load would then have you execute the Scenario to generate the insert statements, schedule a downtime for the OLTP application, and run the initial load scripts. Rather than go through those steps, let's take a look at how to eliminate the source system unavailability with a slight change to the code in the second approach.
Zero Downtime Initial Load
The initial load process will be customized to use the Oracle database flashback query capability, selecting data from the source transaction logs as of a specific point-in-time, based on the source SCN (system change number). Before the initial load is run, the GoldenGate extract process must be started in order to capture any transactions that occur during the data load. Finally, the GoldenGate replicat process will be started when the initial load completes, with the replication process picking up any transactions that occurred on the source after the initial load SCN, eliminating the chance to skip or duplicate transactions.
To perform the customizations, it is recommended to copy the entire "Copy OLTP to SDS" folder to a new location. One thing to note is that the Scenario will not be copied, since it must have a unique name throughout the work repository. We will generate the Scenario with a new name after making the changes to the script.
Open up the ODI Procedure "Copy SDS Data" from the copied directory. Click on the "Details" tab to review the steps. We will need to modify the step "Copy Data", which generates the DML script to move data from source to target. A review of the code will show that it uses the dictionary views on the source server, across the database link, to get all table and column names that are to be included in the script. The construction of the insert statement is the bit of code we will need to modify, adding the Oracle database flashback query syntax.
l_trunc_stmt := 'truncate table <$=jmodelSchema$>.' || col_rec.table_name;
l_sql_stmt := 'INSERT /*+ APPEND */ INTO <$=jmodelSchema$>.' || col_rec.table_name || ' (' || rtrim(l_column_list, ', ') || ') ' || 'SELECT ' || rtrim(l_column_expr, ', ') || ' FROM ' || col_rec.table_name || '@<%=DBLinkName%> as of scn #INITIAL_LOAD_SCN';
As you can see, #INITIAL_LOAD_SCN is a placeholder for an ODI Variable. A variable is used to perform the refresh of the SCN from the source rather than being hard-coded. Create the variable INITIAL_LOAD_SCN and set the query on the Refreshing tab to execute from the data warehouse over the database link, capturing the current SCN from the source database.
The user setup to connect to the OLTP source will need to temporarily be granted the "select any dictionary" privilege in order to allow the select from V$DATABASE. This privilege can later be revoked.
grant select any dictionary to SYSADM;
Now that the Variable is set and the Procedure code has been modified, we just need to put it all together in a Package and generate a Scenario. The Package "Copy OLTP to SDS" is already setup to call the Procedure "Copy SDS Data", so the ODI Variable INITIAL_LOAD_SCN can be added as a refresh step at the beginning of the Package.
After saving the Package, generate a Scenario to execute. When generating the Scenario, be sure to set all Variables except for INITIAL_LOAD_SCN as Startup Variables, as their values will be set manually during the execution of the Scenario, while the SCN is going to be refreshed from the source database. Also, remember to provide a different name than the original Scenario, as they must be unique throughout the ODI repository.
GoldenGate Startup and Script Execution
All of the pieces are in place to kick-off the initial load of the Source Dependent Data Store and fire up the GoldenGate replication. Even though the goal is to have zero downtime for the OLTP application, it would be best if the process were completed during a "slow" period, such as over the weekend, when a minimal amount of transactions are being processed.
First, start the GoldenGate extract and data pump processes so they begin capturing source transactions. On the source server, browse to the GoldenGate directory and run the GGSCI application. Ensure the Manager is running, and execute the "start extract" command for each of the processes that need to be kicked off.
Now that the transactions are flowing into the source trail and across the network to the target trail, we can execute the Scenario to generate the initial load script files. When executed, a prompt will appear, allowing you to enter the appropriate value for each variable. The script can be filtered down by a specific list of tables, if necessary, by adding a comma-delimited list to the TABLE_LIST variable. We'll just use a wildcard value to generate the script for all tables. Other options are to generate a script file (Y or N) and to execute the DML during the execution of the Scenario (even though the Variable is named RUN_DDL).
The script, named "BIA_SDS_Copy_Data_<session_number>.sql", will disable constraints, drop indexes, and truncate each table in the SDS prior to loading the data from the source system. After executing the copy data script, we will want to run the "BIA_SDS_Schema_Index_DDL_<session_number>.sql" script to recreate the indexes.
One thing to note - in the SDS Copy Data script the primary key constraints are disabled for a more greater performance during the insert of the data. But, the SDS Schema Index DDL code is set to create the constraint via an alter table script, rather than enabling the existing constraints. To work around this, open the Copy Data script in SQL Developer, copied all of the lines that are set to disable the constraints, paste them into a new window and switch the "disable" keyword to "enable" with a simple find and replace, and then execute the script against the SDS tables.
After copying the data and recreating the indexes in the SDS (and enabling the PK constraints), we can finally startup the replicat GoldenGate process on the target server. Again, login to GGSCI and ensure the Manager process is running. This time, when we start the process we will use the AfterCSN command, ensuring the replicat only picks up transactions from the trail file that occurred after the initial load SCN.
The initial load of data to the SDS schema has been completed and GoldenGate replication started, all without any impact to the source OLTP application. The next time the Source Data Extract (SDE) Load Plan is executed, it will be just as if it were running directly against the source database - only faster - since it's pulling data from the SDS schema on the same server, thanks to Oracle GoldenGate real-time data replication.
About the Author
Oracle ACE Michael Rainey is a Principal Consultant with Rittman Mead, where he specializes in business intelligence, with a focus on Oracle Data Integrator (11g / 12c), Oracle GoldenGate (11g / 12c), and Oracle Business Intelligence Applications 11g. Michael is also the lead instructor for the Rittman Mead Oracle Data Integrator "Bootcamp" in America, and serves as Secretary on the Northwest Oracle Users Group (NWOUG) Board of Directors.
Disclaimer: This article represents the expertise, findings, and opinion of the author. It has been published by Oracle in this space as part of a larger effort to encourage the exchange of such information within this Community, and to promote evaluation and commentary by peers. This article has not been reviewed by the relevant Oracle product team for compliance with Oracle's standards and practices, and its publication should not be interpreted as an endorsement by Oracle of the statements expressed therein.