Basically the purpose is to minimise downtime for end users. While the ETL runs, the data keeps changing; if users run reports during the ETL, the reports will show inaccurate figures/totals. But if end users run their queries against schema B, they will not see any change until the ETL completes.
We expect the replication to be faster (< 10 minutes) than the ETL run (2 hours).
Using synonyms/views would not be the answer.
Any suggestions using Streams?
If I understand you correctly, you have schema A with some tables, and you run your ETL against those tables. While the ETL is running, users querying the tables in schema A would get inconsistent results, so you want to avoid this by creating replicas of the tables in schema B.
In that case, create materialized views in schema B with the fast refresh option. Then create one refresh group and put all the materialized views from schema B into this group. Add a last step to the ETL process: refresh this refresh group.
Your users then query only schema B, and the data there will always be in a consistent state.
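A minimal sketch of this approach for a single table (the names A.T1 and B.REPORT_GRP are illustrative; repeat the MV log and materialized view for each table):

```sql
-- On the source table in schema A (required for fast refresh):
CREATE MATERIALIZED VIEW LOG ON a.t1 WITH PRIMARY KEY;

-- In schema B, a fast-refreshable copy:
CREATE MATERIALIZED VIEW b.t1
  REFRESH FAST ON DEMAND
  AS SELECT * FROM a.t1;

-- One refresh group holding all schema B materialized views, so they
-- all refresh to a single consistent point in time:
BEGIN
  DBMS_REFRESH.MAKE(
    name      => 'b.report_grp',
    list      => 'b.t1',     -- comma-separated list of all the MVs
    next_date => SYSDATE,
    interval  => NULL);      -- refresh manually, not on a schedule
END;
/

-- Last step of the ETL:
BEGIN
  DBMS_REFRESH.REFRESH('b.report_grp');
END;
/
```

The point of the refresh group is that all its members are refreshed in a single transaction, which is what keeps schema B consistent for the report users.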
Neal wrote:
We have a requirement in our data warehouse to replicate one schema to another in near real-time at a specific time of day.
We run incremental ETLs into Schema A in Database X; this process takes about 2 hours and brings 1-2 GB of data from source systems.
Schema A's datafiles total about 350 GB.
After this process completes, we want to move all the changed data into Schema B in Database X (same database, different schema) with minimal downtime for the end users who run reports against Schema B.
Since I'm not a DBA expert, I'm wondering whether 11g Streams (DBMS_STREAMS_ADM.MAINTAIN_SCHEMAS) can help achieve this.
Please share your thoughts and recommendations; any examples would be helpful.
Thanks in advance!

Hi Neal, if I understand you correctly, you basically need a warehousing solution.
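For reference, a heavily simplified sketch of what a MAINTAIN_SCHEMAS call looks like. All parameter values here are illustrative, and note that the procedure is designed to set up Streams replication between two databases (hence the directory objects for the Data Pump instantiation), not between two schemas of the same database:

```sql
BEGIN
  DBMS_STREAMS_ADM.MAINTAIN_SCHEMAS(
    schema_names                 => 'scha',       -- schema to replicate
    source_directory_object      => 'SRC_DIR',    -- illustrative directories
    destination_directory_object => 'DEST_DIR',
    source_database              => 'dbx',
    destination_database         => 'dby',
    perform_actions              => TRUE,         -- run now, not as a script
    bi_directional               => FALSE,
    include_ddl                  => TRUE,
    instantiation                => DBMS_STREAMS_ADM.INSTANTIATION_SCHEMA);
END;
/
```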
In this case spajdy's suggestion would be a solution, but I would not pick this option, because you need to create a materialized view log for every table, which may take time.
I have used Oracle GoldenGate, which works well and does not disturb my production environment.
PS: If there are any triggers on those tables, avoid using Streams. It is a big headache, and I haven't yet seen Streams work properly in that case.
In this case, consider the following:
In schema A there is a table T.
In schema B there is a table T_BAK, created as a 1:1 copy of table A.T.
In schema B there is a synonym B.T pointing to the B.T_BAK table.
After your ETL is done, you simply redirect the synonym B.T to the A.T table.
Then you start copying the A.T table to the B.T_BAK table. When finished, redirect the synonym B.T back to B.T_BAK.
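A sketch of the swap for one table (repeat per table; the INSERT /*+ APPEND */ is just one way to do the bulk copy):

```sql
-- Steady state: users in schema B read through the synonym.
CREATE TABLE b.t_bak AS SELECT * FROM a.t;     -- initial 1:1 copy
CREATE OR REPLACE SYNONYM b.t FOR b.t_bak;

-- Step 1: the ETL into A.T has finished; point readers at the fresh data.
CREATE OR REPLACE SYNONYM b.t FOR a.t;

-- Step 2: refresh the backup copy at leisure (users are reading A.T now).
TRUNCATE TABLE b.t_bak;
INSERT /*+ APPEND */ INTO b.t_bak SELECT * FROM a.t;
COMMIT;

-- Step 3: point readers back at the local copy.
CREATE OR REPLACE SYNONYM b.t FOR b.t_bak;
```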
If you run your ETL once a day, you have all day to copy schema A to schema B.
This solution is simple, but it has one big disadvantage: you must copy the full size of schema A (350 GB) every day.
So using materialized views and materialized view logs is cheaper in this respect.
But you can modify my previous variant into this one.
You have two sets of tables: one set in schema A and a second set in schema B.
For every table there is a synonym in schema C.
The starting point is this:
schema A = schema B - same structures, same data
synonyms in schema C point to the tables in schema A
Run the ETL on schema B.
After the ETL on schema B is done, redirect the synonyms in schema C to schema B. At this point your users can't use schema C, because it is not in a consistent state until all the synonyms are redirected.
Then run the ETL on schema A.
The next day, start the ETL on schema A first, then redirect the synonyms in schema C to schema A, then run the ETL on schema B.
And so on, day after day.
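The daily flip of the synonyms can be scripted with a short PL/SQL loop. A sketch, assuming schema C contains one synonym per table and table names match in both schemas (swap the target between 'a' and 'b' on alternate days):

```sql
-- Redirect every synonym in schema C to point at schema B
-- (the schema that was just refreshed by the ETL).
-- Run as a user with the CREATE ANY SYNONYM privilege.
BEGIN
  FOR s IN (SELECT synonym_name FROM all_synonyms WHERE owner = 'C') LOOP
    EXECUTE IMMEDIATE
      'CREATE OR REPLACE SYNONYM c.' || s.synonym_name ||
      ' FOR b.' || s.synonym_name;
  END LOOP;
END;
/
```

The inconsistency window mentioned above lasts only as long as this loop takes to run.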
The last option is to use flashback technology.
Before you start your ETL, you record the current timestamp, and while the ETL is running all queries must use flashback to see the data as of that stored timestamp.
But you have to size your undo retention accordingly, so it covers the whole ETL window.
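A sketch of the flashback variant (the timestamp here is hard-coded for illustration; in practice you would record it in a control table just before the ETL starts):

```sql
-- A report query rewritten to read data as of the recorded timestamp:
SELECT *
FROM   a.t AS OF TIMESTAMP
       TO_TIMESTAMP('2011-05-10 02:00:00', 'YYYY-MM-DD HH24:MI:SS');

-- Alternatively, flash back the whole session so existing report SQL
-- needs no changes (requires EXECUTE on DBMS_FLASHBACK):
BEGIN
  DBMS_FLASHBACK.ENABLE_AT_TIME(
    TO_TIMESTAMP('2011-05-10 02:00:00', 'YYYY-MM-DD HH24:MI:SS'));
END;
/
-- ... run the reports, then return the session to the present:
EXEC DBMS_FLASHBACK.DISABLE;
```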