The fix for (A) is outside the DMU anyway: move to a RAW or BLOB datatype.
See also Note 788156.1, "AL32UTF8 / UTF8 (Unicode) Database Character Set Implications":
B.12) Make sure you do not store "binary" (Encrypted) data in character datatypes (CHAR,VARCHAR2,LONG,CLOB).
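As a rough sketch of that fix, the encrypted bytes can be moved from a character column into a BLOB before the migration. The table and column names below are placeholders, not from the original thread; test on a copy first.

```sql
-- Hypothetical table T with encrypted data wrongly stored in VARCHAR2.
ALTER TABLE t ADD (enc_data_blob BLOB);

-- Copy the bytes over without any character set conversion:
-- UTL_RAW.CAST_TO_RAW returns the raw bytes of the VARCHAR2 value.
UPDATE t SET enc_data_blob = TO_BLOB(UTL_RAW.CAST_TO_RAW(enc_data));
COMMIT;

-- Drop the character column and take over its name.
ALTER TABLE t DROP COLUMN enc_data;
ALTER TABLE t RENAME COLUMN enc_data_blob TO enc_data;
```

The key point is avoiding any implicit character set conversion while copying, which is why the value is cast to RAW rather than inserted as a string.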
The fix for (B) is better done in the DMU; you can easily set a correction for the whole database by using the "assumed character set" feature.
The fix for (C) depends: you can update those rows from a good source. This is basically recovery, and it depends highly on the actual problem/cause.
(A) can also be done in DMU using the column modification features.
Is there any particular reason you want to do the data cleansing outside of DMU?
@Wzhang: The approach that we have followed for the past 2 iterations is:
Install 11g, upgrade the 10g DB (there is no DMU for that release), and do the character set conversion with the DMU, with all of these steps in downtime.
We are losing a lot of time on the cleansing step.
For example, it took us 18 hours to fix the column expansion step, i.e., selecting a character set that suits a particular column. We prepared a script for the expansion issue based on ROWIDs in the 1st iteration, but that script failed in the 2nd iteration and we had to do all the work again.
So, we want to MANUALLY cleanse the DB as much as possible in the 10g DB itself (pre-downtime) to reduce the overall downtime.
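One way to sketch such a pre-downtime scan for expansion candidates (which the DMU otherwise does for you) is to compare each value's byte length after conversion to AL32UTF8 against the column's byte limit. Table and column names below are placeholders:

```sql
-- Hypothetical scan: rows in T whose column COL would exceed its
-- declared byte length after conversion to AL32UTF8.
SELECT ROWID,
       LENGTHB(col)                      AS bytes_now,
       LENGTHB(CONVERT(col, 'AL32UTF8')) AS bytes_after
FROM   t
WHERE  LENGTHB(CONVERT(col, 'AL32UTF8')) >
       (SELECT data_length
          FROM user_tab_columns
         WHERE table_name = 'T' AND column_name = 'COL');
```

Note that a list of ROWIDs captured this way is only valid for that snapshot of the data; rows inserted, updated, or moved since the last iteration will not match, which is likely why a ROWID-based script from the 1st iteration failed in the 2nd. Re-running the scan shortly before each conversion, or keying fixes on primary keys, is more robust.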
Data cleansing is potentially one of the most time-consuming steps in the migration process depending on the data volume and the extent of data exceptions found. The DMU is designed to allow most of the cleansing actions to be performed prior to the conversion downtime window without impact to the production environment. You can choose to have the cleansing actions committed to the database immediately (immediate cleansing) or saved and executed later as part of the conversion phase (scheduled cleansing). Many of the cleansing actions may not be easy to accomplish outside of the DMU or could require significant manual workload otherwise. In your case, I think you have several options:
1) Upgrade to a DMU-supported database release first and work on Unicode migration separately from the upgrade. This way you can leverage the DMU cleansing features to address most of the data issues beforehand and only deal with any incremental data issues in the migration downtime.
If you must do the upgrade and Unicode migration in the same downtime window:
2) Prepare scripts for operations like enlarging column sizes or migrating column data types based on the latest iteration to speed up the process. Keep in mind you may still need extra work as the incremental data changes since the last iteration could affect the cleansing requirements. For invalid data issues, if they are caused by all data being stored in a character set different from the database character set, then set the assumed database character set instead of setting assumed character sets for individual columns.
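The kind of prepared scripts mentioned in option 2 might look like the following. These are illustrative only, with placeholder table and column names, and the right target sizes depend on your own scan results:

```sql
-- Widen a byte-limited column to leave headroom for multibyte expansion:
ALTER TABLE t MODIFY (name VARCHAR2(200 BYTE));

-- Or switch to character-length semantics so the limit counts characters
-- rather than bytes (still capped by the 4000-byte VARCHAR2 maximum):
ALTER TABLE t MODIFY (name VARCHAR2(50 CHAR));

-- Migrate a problematic datatype ahead of the conversion, e.g. LONG to CLOB:
ALTER TABLE t MODIFY (description CLOB);
```

Character-length semantics avoid recalculating byte sizes per column, but they do not help once a value approaches the datatype's absolute byte limit, in which case a datatype migration (e.g. to CLOB) is needed.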
3) Use a DMU and Streams setup to achieve a near-zero-downtime migration; see the page below for details: