This discussion is archived
11 Replies Latest reply: Oct 17, 2012 12:00 AM by 968286 RSS

Import/Export of Database with wrong character set?

968286 Newbie
Currently Being Moderated
Hi,

I have a problem with one of my databases. The database's encoding is ISO8859-1. However, the strings stored in it are encoded as Codepage 850 (DOS Latin-1). Don't ask why :-).
Now I need to convert the data to Windows-1252 and I got pretty stuck with that.
Since the database thinks that its contents are ISO8859-1, the dump file also contains information about that character set (both the old dump and the more recent data pump export file). With the old dump file format, it was possible to change the character set id (bytes 2 & 3 at the beginning of the dump file), but this character set id seems to be ignored now (I use Oracle 10.2).
I think, if I could change the character set information in the dump file to Codepage 850, then Oracle will automatically translate the wrong characters to the target character set during import.
Is there a way to override the character set information for the import process?

I tried to convert the wrong characters in a table with about 2 million rows. I tried to use the "translate"-method to replace 6 characters for each row in an UPDATE-statement. But that took more than 24 hours (on my local laptop machine), which is definitely too long because I need the translation to be done in about 20 hours and there is more than just one table in the database - each table with millions of rows.

Ideas, anyone? Your help will be highly appreciated.

Regards,
Lutz

Edited by: 965283 on 14.10.2012 23:54
  • 1. Re: Import/Export of Database with wrong character set?
    Sergiusz Wolicki (Oracle) Expert
    Currently Being Moderated
    If your database stores all data consistently in WE8PC850, then you can migrate the database to WE8PC850. Run the CSSCAN utility specifying FROMCHAR=WE8PC850 TOCHAR=WE8PC850 and if all data is reported as changeless, run ?/rdbms/admin/csalter.plb. This should correct the database character set to the real one.

    Note: the CSSCAN utility may not discover columns that are really stored in WE8MSWIN1252, so you need to to research carefully to make sure all data is consistently in WE8PC850.


    -- Sergiusz
  • 2. Re: Import/Export of Database with wrong character set?
    968286 Newbie
    Currently Being Moderated
    Hi Sergiusz,

    thank you for the quick reply. I will give that a try today!
  • 3. Re: Import/Export of Database with wrong character set?
    968286 Newbie
    Currently Being Moderated
    Alright, I tried two things:

    First, I scanned the database with FROMCHAR=WE8PC850 and TOCHAR=WE8PC850. Scan log gave no errors and told me that there were would be no data loss or truncation.
    My parameter file contained the following parameters:
    USERID=btest_lk2/****
    FULL=Y
    USER=btest_lk2
    TOCHAR=WE8PC850
    FROMCHAR=WE8PC850
    PROCESS=1
    FEEDBACK=25
    LOG=d:\scan_lk
    ARRAY=1024000
    Running csalter.plb afterward gave me "Checking or converting phrase did not finish successfully. No database (national) character set will be altered. CSALTER finished unsuccessfully".

    I thought that maybe that was due to the FROMCHAR and TOCHAR parameters being equal. So I tried another scan with TOCHAR=WE8ISO8859P1 and FROMCHAR=WE8PC850. But to no avail. CSALTER finished unsuccessfully there, too.
    What am I doing wrong?

    Regards,
    Lutz
  • 4. Re: Import/Export of Database with wrong character set?
    Sergiusz Wolicki (Oracle) Expert
    Currently Being Moderated
    FROMCHAR and TOCHAR should be the same. This is OK.

    Have you reviewed all scan.* files for any ORA- or other errors?


    -- Sergiusz
  • 5. Re: Import/Export of Database with wrong character set?
    Zoltan Kecskemethy Expert
    Currently Being Moderated
    Here are some docs - url you may find useful to solve your issue.
    [url http://download.oracle.com/docs/cd/B19306_01/server.102/b14225/ch11charsetmig.htm#sthref1476]Oracle® Database Globalization Support Guide 10g Release 2 (10.2) - Changing the Database Character Set of an Existing Database
    [url http://download.oracle.com/docs/cd/B19306_01/server.102/b14225/ch12scanner.htm#sthref1515]Oracle® Database Globalization Support Guide 10g Release 2 (10.2) - Character Set Scanner Utilities

    [url https://metalink.oracle.com/metalink/plsql/showdoc?db=NOT&id=225912.1]My Oracle Support: Changing the Database Character Set - a short overview. Note:225912.1
    [url https://metalink.oracle.com/metalink/plsql/showdoc?db=NOT&id=745809.1]My Oracle Support: Installing and configuring Csscan in 10g and 11g (Database Character Set Scanner) Doc ID: 745809.1
    [url https://metalink.oracle.com/metalink/plsql/showdoc?db=NOT&id=444701.1]My Oracle Support: Csscan output explained Doc ID: 444701.1
    [url https://metalink.oracle.com/metalink/plsql/showdoc?db=NOT&id=258904.1]My Oracle Support: Solving Convertible or Lossy data in Data Dictionary objects when changing the NLS_CHARACTERSET Doc ID: 258904.1
    [url https://metalink.oracle.com/metalink/plsql/showdoc?db=NOT&id=555823.1]My Oracle Support: Changing US7ASCII or WE8ISO8859P1 to WE8MSWIN1252 Doc ID: 555823.1
  • 6. Re: Import/Export of Database with wrong character set?
    968286 Newbie
    Currently Being Moderated
    I took a look at all csscan log files. There are no errors.

    scan.err looks like this:
    Database Scan Individual Exception Report

    [Database Scan Parameters]
    Parameter Value
    ================================
    CSSCAN Version v2.1
    Instance Name ora10win
    Database Version 10.2.0.1.0
    Scan type User tables
    User name btest_lk2
    Scan CHAR data? YES
    Database character set WE8ISO8859P1
    FROMCHAR WE8PC850
    TOCHAR WE8PC850
    Scan NCHAR data? NO
    Array fetch buffer size 1024000
    Number of processes 1
    Capture convertible data? NO
    ================================

    [Data Dictionary individual exceptions]

    [Application data individual exceptions]


    scan.txt looks like this:

    Database Scan Summary Report

    Time Started : 2012-10-16 07:46:31
    Time Completed: 2012-10-16 07:51:46

    Process ID Time Started Time Completed
    ================================
    1 2012-10-16 07:51:45 2012-10-16 07:51:45
    ================================

    [Database Size]

    Tablespace Used Free Total Expansion
    ================================
    SYSTEM 594,06M 1.455,94M 2.050,00M ,00K
    UNDOTBS1 24,44M 2.770,56M 2.795,00M ,00K
    SYSAUX 454,38M 15,63M 470,00M ,00K
    TEMP ,00K ,00K ,00K ,00K
    USERS 41.439,63M 7.327,88M 48.767,50M ,00K
    ================================
    Total 42.512,50M 11.570,00M 54.082,50M ,00K

    [Database Scan Parameters]

    Parameter Value
    ================================
    CSSCAN Version v2.1
    Instance Name ora10win
    Database Version 10.2.0.1.0
    Scan type User tables
    User name btest_lk2
    Scan CHAR data? YES
    Database character set WE8ISO8859P1
    FROMCHAR WE8PC850
    TOCHAR WE8PC850
    Scan NCHAR data? NO
    Array fetch buffer size 1024000
    Number of processes 1
    Capture convertible data? NO
    ================================

    [Scan Summary]

    All character type application data remain the same in the new character set

    [Data Dictionary Conversion Summary]

    Datatype Changeless Convertible Truncation Lossy
    ================================
    VARCHAR2 0 0 0 0
    CHAR 0 0 0 0
    LONG 0 0 0 0
    CLOB 0 0 0 0
    VARRAY 0 0 0 0
    ================================
    Total 0 0 0 0
    Total in percentage 0,000% 0,000% 0,000% 0,000%


    [Application Data Conversion Summary]

    Datatype Changeless Convertible Truncation Lossy
    ================================
    VARCHAR2 2.429 0 0 0
    CHAR 347 0 0 0
    LONG 0 0 0 0
    CLOB 0 0 0 0
    VARRAY 0 0 0 0
    ================================
    Total 2.776 0 0 0
    Total in percentage 100,000% 0,000% 0,000% 0,000%

    [Distribution of Convertible, Truncated and Lossy Data by Table]

    USER.TABLE Convertible Truncation Lossy
    ================================
    ================================

    [Distribution of Convertible, Truncated and Lossy Data by Column]

    USER.TABLE|COLUMN Convertible Truncation Lossy
    ================================
    ================================

    [Indexes to be Rebuilt]

    USER.INDEX on USER.TABLE(COLUMN)
    ================================
    ================================

    Edited by: 965283 on 15.10.2012 23:29
  • 7. Re: Import/Export of Database with wrong character set?
    Zoltan Kecskemethy Expert
    Currently Being Moderated
    I recommend to rerun csscan connecting "/ as sysdba"
    e.g.
    csscan \"sys/<syspassword>@<TNSalias> as sysdba\" FULL=Y FROMCHAR=WE8PC850 TOCHAR=WE8PC850 LOG=WE8PC850check CAPTURE=Y ARRAY=1000000 PROCESS=2
    BTW this is oracle recommendation too. see e.g referred 555823.1 doc...
  • 8. Re: Import/Export of Database with wrong character set?
    Sergiusz Wolicki (Oracle) Expert
    Currently Being Moderated
    ## I recommend to rerun csscan connecting "/ as sysdba"

    Me too :)

    Note that the report says "Scan type: User tables". It should say "Scan type: Full database". This is because you specified the parameter USER=, which overrode FULL=Y. Do not specify USER=.


    -- Sergiusz
  • 9. Re: Import/Export of Database with wrong character set?
    968286 Newbie
    Currently Being Moderated
    I think this might be the cause of my problem. I have a database for testing purposes set up on my machine. In this database (instance) I have about 20 schemas. So I used the parameter "USER" to scan only one schema.

    It is not possible to change the character set for only one schema in this database, right? At least not with CSALTER?
  • 10. Re: Import/Export of Database with wrong character set?
    Sergiusz Wolicki (Oracle) Expert
    Currently Being Moderated
    No, it is not. An Oracle database has one database character set declaration (NLS_CHARACTERSET) and one national character set declaration (NLS_NCHAR_CHARACTERSET). These settings are per database, not per schema.

    -- Sergiusz
  • 11. Re: Import/Export of Database with wrong character set?
    968286 Newbie
    Currently Being Moderated
    Thank you for your help! I guess I will have to look for other ways to convert the wrong characters, then.

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points