11 Replies Latest reply: Oct 17, 2012 2:00 AM by 968286 RSS

    Import/Export of Database with wrong character set?

    968286
      Hi,

      I have a problem with one of my databases. The database's encoding is ISO8859-1. However, the strings stored in it are encoded as Codepage 850 (DOS Latin-1). Don't ask why :-).
      Now I need to convert the data to Windows-1252 and I got pretty stuck with that.
      Since the database thinks that its contents are ISO8859-1, the dump file also contains information about that character set (both the old dump and the more recent data pump export file). With the old dump file format, it was possible to change the character set id (bytes 2 & 3 at the beginning of the dump file), but this character set id seems to be ignored now (I use Oracle 10.2).
      I think, if I could change the character set information in the dump file to Codepage 850, then Oracle will automatically translate the wrong characters to the target character set during import.
      Is there a way to override the character set information for the import process?

      I tried to convert the wrong characters in a table with about 2 million rows. I tried to use the "translate"-method to replace 6 characters for each row in an UPDATE-statement. But that took more than 24 hours (on my local laptop machine), which is definitely too long because I need the translation to be done in about 20 hours and there is more than just one table in the database - each table with millions of rows.

      Ideas, anyone? Your help will be highly appreciated.

      Regards,
      Lutz

      Edited by: 965283 on 14.10.2012 23:54
        • 1. Re: Import/Export of Database with wrong character set?
          Sergiusz Wolicki-Oracle
          If your database stores all data consistently in WE8PC850, then you can migrate the database to WE8PC850. Run the CSSCAN utility specifying FROMCHAR=WE8PC850 TOCHAR=WE8PC850 and if all data is reported as changeless, run ?/rdbms/admin/csalter.plb. This should correct the database character set to the real one.

          Note: the CSSCAN utility may not discover columns that are really stored in WE8MSWIN1252, so you need to to research carefully to make sure all data is consistently in WE8PC850.


          -- Sergiusz
          • 2. Re: Import/Export of Database with wrong character set?
            968286
            Hi Sergiusz,

            thank you for the quick reply. I will give that a try today!
            • 3. Re: Import/Export of Database with wrong character set?
              968286
              Alright, I tried two things:

              First, I scanned the database with FROMCHAR=WE8PC850 and TOCHAR=WE8PC850. Scan log gave no errors and told me that there were would be no data loss or truncation.
              My parameter file contained the following parameters:
              USERID=btest_lk2/****
              FULL=Y
              USER=btest_lk2
              TOCHAR=WE8PC850
              FROMCHAR=WE8PC850
              PROCESS=1
              FEEDBACK=25
              LOG=d:\scan_lk
              ARRAY=1024000
              Running csalter.plb afterward gave me "Checking or converting phrase did not finish successfully. No database (national) character set will be altered. CSALTER finished unsuccessfully".

              I thought that maybe that was due to the FROMCHAR and TOCHAR parameters being equal. So I tried another scan with TOCHAR=WE8ISO8859P1 and FROMCHAR=WE8PC850. But to no avail. CSALTER finished unsuccessfully there, too.
              What am I doing wrong?

              Regards,
              Lutz
              • 4. Re: Import/Export of Database with wrong character set?
                Sergiusz Wolicki-Oracle
                FROMCHAR and TOCHAR should be the same. This is OK.

                Have you reviewed all scan.* files for any ORA- or other errors?


                -- Sergiusz
                • 5. Re: Import/Export of Database with wrong character set?
                  Zoltan Kecskemethy
                  Here are some docs - url you may find useful to solve your issue.
                  [url http://download.oracle.com/docs/cd/B19306_01/server.102/b14225/ch11charsetmig.htm#sthref1476]Oracle® Database Globalization Support Guide 10g Release 2 (10.2) - Changing the Database Character Set of an Existing Database
                  [url http://download.oracle.com/docs/cd/B19306_01/server.102/b14225/ch12scanner.htm#sthref1515]Oracle® Database Globalization Support Guide 10g Release 2 (10.2) - Character Set Scanner Utilities

                  [url https://metalink.oracle.com/metalink/plsql/showdoc?db=NOT&id=225912.1]My Oracle Support: Changing the Database Character Set - a short overview. Note:225912.1
                  [url https://metalink.oracle.com/metalink/plsql/showdoc?db=NOT&id=745809.1]My Oracle Support: Installing and configuring Csscan in 10g and 11g (Database Character Set Scanner) Doc ID: 745809.1
                  [url https://metalink.oracle.com/metalink/plsql/showdoc?db=NOT&id=444701.1]My Oracle Support: Csscan output explained Doc ID: 444701.1
                  [url https://metalink.oracle.com/metalink/plsql/showdoc?db=NOT&id=258904.1]My Oracle Support: Solving Convertible or Lossy data in Data Dictionary objects when changing the NLS_CHARACTERSET Doc ID: 258904.1
                  [url https://metalink.oracle.com/metalink/plsql/showdoc?db=NOT&id=555823.1]My Oracle Support: Changing US7ASCII or WE8ISO8859P1 to WE8MSWIN1252 Doc ID: 555823.1
                  • 6. Re: Import/Export of Database with wrong character set?
                    968286
                    I took a look at all csscan log files. There are no errors.

                    scan.err looks like this:
                    Database Scan Individual Exception Report

                    [Database Scan Parameters]
                    Parameter Value
                    ================================
                    CSSCAN Version v2.1
                    Instance Name ora10win
                    Database Version 10.2.0.1.0
                    Scan type User tables
                    User name btest_lk2
                    Scan CHAR data? YES
                    Database character set WE8ISO8859P1
                    FROMCHAR WE8PC850
                    TOCHAR WE8PC850
                    Scan NCHAR data? NO
                    Array fetch buffer size 1024000
                    Number of processes 1
                    Capture convertible data? NO
                    ================================

                    [Data Dictionary individual exceptions]

                    [Application data individual exceptions]


                    scan.txt looks like this:

                    Database Scan Summary Report

                    Time Started : 2012-10-16 07:46:31
                    Time Completed: 2012-10-16 07:51:46

                    Process ID Time Started Time Completed
                    ================================
                    1 2012-10-16 07:51:45 2012-10-16 07:51:45
                    ================================

                    [Database Size]

                    Tablespace Used Free Total Expansion
                    ================================
                    SYSTEM 594,06M 1.455,94M 2.050,00M ,00K
                    UNDOTBS1 24,44M 2.770,56M 2.795,00M ,00K
                    SYSAUX 454,38M 15,63M 470,00M ,00K
                    TEMP ,00K ,00K ,00K ,00K
                    USERS 41.439,63M 7.327,88M 48.767,50M ,00K
                    ================================
                    Total 42.512,50M 11.570,00M 54.082,50M ,00K

                    [Database Scan Parameters]

                    Parameter Value
                    ================================
                    CSSCAN Version v2.1
                    Instance Name ora10win
                    Database Version 10.2.0.1.0
                    Scan type User tables
                    User name btest_lk2
                    Scan CHAR data? YES
                    Database character set WE8ISO8859P1
                    FROMCHAR WE8PC850
                    TOCHAR WE8PC850
                    Scan NCHAR data? NO
                    Array fetch buffer size 1024000
                    Number of processes 1
                    Capture convertible data? NO
                    ================================

                    [Scan Summary]

                    All character type application data remain the same in the new character set

                    [Data Dictionary Conversion Summary]

                    Datatype Changeless Convertible Truncation Lossy
                    ================================
                    VARCHAR2 0 0 0 0
                    CHAR 0 0 0 0
                    LONG 0 0 0 0
                    CLOB 0 0 0 0
                    VARRAY 0 0 0 0
                    ================================
                    Total 0 0 0 0
                    Total in percentage 0,000% 0,000% 0,000% 0,000%


                    [Application Data Conversion Summary]

                    Datatype Changeless Convertible Truncation Lossy
                    ================================
                    VARCHAR2 2.429 0 0 0
                    CHAR 347 0 0 0
                    LONG 0 0 0 0
                    CLOB 0 0 0 0
                    VARRAY 0 0 0 0
                    ================================
                    Total 2.776 0 0 0
                    Total in percentage 100,000% 0,000% 0,000% 0,000%

                    [Distribution of Convertible, Truncated and Lossy Data by Table]

                    USER.TABLE Convertible Truncation Lossy
                    ================================
                    ================================

                    [Distribution of Convertible, Truncated and Lossy Data by Column]

                    USER.TABLE|COLUMN Convertible Truncation Lossy
                    ================================
                    ================================

                    [Indexes to be Rebuilt]

                    USER.INDEX on USER.TABLE(COLUMN)
                    ================================
                    ================================

                    Edited by: 965283 on 15.10.2012 23:29
                    • 7. Re: Import/Export of Database with wrong character set?
                      Zoltan Kecskemethy
                      I recommend to rerun csscan connecting "/ as sysdba"
                      e.g.
                      csscan \"sys/<syspassword>@<TNSalias> as sysdba\" FULL=Y FROMCHAR=WE8PC850 TOCHAR=WE8PC850 LOG=WE8PC850check CAPTURE=Y ARRAY=1000000 PROCESS=2
                      BTW this is oracle recommendation too. see e.g referred 555823.1 doc...
                      • 8. Re: Import/Export of Database with wrong character set?
                        Sergiusz Wolicki-Oracle
                        ## I recommend to rerun csscan connecting "/ as sysdba"

                        Me too :)

                        Note that the report says "Scan type: User tables". It should say "Scan type: Full database". This is because you specified the parameter USER=, which overrode FULL=Y. Do not specify USER=.


                        -- Sergiusz
                        • 9. Re: Import/Export of Database with wrong character set?
                          968286
                          I think this might be the cause of my problem. I have a database for testing purposes set up on my machine. In this database (instance) I have about 20 schemas. So I used the parameter "USER" to scan only one schema.

                          It is not possible to change the character set for only one schema in this database, right? At least not with CSALTER?
                          • 10. Re: Import/Export of Database with wrong character set?
                            Sergiusz Wolicki-Oracle
                            No, it is not. An Oracle database has one database character set declaration (NLS_CHARACTERSET) and one national character set declaration (NLS_NCHAR_CHARACTERSET). These settings are per database, not per schema.

                            -- Sergiusz
                            • 11. Re: Import/Export of Database with wrong character set?
                              968286
                              Thank you for your help! I guess I will have to look for other ways to convert the wrong characters, then.