6 Replies Latest reply: Mar 6, 2013 4:19 PM by Sergiusz Wolicki-Oracle RSS

    Character replacement before character set change

    bynummike
      We are preparing to change our character set from us7ascii to al32utf8 on 11.2.0.3 HP-UX. The majority of the "lossy" rows identified contain accented and umlauted vowels. I have tried to change the accented and umlauted characters to their "plain" counterparts (eg, accented a to 'a'). I've tried several variations of the replace and translate functions with no succcess. One of the statements I was "sure" that it would work follows - what am I doing wrong?
      update tbl1 set desc_fld = regexp_replace(desc_fld, '[=a=]', 'a');
        • 1. Re: Character replacement before character set change
          Frank Kulash
          Hi,
          bynummike wrote:
          We are preparing to change our character set from us7ascii to al32utf8 on 11.2.0.3 HP-UX. The majority of the "lossy" rows identified contain accented and umlauted vowels. I have tried to change the accented and umlauted characters to their "plain" counterparts (eg, accented a to 'a'). I've tried several variations of the replace and translate functions with no succcess. One of the statements I was "sure" that it would work follows - what am I doing wrong?
          update tbl1 set desc_fld = regexp_replace(desc_fld, '[=a=]', 'a');
          Try:
          update tbl1 
          set desc_fld = regexp_replace (desc_fld,  '[[=a=]]', 'a');
          <tt> [=a=] </tt> means "any variation on the letter a" only when it's inside square-brackets; otherwise, it looks like you want the set of characters consisting of '=', 'a' and '='.
          • 2. Re: Character replacement before character set change
            bynummike
            Thanks for the reply - I was missing the "extra" brackets but I tried:

            SQL> update je_ddet set debit_memo = regexp_replace(debit_memo, '[[=a=]]', 'a');

            237273 rows updated.

            SQL> commit;

            Commit complete.

            SQL> select debit_memo from je_ddet where rowid = 'AACL6QACjAAAqgrAAZ';

            DEBIT_MEMO
            ----------------------------------------------------------------------
            recl trsf à BCV 08.04.09

            It still has the accented a.
            • 3. Re: Character replacement before character set change
              chris227
              As far as i understand you have data corruption in your ascii7_DB. You stored 2-byte characters in a 1-byte characterset. This seems to work as the characters were dsiplayed correctyl, but know you are facing some problems.
              I cant reproduce the issue it because i dont have a ascci7-db running.
              So i might be that those nls-based functions dont really work on the corrupted data.
              I would examine the dumps of the data (dump(debit_memo, 1010)) in both environments.
              Perhaps something on raw-level is possible.
              • 4. Re: Character replacement before character set change
                Frank Kulash
                Hi,

                Since your current character set is US7ASCII, it may not recognize characters with codes above 127 as members of pre-defined set, such as [=a=].

                Is it possible to type all of the accented characters? If so, you could do something like:
                ...
                SET  debit_memo = TRANSLATE ( debit_memo
                                   , 'âäàåáé...'
                                   , 'aaaaae...'
                                      )
                Don't test on your full table with 200,000 rows. Create a small test table and experiment on it, first.
                • 5. Re: Character replacement before character set change
                  bynummike
                  THANK YOU!!! The translate with the literals worked as long as I run it from the client side. I have to copy & paste the extended characters in sql developer/worksheet but that works! I've been running in circles with this all day. Thanks again.
                  • 6. Re: Character replacement before character set change
                    Sergiusz Wolicki-Oracle
                    Why replacing? Why not preserving the characters? Use the Database Migration Assistant for Unicode to perform the migration and to solve (cleanse) convertibility issues. You can use the Assumed Database Character Set property to declare the actual character set of your data (usually WE8MSWIN1252 for Western European languages) to use for migration.

                    See: http://www.oracle.com/technetwork/products/globalization/dmu/overview/index.html


                    -- Sergiusz