9 Replies Latest reply on May 10, 2011 5:25 PM by orafad

    Data Entry Error

    TheHades0210
      Aloha!

      I'm doing some data translation (language translation, to be exact) in one of our databases. During translation I noticed inconsistencies in the data I enter: I'm translating English to Vietnamese, and some Vietnamese characters show up as "?" once entered, which suggests the database cannot store those characters. I tried setting the character set to "VN8MSWIN1258", the character set for Vietnamese, but without success; the same inconsistency remains.

      Hope someone can share some enlightenment regarding this issue.

      With thanks,

      Hades
        • 1. Re: Data Entry Error
          Sergiusz Wolicki-Oracle
          Hope someone can share some enlightenment regarding this issue.

          Sure we can. But we need many more details. You have provided no information except that you work with Vietnamese. You have not said a single word about the application that you use to enter and query the translation, and this is usually the most interesting part of the whole picture. It is not even clear what you meant by "I've tried to configure the character set". Where? How?


          -- Sergiusz
          • 2. Re: Data Entry Error
            orafad
            Please provide some details: How are you inputting data? What is the source of the data? What characters? The definition of relevant table columns?

            Win-1258 may need some special consideration regarding its use of combining diacritics for Vietnamese letters. If you have a Unicode source, i.e. precomposed characters, then they may need to be split up (decomposed) first to fit the destination database character set. In such a case, a Unicode database (AL32UTF8) might be a better option, if it is possible to create a new database or convert the existing one.
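
The decomposition step mentioned above can be sketched in Python, whose standard `cp1258` codec corresponds to Windows-1258. This is only an illustration under stated assumptions, not how any Oracle client actually converts data: the helper name `encode_cp1258` is hypothetical, and the sample is the string quoted later in this thread, written with escapes. A character is decomposed only when the code page has no precomposed form for it, because a full NFD decomposition would produce a combining horn that Windows-1258 does not contain.

```python
import unicodedata


def encode_cp1258(text: str) -> bytes:
    """Encode text as Windows-1258, decomposing a character only when needed.

    Hypothetical helper for illustration; not part of any Oracle tool.
    """
    out = bytearray()
    for ch in unicodedata.normalize("NFC", text):
        try:
            out += ch.encode("cp1258")
        except UnicodeEncodeError:
            # One-step canonical decomposition, e.g. U+1EE4 -> "0055 0323"
            # (letter U followed by COMBINING DOT BELOW).
            mapping = unicodedata.decomposition(ch)
            if not mapping or mapping.startswith("<"):
                raise  # no canonical decomposition, so it cannot be stored
            parts = "".join(chr(int(cp, 16)) for cp in mapping.split())
            out += parts.encode("cp1258")
    return bytes(out)


# The sample string from the thread, written with escapes.
sample = "\u00c1P D\u1ee4NG V\u00c0O T\u01af - MASK"
encoded = encode_cp1258(sample)
# Decoding and recomposing (NFC) gives back the original text.
round_trip = unicodedata.normalize("NFC", encoded.decode("cp1258"))
print(encoded)
print(round_trip == sample)
```

With this approach the dotted U is stored as two bytes (the base letter plus a combining mark), which is why column sizes and comparisons can behave differently than with precomposed Unicode data.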

            Edited by: orafad on May 5, 2011 1:26 AM
            • 3. Re: Data Entry Error
              TheHades0210
              Aloha!

              I am using Benthic Software and SQL Developer to access and enter data into the database. Regarding the character set, the database I created uses the Vietnamese character set (VN8MSWIN1258). I also tried changing NLS_LANG on the client machine to "AMERICAN_AMERICA.VN8MSWIN1258", but still without success. For example, if I enter "ÁP DỤNG VÀO TƯ - MASK", the database stores (and even the editor app itself displays) "ÁP D?NG VÀO TÝ - MASK". One of the characters shows as "?".

              Database parameters:

              NLS_LANGUAGE               AMERICAN
              NLS_TERRITORY              AMERICA
              NLS_CURRENCY               $
              NLS_ISO_CURRENCY           AMERICA
              NLS_NUMERIC_CHARACTERS     .,
              NLS_CHARACTERSET           VN8MSWIN1258
              NLS_CALENDAR               GREGORIAN
              NLS_DATE_FORMAT            DD-MON-RR
              NLS_DATE_LANGUAGE          AMERICAN
              NLS_SORT                   BINARY
              NLS_TIME_FORMAT            HH.MI.SSXFF AM
              NLS_TIMESTAMP_FORMAT       DD-MON-RR HH.MI.SSXFF AM
              NLS_TIME_TZ_FORMAT         HH.MI.SSXFF AM TZR
              NLS_TIMESTAMP_TZ_FORMAT    DD-MON-RR HH.MI.SSXFF AM TZR
              NLS_DUAL_CURRENCY          $
              NLS_COMP                   BINARY
              NLS_LENGTH_SEMANTICS       BYTE
              NLS_NCHAR_CONV_EXCP        FALSE
              NLS_NCHAR_CHARACTERSET     AL16UTF16
              NLS_RDBMS_VERSION          11.2.0.1.0
              • 4. Re: Data Entry Error
                orafad
                TheHades0210 wrote:
                I am using Benthic Software
                What tool, specifically, including version string?
                to access and input data to the database.
                By what method do you input data? Plain insert? If so, can you provide a sample insert statement?
                With regards to character set, the database I've created is set to the Vietnamese charset (VN8MSWIN1258).
                Also I've tried to edit the NLS_LANG of the client machine to "AMERICAN_AMERICA.VN8MSWIN1258"
                For non-Unicode programs (tools) or environments sensitive to the system locale, you first need to set up the client OS properly, e.g. change the Windows Regional/Language settings so that ANSI code page (ACP) 1258, in this case, is used. Setting NLS_LANG alone is not enough, and it should be set only after the OS has been configured; changing the variable's value does not change the OS character set. Depending on the program used, this may or may not be relevant to your issue (i.e. NLS_LANG is not always used).

                but still no success. E.g. if I input "ÁP DỤNG VÀO TƯ - MASK" the database will write (or even in the editor app itself)
                I'm having this display of characters "ÁP D?NG VÀO TÝ - MASK". One of the characters is in "?".
                I can't say for sure why that happens, more info is needed.

                The conversion of Ư to Ý might indicate a difference in system locale, since byte 0xDD represents Ư in Win-1258 but Ý in Win-1252.
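
The code page theory above can be reproduced in a few lines of Python, using its standard `cp1258` and `cp1252` codecs. This is only a sketch of one possible explanation, not a confirmed diagnosis of the poster's setup: it assumes the text is converted to Windows-1258 with unmappable characters replaced, and that the resulting bytes are then displayed as Windows-1252.

```python
# The sample string from the thread, written with escapes (precomposed form).
sample = "\u00c1P D\u1ee4NG V\u00c0O T\u01af - MASK"

# Step 1: convert to Windows-1258. The precomposed dotted U (U+1EE4) has no
# slot in that code page, so it is replaced by a literal "?".
stored = sample.encode("cp1258", errors="replace")

# Step 2: interpret the same bytes as Windows-1252. Byte 0xDD is U-with-horn
# in Windows-1258 but Y-with-acute in Windows-1252.
shown = stored.decode("cp1252")

print(stored)
print(shown)
```

Under these assumptions the result is exactly the string reported in the question, with one "?" and one Ý.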

                It might be a problem with the font (e.g. a missing glyph). But the substitute glyph is usually something other than a question mark.

                Another possibility is that a Unicode-to-ANSI conversion is happening in the client/editor app, though that may not be part of the problem.
                How do you input/write these characters? (e.g. copy-paste from Word?)

                [Struck out: Try changing the system locale and see what happens next.] Maybe the suggestion to change the system locale was a bit hasty.

                Please run the following command to check code page (acp).
                C:\>reg query HKLM\System\CurrentControlSet\Control\Nls\Codepage /v acp
                Verify what is stored in the db for the characters entered above with:
                select your_column, dump(your_column, 1016)
                from your_table
                where suitable_condition_to_retrieve_row ...;
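
Once the query returns, the DUMP string can be decoded to see which bytes were really stored. The Python sketch below assumes the usual layout of DUMP output with format 1016 (type, length, character set name, then comma-separated hex bytes). The sample value is hypothetical, not output from the poster's database, and `parse_dump` and the character set mapping are illustrative names only.

```python
import re

# Oracle character set names mapped to Python codec names (illustrative subset).
ORACLE_TO_PYTHON = {
    "VN8MSWIN1258": "cp1258",
    "WE8MSWIN1252": "cp1252",
    "AL32UTF8": "utf-8",
}


def parse_dump(dump_text):
    """Split a DUMP(col, 1016) string into (character set name, raw bytes)."""
    match = re.fullmatch(
        r"Typ=(\d+) Len=(\d+) CharacterSet=(\w+): ([0-9a-f,]*)",
        dump_text.strip(),
    )
    if match is None:
        raise ValueError("unexpected DUMP layout: %r" % dump_text)
    hex_bytes = match.group(4)
    raw = bytes(int(b, 16) for b in hex_bytes.split(",")) if hex_bytes else b""
    return match.group(3), raw


# Hypothetical DUMP output for the first seven characters of the sample text.
hypothetical = "Typ=1 Len=7 CharacterSet=VN8MSWIN1258: c1,50,20,44,3f,4e,47"
charset, raw = parse_dump(hypothetical)
text = raw.decode(ORACLE_TO_PYTHON[charset])

print(charset, raw, text)
# A stored 0x3f means a literal "?" was written: the character was lost
# during conversion before storage, so it is not just a display problem.
print("literal '?' stored:", b"?" in raw)
```

If the dump instead shows the expected bytes for the Vietnamese letters, the data is intact and the problem lies in how the client displays it.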
                Edit:
                Tried to clarify some parts, relaxed default locale theory a bit, strike changing locale, added a few things to check.

                Edited by: orafad on May 5, 2011 11:18 PM

                Edited by: orafad on May 6, 2011 12:13 AM
                • 5. Re: Data Entry Error
                  Sergiusz Wolicki-Oracle
                  Do you see the problem both in Benthic Software and in SQL Developer? Note, NLS_LANG is irrelevant for SQL Developer. Also, if the Benthic software (Golden 6?) has the option "Use Unicode Oracle Client Calls" checked on the login options window, then NLS_LANG should be irrelevant for that software as well. Make sure this option is checked.

                  Which version of Windows do you use? Changing the system locale may not be necessary but it depends on the application.


                  -- Sergiusz
                  • 6. Re: Data Entry Error
                    orafad
                    S. Wolicki, Oracle wrote:
                    Note, NLS_LANG is irrelevant for SQL Developer.
                    Is that true in any circumstance?

                    My current version of SQL Developer has a tickbox to use JDBC OCI (Preferences, Database > Advanced). If the box is ticked, it searches the system paths for any Oracle client libraries. Now, say the first libraries found in the path are pre-10g (ocijdbc, oci, and so on). Is NLS_LANG still ignored in that (perhaps very rare) case?
                    • 7. Re: Data Entry Error
                      Sergiusz Wolicki-Oracle
                      JDBC 10g OCI will not work with libraries older than 9.2 because they do not have the OCIEnvNlsCreate() function. But in the context of the NLS_LANG, even 9.2 libraries will work the same as 10g/11g libraries, because the behavior is dictated by the driver code, not the OCI code. There is an undocumented property to revert JDBC 11g OCI behavior to 9.2 (where NLS_LANG value is used to select one of the US7ASCII, WE8ISO8859P1, AL24UTFFSS, or UTF8 as the network character set), but it should not generally be needed in any correctly configured system.

                      Therefore, NLS_LANG is ignored as far as the character set part is concerned. Also, the language and territory parts are overridden by an explicit ALTER SESSION statement sent by the driver based on the Java locale. The only thing that matters is whether NLS_LANG is defined at all. If it is, OCI will read the NLS_COMP and NLS_LENGTH_SEMANTICS environment variable settings and send them to the server. Otherwise, these settings will not be sent and they will default from the instance parameters. I consider this behavior wrong and it may change in the future. Therefore, NLS_COMP and NLS_LENGTH_SEMANTICS should be set by applications explicitly.


                      -- Sergiusz
                      • 8. Re: Data Entry Error
                        TheHades0210
                        Aloha!

                        The solution, so that the database accepts these special characters, is to change the database character set during creation from the default (WE8MSWIN1252) to Unicode (AL32UTF8). After the database has been created, also run "alter system set NLS_LENGTH_SEMANTICS = 'CHAR' scope=both;". You also need to consider which client to use; in my case I used SQL Developer.
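
Why length semantics matter after moving to AL32UTF8 can be seen by counting characters and UTF-8 bytes for the sample text. The Python sketch below only illustrates the arithmetic; the helper `fits` and the column size of 21 are hypothetical, and it ignores Oracle's separate absolute byte limit on column size.

```python
# The sample string from the thread, written with escapes.
sample = "\u00c1P D\u1ee4NG V\u00c0O T\u01af - MASK"

char_count = len(sample)                  # code points
byte_count = len(sample.encode("utf-8"))  # bytes as stored in AL32UTF8


def fits(text, limit, semantics):
    """Check text against a column limit counted in BYTE or CHAR units."""
    if semantics == "CHAR":
        return len(text) <= limit
    return len(text.encode("utf-8")) <= limit


print(char_count, byte_count)
print(fits(sample, 21, "BYTE"))  # limit counted in bytes
print(fits(sample, 21, "CHAR"))  # limit counted in characters
```

The accented letters take two or three bytes each in UTF-8, so a column sized for 21 bytes is too small for this 21-character string, while one sized for 21 characters is not.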

                        Thanks for all your time and effort spent on my queries.

                        Hades
                        • 9. Re: Data Entry Error
                          orafad
                          Yes, as suggested earlier. More importantly, AL32UTF8 is the preferred database character set according to, e.g., the Globalization Support Guide, "Choosing a Character Set" (http://download.oracle.com/docs/cd/E11882_01/server.112/e10729/ch2charset.htm#NLSPG178), and the Installation Guides of recent releases.

                          About length semantics, there's an important point in {thread:id=2192933}.

                          Edited by: orafad on May 10, 2011 7:24 PM