This discussion is archived
9 Replies Latest reply: Oct 12, 2012 4:01 PM by orafad RSS

sqlplus on windows : Select cyrillic ( multi byte ) data. Strange output

sbroeckmann Newbie
Currently Being Moderated
hi,

we have an unicode database with cyrillic content and I wanted to select some data in (windows) sqlplus (sounds easy...).

These are the environments

Oracle DB
select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for HPUX: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
sqlplus
SQL*Plus: Release 11.2.0.1.0 Production on Thu Oct 11 11:20:47 2012

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

show all :

appinfo is OFF and set to "SQL*Plus"
arraysize 5000
autocommit OFF
autoprint OFF
autorecovery OFF
autotrace OFF
blockterminator "." (hex 2e)
btitle OFF and is the first few characters of the next SELECT statement
cmdsep OFF
colsep " "
compatibility version NATIVE
concat "." (hex 2e)
copycommit 0
COPYTYPECHECK is ON
define "&" (hex 26)
describe DEPTH 1 LINENUM OFF INDENT ON
echo ON
editfile "D:\tmp\afiedt.buf"
embedded OFF
escape OFF
escchar OFF
exitcommit ON
FEEDBACK ON for 1 or more rows
flagger OFF
flush OFF
heading ON
headsep "|" (hex 7c)
instance "local"
linesize 4000
lno 4
loboffset 1
logsource ""
long 999999
longchunksize 80
markup HTML OFF HEAD "<style type='text/css'> body {font:10pt Arial,Helvetica,sans-serif; color:black; background:White;} p {font:10pt Arial,Helvetica,sans-serif; color:black; background:White;} table,tr,td {font:10pt Arial,Helvetica,sans-serif; color:Black; background:#f7f7e7; padding:0px 0px 0px 0px; margin:0px 0px 0px 0px;} th {font:bold 10pt Arial,Helvetica,sans-serif; color:#336699; background:#cccc99; padding:0px 0px 0px 0px;} h1 {font:16pt Arial,Helvetica,Geneva,sans-serif; color:#336699; background-color:White; border-bottom:1px solid #cccc99; margin-top:0pt; margin-bottom:0pt; padding:0px 0px 0px 0px;-
} h2 {font:bold 10pt Arial,Helvetica,Geneva,sans-serif; color:#336699; background-color:White; margin-top:4pt; margin-bottom:0pt;} a {font:9pt Arial,Helvetica,sans-serif; color:#663300; background:#ffffff; margin-top:0pt; margin-bottom:0pt; vertical-align:top;}</style><title>SQL*Plus Report</title>" BODY "" TABLE "border='1' width='90%' align='center' summary='Script output'" SPOOL OFF ENTMAP ON PREFORMAT OFF
newpage 1
null ""
numformat ""
numwidth 10
pagesize 999
PAUSE is OFF
pno 1
recsep WRAP
recsepchar " " (hex 20)
release 1002000400
repfooter OFF and is NULL
repheader OFF and is NULL
securedcol is OFF
serveroutput ON SIZE UNLIMITED FORMAT WORD_WRAPPED
shiftinout INVISIBLE
showmode OFF
spool OFF
sqlblanklines OFF
sqlcase MIXED
sqlcode 0
sqlcontinue "> "
sqlnumber OFF
sqlpluscompatibility 11.2.0
sqlprefix "#" (hex 23)
sqlprompt "> "
sqlterminator ";" (hex 3b)
suffix "sql"
tab OFF
termout ON
timing OFF
trimout ON
trimspool ON
ttitle OFF and is the first few characters of the next SELECT statement
underline "-" (hex 2d)
USER is "nyuser"
verify ON
wrap : lines will be wrapped
errorlogging is OFF
Windows settings (cmd)
chcp 65001
set NLS_LANG=.AL32UTF8
Create table and insert one row:
create table forumexample (longdescr varchar2(128));

insert into forumexample (longdescr) values (unistr( '\0420\0443\0441\0441\043A\0438\0439' ));
now lets select some data

First shot (naive) :
select longdescr from forumexample;

LONGDESCR
------------------------
��усский

1 row selected.
hmm...2 goofy "question mark somethings"....the rest looks ok...

Suspect: There is no unicode in the DB....something went wrong ...

Second shot (selfcritical) :
select longdescr,dump(longdescr) dump_longdescr from forumexample;

LONGDESCR                DUMP_LONGDESCR
------------------------ -------------------------------------------------------------------------
��усский                  Typ=1 Len=14: 208,160,209,131,209,129,209,129,208,186,208,184,208,185

1 row selected.
so the first char is 208 160 which is something like a big "P" (like in Painful experience :-)):
U+0420     Р     208 160
looks good (=it is unicode)

Note (see output of "second shot") :

The output of the second column is shifted on digit to the right...?!
LONGDESCR                DUMP_LONGDESCR
------------------------ -------------------------------------------------------------------------
��усский                  Typ=1 Len=14: 208,160,209,131,209,129,209,129,208,186,208,184,208,185
                          x
                          x
                          x
Third shot (desparation):

Put a blank in front of the first column :
select ' '||longdescr longdescr,dump(longdescr) dump_longdescr from forumexample;

LONGDESCR                DUMP_LONGDESCR
------------------------ -----------------------------------------------------------------------
 Русский                 Typ=1 Len=14: 208,160,209,131,209,129,209,129,208,186,208,184,208,185

1 row selected.
=> Now the 208-161 character is shown...without problems....( so the windows font is able to show it !!)

Fourth shot (curiosity):

Select cyrillic attribute twice:

select longdescr longdescr_1,longdescr longdescr_2,dump(longdescr) dump_longdescr from forumexample;

LONGDESCR_1              LONGDESCR_2              DUMP_LONGDESCR
------------------------ ------------------------ ------------------------------------------------------------------------
��усский                  Русский                  Typ=1 Len=14: 208,160,209,131,209,129,209,129,208,186,208,184,208,185

1 row selected.

Select 2 rows:

select longdescr longdescr_1,dump(longdescr) dump_longdescr from forumexample union all
select longdescr longdescr_1,dump(longdescr) dump_longdescr from forumexample ;

LONGDESCR_1              DUMP_LONGDESCR
------------------------ -------------------------------------------------------------------------
��усский                  Typ=1 Len=14: 208,160,209,131,209,129,209,129,208,186,208,184,208,185
��усский                  Typ=1 Len=14: 208,160,209,131,209,129,209,129,208,186,208,184,208,185

2 rows selected.

etc....
Conclusion :

I have did some more testing and the basic issue seems to be :

If the first char in the first attribute of a select is a multi byte character : sqlplus puts in 2 "funny signs" and shifts the complete row,
one digit to the right...

Additional remarks:

- Suspect is, that one of sqlplus parameter should be changed to change this behaviour, but I could not identify which one...

- Creating the table differently
create table forumexample2 (longdescr nvarchar2(128));
(with NVARCHAR2) has not different result...


Maybe one of you has a clue about this one...

thx in advance

Bye

Stefan

Edited by: sbroeckmann on 11.10.2012 06:21
  • 1. Re: sqlplus on windows : Select cyrillic ( multi byte ) data. Strange output
    Paul Horth Expert
    Currently Being Moderated
    Sql*Plus on windows does not have any real capability to show international characters.

    Use SQL Developer or other more modern client.


    In SQL Developer:
    select *
    from forumexample;
    
    Русский
  • 2. Re: sqlplus on windows : Select cyrillic ( multi byte ) data. Strange output
    Sven W. Guru
    Currently Being Moderated
    >
    The behaviour as described sounds like a bug to me. You might want to consider raising it in oracle support.
    Windows settings (cmd)
    chcp 65001
    set NLS_LANG=.AL32UTF8
    This is somewhat unusual. AL32utf8 is the oracle database character set. The client character set should be differnt.
    More like "WE8ISO8859P5" (windows).

    Furthermore you are using a 11g sql plus client on a 10g database. I'm not aware of any problems with that, however for such strange problems you might consider testing it with an older client.
    Or set the compatibility to a differnt value
    SET SQLPLUSCOMPAT 10.2
    Edited by: Sven W. on Oct 11, 2012 6:10 PM
  • 3. Re: sqlplus on windows : Select cyrillic ( multi byte ) data. Strange output
    orafad Oracle ACE
    Currently Being Moderated
    sbroeckmann wrote:
    we have an unicode database with cyrillic content and I wanted to select some data in (windows) sqlplus (sounds easy...).
    Use Oracle SQL Developer instead.

    No-install unzip (Downloads tab):
    http://www.oracle.com/technetwork/developer-tools/sql-developer/overview/index.html


    >
    chcp 65001
    set NLS_LANG=.AL32UTF8
    Why?


    Use SQL Developer as suggested, or try:
    C:\Users\>chcp 1251
    Active code page: 1251
    
    C:\Users\>set nls_lang=.cl8mswin1251
    
    C:\Users\>sqlplus test
    
    SQL*Plus: Release 11.2.0.2.0 Production on Fri Oct 12 00:40:16 2012
    
    Copyright (c) 1982, 2010, Oracle.  All rights reserved.
    
    Enter password:
    ERROR:
    ORA-28002: the password will expire within 7 days
    
    
    
    Connected to:
    Oracle Database 11g Express Edition Release 11.2.0.2.0 - Production
    
    SQL> create table forumexample (longdescr varchar2(128));
    
    Table created.
    
    SQL>
    SQL> insert into forumexample (longdescr) values (unistr( '\0420\0443\0441\0441\043A\0438\0439' ));
    
    1 row created.
    
    SQL> commit;
    
    Commit complete.
    
    SQL> select longdescr,dump(longdescr,1016) dump from forumexample;
    
    LONGDESCR
    --------------------------------------------------------------------------------
    DUMP
    --------------------------------------------------------------------------------
    Русский
    Typ=1 Len=14 CharacterSet=AL32UTF8: d0,a0,d1,83,d1,81,d1,81,d0,ba,d0,b8,d0,b9
    (Cyrillic letter ER code units are 0xd0 a0, ... and so on)

    Edited by: orafad on Oct 12, 2012 12:47 AM
  • 4. Re: sqlplus on windows : Select cyrillic ( multi byte ) data. Strange output
    sbroeckmann Newbie
    Currently Being Moderated
    Sql*Plus on windows does not have any real capability to show international characters. 
    ...but only the first character does not have this capability...?!
    Use SQL Developer or other more modern client.
    I will have a look at it...
  • 5. Re: sqlplus on windows : Select cyrillic ( multi byte ) data. Strange output
    sbroeckmann Newbie
    Currently Being Moderated
    Sven W. wrote
    This is somewhat unusual. AL32utf8 is the oracle database character set. The client character set should be differnt.
    More like "WE8ISO8859P5" (windows).
    I found these settings somewhere as "the settings for utf-8 support of sqlplus"...and as you can see : It works for everything except the first char....

    I'm not aware of any problems with that, however for such strange problems you might consider testing it with an older client.
    UNfortunately I cannot use different (Oracle 10) installation, as I am limited to the the standard environment, that we get from our DBAs....

    Or set the compatibility to a differnt value
    SET SQLPLUSCOMPAT 10.2
    tried but didn't work.
  • 6. Re: sqlplus on windows : Select cyrillic ( multi byte ) data. Strange output
    sbroeckmann Newbie
    Currently Being Moderated
    orafad wrote: 
    Use Oracle SQL Developer instead.
    I will have a look at it...
    chcp 65001
    set NLS_LANG=.AL32UTF8
    
    
    Why?
    I found somewhere, that these are the settings for unicode support of sqlplus and except for the first char, this seems to be "ok"...
    Use SQL Developer as suggested, or try:
    
    C:\Users\>chcp 1251
    Active code page: 1251
     
    C:\Users\>set nls_lang=.cl8mswin1251
    etc...
    That is true, this works for cyrillic, but as we also other multi byte data in the DB, I wanted to have one "unicode" - environment,where I can handle everything without switching to different environments...

    But OK, seems to be more tricky than expected... so, I will setup various settings...

    Thx to all of you
  • 7. Re: sqlplus on windows : Select cyrillic ( multi byte ) data. Strange output
    orafad Oracle ACE
    Currently Being Moderated
    sbroeckmann wrote:
    That is true, this works for cyrillic, but as we also other multi byte data in the DB, I wanted to have one "unicode" - environment,where I can handle everything without switching to different environments...
    If you want to work with character data of multiple languages, instead of trying to "hack" sqlplus and use it in a manner it was not designed to be used,
    please use a Unicode enabled client such as - mentioned now for the fourth time or so ta da! -- Oracle SQL Developer.

    Edited by: orafad on Oct 12, 2012 10:39 AM
  • 8. Re: sqlplus on windows : Select cyrillic ( multi byte ) data. Strange output
    Sergiusz Wolicki (Oracle) Expert
    Currently Being Moderated
    This is actually new to me that Command Prompt can be configured for UTF-8 (code page 65001). I recall having problems with this configuration some time ago. I wonder if it changed just in recent Windows versions. It does seem to work on my Win7 x64.

    The SQL*Plus behavior is not reasonable here, even though we can argue the setup is unexpected/unsupported (no official policy on this exists, as far as I can tell). If you spool the content, it will look correct. There must be some issue with the terminal output. It is worth investigating.


    -- Sergiusz
  • 9. Re: sqlplus on windows : Select cyrillic ( multi byte ) data. Strange output
    orafad Oracle ACE
    Currently Being Moderated
    Maybe I'm pushing away from sqlplus a bit too hard and part of this may be related to the old "sqlplusw" gui version on Windows...

    Anyway, in your testcase, if you spool output and open in e.g notepad, characters should be shown correctly.

    And as you said, if you add a (ascii or single byte) character as the first character, console output works.
    SQL> select longdescr,dump(longdescr,1016) dump from forumexample;
    
    LONGDESCR
    -----------------------------------------------------------------------------
    DUMP
    -----------------------------------------------------------------------------
    ��усский
    Typ=1 Len=14 CharacterSet=AL32UTF8: d0,a0,d1,83,d1,81,d1,81,d0,ba,d0,b8,d0,b9
    
    
    SQL> select '.'||longdescr dump from forumexample;
    
    DUMP
    -----------------------------------------------------------------------------
    .Русский
    As stated in the following thread, there seem to be a quirk in the Win C runtime that relates to noted behaviour:
    how to use sqlplus with utf8 on windows command line

Legend

  • Correct Answers - 10 points
  • Helpful Answers - 5 points