10 Replies Latest reply: Aug 27, 2012 9:53 PM by Sergiusz Wolicki-Oracle RSS

    ORA-22831 when using dbms_lob.substr  with a clob in an AL32UTF8 db

    user12175624
      Hi,

      we found some strange behaviour and errors when testing with clobs an AL32UTF8.

      setup:
      database version: Oracle 11.2.0.3
      characterset: AL32UTF8

      create a table consisting of a varchar2 and a clob, insert a string containing a four-byte character and use substring to get back each character, one after the other.

      it works as expected with the varchar2 column but gives unexpected results and even ORA-22831 with the clob column.

      As documented, oralce stores data in clobs in ucs-2 if the database character set is a multibyte charset - so far so good. But you should still be able to use substring (dbms_lob) as expected, shouldn't you?


      here the testcase:


      --create a table with a varchar2 column and a clob
      create table clob_test (varchar_test varchar2(100), clob_test clob);

      --insert a string, consisting of 'A' at both ends and a 4-byte character in the middle
      insert into clob_test (varchar_test,clob_test) values(unistr('A\D801\DC4FA'),unistr('A\D801\DC4FA'));

      --clob has lenght4 (why is this?), varchar 3 (as expected)
      select length(varchar_test),length(clob_test) from clob_test;
      LENGTH(VARCHAR_TEST) LENGTH(CLOB_TEST)
      -------------------- -----------------
      3 4


      --dump data of varchar column in hex format: ok
      select dump(varchar_test,1016) from clob_test;
      DUMP(VARCHAR_TEST,1016)
      --------------------------------------------------------------------------------
      Typ=1 Len=6 CharacterSet=AL32UTF8: 41,f0,90,91,8f,41

      --dump data of clob column in hex format: ok, identical to varchar2 column
      select dump(dbms_lob.substr(clob_test,40,1),1016) from clob_test;
      DUMP(DBMS_LOB.SUBSTR(CLOB_TEST,40,1),1016)
      --------------------------------------------------------------------------------
      Typ=1 Len=6 CharacterSet=AL32UTF8: 41,f0,90,91,8f,41


      --select first character of varchar column: works, shows 'A'
      select substr(varchar_test,1,1) from clob_test;
      SUBS
      ----
      A

      --select first character of varchar column: works, but character cannot be displayed
      select substr(varchar_test,2,1) from clob_test;
      SUBS
      ----
      ?

      --select first character of varchar column: works, shows 'A'
      select substr(varchar_test,3,1) from clob_test;
      SUBS
      ----
      A


      --select first character of clob column: works, shows 'A'
      select dbms_lob.substr(clob_test,1,1) from clob_test;
      DBMS_LOB.SUBSTR(CLOB_TEST,1,1)
      --------------------------------------------------------------------------------
      A

      --select second character of clob column: shows the replacement character?!
      select dbms_lob.substr(clob_test,1,2) from clob_test;
      DBMS_LOB.SUBSTR(CLOB_TEST,1,2)
      --------------------------------------------------------------------------------
      ?

      --select third character of clob column: ORA-22831 is thrown
      select dbms_lob.substr(clob_test,1,3) from clob_test;
      select dbms_lob.substr(clob_test,1,3) from clob_test
      *
      FEHLER in Zeile 1:
      ORA-22831: Offset oder Offset+Menge fällt nicht auf vollständiges Zeichen
      ORA-06512: in "SYS.DBMS_LOB", Zeile 1092
      ORA-06512: in Zeile 1

      --select fourth character of clob column: works, shows 'A'
      select dbms_lob.substr(clob_test,1,4) from clob_test;

      DBMS_LOB.SUBSTR(CLOB_TEST,1,4)
      --------------------------------------------------------------------------------
      A


      This must be bug - am I wrong? If not - how can one know at what position in a clob such a two-byte character is??


      Thanks for any response and feedback,

      Werner