Forum Stats

  • 3,769,694 Users
  • 2,253,008 Discussions
  • 7,875,154 Comments

Discussions

Character encoding in Oracle Service Bus

3376635
3376635 Member Posts: 9
edited Jan 6, 2020 6:59AM in SOA & Process Management

Hi, I am trying to get the first 2000 characters of a specific field in a form to use it for creating a new form via ODI-OSB. In ODI, the string was successfully trimmed down to VARCHAR2 (2000) and was stored in the database. However, when OSB tries to create the form it returns an error saying that the field is more than 2000 characters. Currently we are using utf-8 in the XQuery. Any suggestions?

Tagged:
3376635
«1

Answers

  • Martien van den Akker
    Martien van den Akker Member Posts: 2,776 Bronze Crown
    edited Nov 8, 2019 6:46AM

    Do you mean that you try to save a string of 2000 chars from OSB into this varchar column in the database?

    If so, then check the database table, using SQL Developer. Is the column defined as varchar2(2000 char) or varchar2(2000 byte)?

    UTF-8 encodes the string with upto 4 bytes per character (https://en.wikipedia.org/wiki/UTF-8 ). So a UTF-8 string of 2000 characters may need more than 2000 bytes.

    If your database has UTF-8 as character set and you define the column as varchar2(2000 char), it should support the variable encoding.

    Regards,
    Martien

  • 3376635
    3376635 Member Posts: 9
    edited Nov 11, 2019 1:24AM

    Thanks Martien. I have consulted with my colleague on checking the column field defined however we cannot check if it is varchar2(2000 char) or varchar2(2000 byte). Only varchar2(2000) was displayed. Do you have a suggestion on how to check this one?

    Also for the character set in the database, we have AL32UTF8 and AL16UTF16.

    Thanks.characterset.png

  • Martien van den Akker
    Martien van den Akker Member Posts: 2,776 Bronze Crown
    edited Nov 11, 2019 2:55AM

    Hi,

    You can see it nicely in SQLDeveloper:

    pastedImage_0.png

    I've looked for how to see it in the database. It's in the user_tab_cols:

    select table_name, column_name, data_type, data_length, char_usedfrom user_tab_colswhere table_name = 'MY_TABLE'

    It's the char_used column, it could by 'B' for 'BYTES' and 'C' for 'CHAR'. (See also https://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_2094.htm )

    pastedImage_10.png

    The default is BYTES, so it's likely that at your column bytes is used. But it can be changed to CHAR quite easily.

    Kind regards,

    Martien

    3376635
  • 3376635
    3376635 Member Posts: 9
    edited Nov 11, 2019 10:31PM

    Hi Martien,

    Please see details below on the char_used -

    OSB

    pastedImage_2.png

    Source table
    pastedImage_4.png

    Thank you.

  • Martien van den Akker
    Martien van den Akker Member Posts: 2,776 Bronze Crown
    edited Nov 12, 2019 2:59AM

    Hi,

    They're both Bytes apparently. That means that if there are no transformations/enrichments the source value should fit into the target. But it does not take into account the variance in number of bytes per character. So theoretically, it still could be that you have 2000 characters where the average has more than 2 bytes, it could not fit in 4000 bytes.

    You also talk about first 2000 chars of a string and this column is 4000.

    Could you show a bit more or your code (Xquery), source value/target value and most of all: the exact exception?

    Kind regards,
    Martien

  • 3376635
    3376635 Member Posts: 9
    edited Nov 13, 2019 12:23AM

    Hi Martien,

    For now, we can provide this information and will update you soon.

    The screenshot below is from a form and was stored in one of the source table.

    pastedImage_0.png

    When this was processed by OSB to create another form, the log table returned an error which states that comments field should not be more than 2000 characters.

    As per checking on the BLOB, this happened on the comments field -

    pastedImage_1.png

    Thank you.

  • Martien van den Akker
    Martien van den Akker Member Posts: 2,776 Bronze Crown
    edited Nov 13, 2019 2:16AM

    Hi,

    I see three times 'TEST25' + 'END' in the comments field (with a chariage return ascii(13) for each line). In your comments message I see it 6 times 'TEST25', but no 'END'. These aren't the complete text I presume. Is the text in reality exactly the same, or is there some transformation code that duplicates the lines unintendly?

    What are the exact string lengths?

    Kind regards,
    Martien

  • 3376635
    3376635 Member Posts: 9
    edited Nov 25, 2019 12:59AM

    Hi @Martien van den Akker, apologies on the very late response. Please see comments below -

    Q: I see three times 'TEST25' + 'END' in the comments field (with a chariage return ascii(13) for each line). In your comments message I see it 6 times 'TEST25', but no 'END'. These aren't the complete text I presume.

    A: For the first screenshot which has an END, this is from the BP form and the second screenshot is from the log table wherein the nextline character has been transformed causing the bottom/last characters in the form being cut.

    Below is the sample line of codes in the XQuery, please note I have removed other variables and renamed others also -

    xquery version "1.0" encoding "utf-8";

    (:: OracleAnnotationVersion "1.0" ::)

    declare namespace ns2="NS2";

    (:: import schema at "../Schema/SCHEMA2.xsd" ::)

    declare namespace ns1="NS1";

    (:: import schema at "../Schema/SCHEMA1.xsd" ::)

    declare variable $variableCreateFORM as element() (:: schema-element(ns1:VOWCCollection) ::) external;

    declare function local:func($variableCreateFORM as element() (:: schema-element(ns1:VOWCCollection) ::)) as element() (:: schema-element(ns2:createBPRecord) ::) {

                <ns2:createBPRecord>

                <shortname>{dvmtr:lookup("shortname")}</shortname>

                <authcode>{dvmtr:lookup("authcode")}</authcode>

                <BPXML>

                <List_wrapper>

                <_bp>

                   <status>Completed</status>

                   <Comments>{fn:data($variableCreateFORM/ns1:UpdateUnifierVowc/ns1:comments)}</Comments>

                </_bp></List_wrapper>

                </BPXML>

            </ns2:createBPRecord>

    };

    local:func($variableCreateFORM)

    Let me know if you have other questions.

  • Martien van den Akker
    Martien van den Akker Member Posts: 2,776 Bronze Crown
    edited Nov 25, 2019 3:56AM

    Unfortunately it is quite unclear what happens here.
    What I would do is try to intercept (log) the message before and after the xquery. And then try to test the xquery from JDeveloper.
    Could you post those messages before and after?

    By the way, is 'variableCreateFORM' or the child element in fact  UpdateUnifierVowc a repetative/iterative element? Are there more occurences of that element in the input message with comments? In that case the transform to text by fn:data() might not be what you expect.

    Kind regards,
    Martien

  • 3376635
    3376635 Member Posts: 9
    edited Nov 26, 2019 1:17AM

    Hi Martien, I have attached sample log file for this issue. Thank you.