This discussion is archived
12 Replies Latest reply: Jun 29, 2007 1:34 PM by 348187 RSS

PreparedStatement, VARCHAR"(4000) and umlauts

484912 Newbie
Currently Being Moderated
Can someone explain this to me, please?

I have a varchar2 column of size 4000. Just big enough so I don't have to LOB it.I can insert a String of size 4000 no problem. But, if my String contains special characters, such as umlauts, I get an error that the String size is too big for the column. Surprises me, as my column type is char rather than byte. However, if I don't use prepared Statements the same String will insert OK.

Further, if I reduce the column size to 1, I can insert 'ä' using prepared statements.

Simple test code is below. Can someone explain to me what is going on? Thanks.



create table test_table (
name varchar2(4000 char);
)

public static void main(String[] args) {
final String url = "jdbc:oracle:thin:@host:port:DB";
final String username = "me";
final String password = "password";

String veryBigString = "";
for (int i = 0; i < 4000; i++) veryBigString = veryBigString + "a";
System.out.println("vbs length: " + veryBigString.length());

Connection connection = null;
try {
Class.forName("oracle.jdbc.driver.OracleDriver");
connection = DriverManager.getConnection(url, username, password);
Statement s = connection.createStatement();

int inserted = s.executeUpdate("insert into test_table values ('" + veryBigString + "')");
System.out.println("1: inserted " + inserted);

inserted = s.executeUpdate("insert into test_table values ('" + ("ä" + veryBigString.substring(1)) + "')");
System.out.println("2: inserted " + inserted);

PreparedStatement ps = connection.prepareStatement("insert into test_table values (?)");

ps.setString(1, veryBigString);
inserted = ps.executeUpdate();
System.out.println("3: inserted " + inserted);

ps.setString(1, "ä" + veryBigString.substring(1));
inserted = ps.executeUpdate();
System.out.println("4: inserted " + inserted);
}
catch (Exception e) {
e.printStackTrace();
}
finally {
if (connection != null) try {
connection.close();
}
catch (SQLException e) {
e.printStackTrace();
}
}
}
  • 1. Re: PreparedStatement, VARCHAR"(4000) and umlauts
    castorp Explorer
    Currently Being Moderated
    You don't tell us what characterset your database has. I assume it is UTF8 (or another multibyte character set) by the description of your "problem".

    Most probably you have defined your VARCHAR column with "byte" semantics and thus VARCHAR2(4000) means it can store 4000 bytes. With an encoding like UTF8, a single character can use up to 3 bytes, thus the actual max. length in characters for such a column it approx. 1333.

    For more details read the SQL manual:
    http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14200/sql_elements001.htm#i45694

    Btw: this has nothing to do with Java or JDBC
  • 2. Re: PreparedStatement, VARCHAR"(4000) and umlauts
    484912 Newbie
    Currently Being Moderated
    Thanks for your reply.

    I'm aware of the difference between byte and char semantics. I definitely use char. With a sigh, and grumbling "I'm not some newbie", I quickly read the link you sent me and noticed "The maximum length of VARCHAR2 data is 4000 bytes". Aha! So, even if I define my column as varchar2(4000 char), I can still only fit up to 4K bytes in there? But that still doesn't explain what my sample program shows.

    With an "insert" executed using a JDBC Statement I can insert 4000 characters (including umlauted characters) into a VARCHAR2(4000 char) column. With a PreparedStatement I cannot. (Using Oracle's ojdbc14.jar, by the way.)

    So I definitely think my question has something to do with JDBC. It's easy enough to go around this problem, but there's obviously an important point I've missed some where.
  • 3. Re: PreparedStatement, VARCHAR"(4000) and umlauts
    484912 Newbie
    Currently Being Moderated
    Oh yeah, the characterset is 'WE8ISO8859P9'.
  • 4. Re: PreparedStatement, VARCHAR"(4000) and umlauts
    484912 Newbie
    Currently Being Moderated
    I suspected there may be characterset conversion issues with the PreparedStatement parameter.

    So, the relevant piece of documentation is in "JDBC and Globalization Support", Chapter 18, "Advnaced Topics", "SQL CHAR Data Size Restrictions with the Thin Driver".

    Basically, our DB encoding means that Strings are first converted to UTF-8 before being sent to the DB. The driver checks that the UTF8 length does not exceed the max possible column size.

    It performs the UTF8 conversion because our DB character set is not WE8ISO8859P1, rather WE8ISO8859P9. At the moment that doesn't say much to me.
  • 5. Re: PreparedStatement, VARCHAR"(4000) and umlauts
    585295 Newbie
    Currently Being Moderated
    I'm noticing the exact same behaviour with PreparedStatement. We're using OJDBC 1.4 and ran into an issue where our one processes that is in development started throwing errors when importing certain records. When I debugged the issue I noticed it was failing on inserting strings with length greater then 1333 into a VARCHAR2(4000) column.

    If I convert this code from a PreparedStatement to just a normal Statement and generate the SQl dynamically in code, these large string values are inserted/updated without a problem. The issue only occurs when using PreparedStatement. I would use the Statement method, but it more then doubles the processing time of this data. It seems that when a high-value character is in a string, PreparedStatement treats each character in that string as 3-bytes.

    It's funny because the string that was failing is only 1544 characters so this has to be the case. I also verified this by trimming the string down in characters. Trimming from 1544 characters to 1333 characters works, however trimming to 1334 characters fails. Taking the string and sending the first 1300 characters works and so does sending characters from position 1300 to 1544 so it's not data related. I even tried to change from using the setString() to setCharacterStream() and that fails also.

    Any update on this issue would be highly appreciated.
  • 6. Re: PreparedStatement, VARCHAR"(4000) and umlauts
    348187 Newbie
    Currently Being Moderated
    I am unable to reproduce your problem

    I took your code, I had to change the connect string but otherwise it is the same

    I was able to load all four rows with no problem

    My character set is US7ASCII for the database and for the client

    I suspect you might be having a character-set-conversion issue, you know, where the client character set does not match the db character set so it has to do a translation, sometimes it is called 'lossy' if the client char set is not a complete subset of the db char set

    So - I changed the NLS_LANG env var on the client to WEISO... (the one you mentioned in your mail), but it still worked

    So, then I changed the driver - I was using classes12.jar so I changed it to ojdbc14.jar, it still worked

    I am using 10.2, is that what you guys are using?

    One thing you might want to try - do a
    SELECT DUMP(NAME) FROM TEST_TABLE

    It will give you the byte-by-byte ascii info that was loaded into the table, that might give you some insight

    Maybe load just 1000 characters and see what it is putting in the db for the As as opposed to the umlaut-As

    If you look at the data I loaded, it put in the umlaut-A as a single byte
    (but - that makes sense for the US7ASCII character set)

    (I trimmed the output)

    SQL> select dump(name) from test_table;

    DUMP(NAME)
    --------------------------------------------------------------------------------
    Typ=1 Len=4000: 228,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,...
    Typ=1 Len=4000: 97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,9...
    Typ=1 Len=4000: 97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,9...
    Typ=1 Len=4000: 228,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,97,...


    See below, char 228 is the umlaut-A

    SQL> select chr(228) from dual;

    C
    -
    ä

    Here is the code, it is almost identical, I just put the user/pass in the connect string


    import java.sql.*;
    import java.math.*;
    import java.io.*;
    import oracle.jdbc.*;
    import java.util.*; // for the Properties

    class Varchar
    {
    public static void main(String[] args) {
    final String url = "jdbc:oracle:thin:user/pass@host:1521:db";

    String veryBigString = "";
    for (int i = 0; i < 4000; i++) veryBigString = veryBigString + "a";
    System.out.println("vbs length: " + veryBigString.length());

    Connection connection = null;
    try {
    Class.forName("oracle.jdbc.driver.OracleDriver");
    connection = DriverManager.getConnection(url);
    Statement s = connection.createStatement();

    int inserted = s.executeUpdate("insert into test_table values ('" + veryBigString + "')");
    System.out.println("1: inserted " + inserted);

    inserted = s.executeUpdate("insert into test_table values ('" + ("ä" + veryBigString.substring(1)) + "')");
    System.out.println("2: inserted " + inserted);

    PreparedStatement ps = connection.prepareStatement("insert into test_table values (?)");

    ps.setString(1, veryBigString);
    inserted = ps.executeUpdate();
    System.out.println("3: inserted " + inserted);

    ps.setString(1, "ä" + veryBigString.substring(1));
    inserted = ps.executeUpdate();
    System.out.println("4: inserted " + inserted);
    }
    catch (Exception e) {
    e.printStackTrace();
    }
    finally {
    if (connection != null) try {
    connection.close();
    }
    catch (SQLException e) {
    e.printStackTrace();
    }
    }
    }
    }
  • 7. Re: PreparedStatement, VARCHAR"(4000) and umlauts
    585295 Newbie
    Currently Being Moderated
    I'm using OJDBC 1.4 for 10.1.2 however our database is 9.2.0.6. After debugging the issue more, it looks like my issue is slightly different then the orginal post. I'm suspecting there is an issue with OJDBC 10.1.2 talking to an older database and retrieving the DB Character set, so it goes into a failsafe mode and treats all characters as 3 bytes each. The reason we are using OJDBC for 10.1.2 is because it contains connection pooling, OJDBC for 9.2.0.x does not.
  • 8. Re: PreparedStatement, VARCHAR"(4000) and umlauts
    348187 Newbie
    Currently Being Moderated
    Is your code similar to the sample code that was posted?
    If not can you post some code that will reproduce the problem?
    Did you try with bind variables instead of building it all in one big string?
    Can you load the 1300 chars and do a select dump(col) from tab to see what oracle is putting in the table?
  • 9. Re: PreparedStatement, VARCHAR"(4000) and umlauts
    585295 Newbie
    Currently Being Moderated
    We are using Bind variables for the SQL. there are 7 bind variables. Our prepared statement looks like this:

    UPDATE <TABLENAME> SET <COL1> = ?, <COL2> = ?, <COL3> = ?, <COL4> = ?, <COL5> = ?, <COL6> = ? WHERE <COL7> = ?

    On the Oracle Database...

    <COL1> is defined NUMBER
    <COL2> is defined VARCHAR2(3)
    <COL3> is defined NUMBER
    <COL4> is defined VARCHAR2(1)
    <COL5> is defined VARCHAR2(1)
    <COL6> is defined VARCHAR2(4000)
    <COL7> is defined NUMBER

    pstmt.setString(1, NumVal);
    ...
    pstmt.setString(6, value);
    ...
    pstmt.executeUpdate();

    The other thing is we are updating this table through a DB LINK, the actual table is on a different Oracle DB server. Our DBA traced the session and the error is not comming from the database. COL6 is the one that is causing this issue. If the string (value) is greater then 1333 characters, it will generate an Exception. If I generate the SQL manually and use the Statement object to execute the SQL string, the update works fine.

    So I ended up doing a two step process, try to do the update using a PreparedStatement, if that generates an exception, I catch the Exception and do a rollback and then attempt to do the update using the Statement object.
  • 10. Re: PreparedStatement, VARCHAR"(4000) and umlauts
    348187 Newbie
    Currently Being Moderated
    That is wierd

    Have you looked at the data? Do you always put in 4000 chars? I am thinking that if you have a special character in your text but it is <= 1333 chars, your prepared statement won't fail but it will put some funky data in the database

    What is the character set of
    -the database you connect to
    -the database that has the table in it
    -the client

    can you load it with data from the prepared statement (a short string) with a special character in it and then do the 'select dump(col6) from <table>', see what the results are?
  • 11. Re: PreparedStatement, VARCHAR"(4000) and umlauts
    585295 Newbie
    Currently Being Moderated
    I did look at the data and it is actually all low characters, not one special character in the string. Our DB is using UTF8 and the machines this is happening on is my Windows XP Dev machine (XP Default) and our Red Hat server boxes so it doesn't seem to be client/server setting issue. Actually, I have yet hit a string over 2000 characters, however any string over 1333 characters fails so these strings are definitely being treated as 3 bytes per character (3 x 1333 = 3999), but only when using PreparedStatement. If using the Statement object, I can insert these strings without a problem.
  • 12. Re: PreparedStatement, VARCHAR"(4000) and umlauts
    348187 Newbie
    Currently Being Moderated
    I tried the test on a db with a AL16UTF16 character set

    This is the error I got, I put it below, is this the one you are getting?

    I only get it if I have the umlaut-A in the string

    I found a metalink note that might be relevant, did you look at this: 214868.1

    I get this with the 10.1 or the 10.2 classes12.jar

    Error:
    java.sql.SQLException: ORA-01461: can bind a LONG value only for insert into a LONG column
    at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:124)
    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:304)
    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:271)
    at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:625)
    at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:181)
    at oracle.jdbc.driver.T4CPreparedStatement.execute_for_rows(T4CPreparedStatement.java:784)
    at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1026)
    at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:2884)
    at oracle.jdbc.driver.OraclePreparedStatement.executeUpdate(OraclePreparedStatement.java:2956)
    at Varchar.main(Varchar.java:49)