3 Replies Latest reply: Sep 29, 2009 3:10 AM by 509850

JDBC Connection Reset when using many processes on 64 bit system

716218 Newbie
Hi,

We have had an annoying JDBC connection problem since we migrated our Java server to a 64 bit operating system. Here is our environment.

Database Machine:
Oracle 10g
Linux 32 Bit (but same problem on 64 Bit)

Application Servers Machine:
JDBC driver 11.1.0.6
SUN Java 1.6.0_06 64bit
Linux 64 bit (SLES 10 SP2)

We have 6 different Java server processes (all running the same code), each of which creates some connections to the same database (running on different hardware). All 6 Java server processes are started at the same time (via scripts).

Everything was fine until we migrated the application server machine from 32 bit Linux to 64 bit Linux. Since that day, about half (give or take one) of our application server processes can no longer connect to the database. The affected application server processes produce the following stack trace:

java.sql.SQLRecoverableException: I/O Exception: Connection reset
at oracle.jdbc.driver.SQLStateMapping.newSQLException(SQLStateMapping.java:281)
at oracle.jdbc.driver.DatabaseError.newSQLException(DatabaseError.java:118)
at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:224)
at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:296)
at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:611)
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:455)
at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:494)
at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:199)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:30)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:503)
at java.sql.DriverManager.getConnection(DriverManager.java:582)
at java.sql.DriverManager.getConnection(DriverManager.java:154)
at com.aaaa.utils.db.DbConnectionPool.<init>(DbConnectionPool.java:130)
...

It looks like a network problem with the system, but all other network traffic between the two machines works without problems.

- We use the thin JDBC driver (no OCI)
- No firewalls are active on either system
- Both systems are in the same subnet, connected to the same switch
- The DNS configuration on both systems is OK (forward and reverse)
- We've found the same problem on different application-server/database-server pairs with 64 bit application server hardware, but not all of our 64 bit server systems have this problem.
- When running the application server process and the database on the same system (connecting via localhost), the problem no longer appears.
- The same database machine connected from a 32 bit application server (with 6 different java processes starting at the same time) works without a problem.

We've tried a lot of things to isolate the problem - but with no success.

- Same problem with SUN Java 1.6.0_06 32 bit (on 64 bit Linux)
- Same problem with SUN Java 1.6.0_15 (32 and 64 bit)
- Played with some JDBC connection properties (oracle.jdbc.TcpNoDelay, oracle.jdbc.ReadTimeout, oracle.net.CONNECT_TIMEOUT, oracle.net.disableOob, oracle.jdbc.RetainV9LongBindBehavior, oracle.jdbc.StreamChunkSize) without a positive result.
- We've updated the Linux network driver
- We've changed to a completely different NIC
- We've tried another 64 bit Linux distribution
- We've increased the PROCESSES parameter in the init.ora
- We've tried the JDBC driver 11.1.0.6
- We've tried the _g version of the JDBC driver, but the debugging output simply tells us "Connection Reset" without a hint why.
- We've tried a more complex JDBC connect string:

"jdbc:oracle:thin:@(DESCRIPTION=" +
"(ADDRESS_LIST=" +
"(ADDRESS=(PROTOCOL=TCP)" +
"(HOST=host)" +
"(PORT=port)" +
")" +
")" +
"(CONNECT_DATA=" +
"(SERVICE_NAME=sid)" +
"(SERVER=DEDICATED)" +
")" +
")"
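
For reference, the concatenated descriptor above could be wrapped in a small helper like the following. This is only a minimal sketch: the class name, the method name, and the example host, port and service name are placeholders made up for illustration, not values from this thread.

```java
// Hypothetical helper; names and example values are illustrative only.
final class ConnectDescriptor {
    private ConnectDescriptor() {}

    /** Builds a thin-driver URL carrying a full connect descriptor. */
    static String thinUrl(String host, int port, String serviceName) {
        return "jdbc:oracle:thin:@(DESCRIPTION="
                + "(ADDRESS_LIST="
                + "(ADDRESS=(PROTOCOL=TCP)"
                + "(HOST=" + host + ")"
                + "(PORT=" + port + ")"
                + ")"   // closes ADDRESS
                + ")"   // closes ADDRESS_LIST
                + "(CONNECT_DATA="
                + "(SERVICE_NAME=" + serviceName + ")"
                + "(SERVER=DEDICATED)"
                + ")"   // closes CONNECT_DATA
                + ")";  // closes DESCRIPTION
    }

    public static void main(String[] args) {
        // Placeholder values; substitute the real listener host, port and service.
        System.out.println(thinUrl("dbhost.example.com", 1521, "orcl"));
    }
}
```

The resulting string is what gets passed to DriverManager.getConnection together with the user name and password.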

None of these things helped us isolate the problem.

When we start our application server processes with a long pause (>1 min) between each process start, the problem does not occur. When we start only one application server with the same number of connections as the 6 different application server processes, everything works fine.

We have absolutely no idea why
- this only occurs on 64 bit Linux
- it is independent of whether it's a 32 bit or 64 bit JVM
- it does not occur on all 64 bit application server machine / database machine pairs
- it never occurs on the same 64 bit app server hardware when using 32 bit Linux
- there is no problem when using the Oracle JDBC 10g driver (10.xxx) (but because of other issues, we need to use the JDBC 11g driver)

Does anybody have an idea what our problem is?

Thanks in advance,

greetings
  • 1. Re: JDBC Connection Reset when using many processes on 64 bit system
    Joe Weinstein Expert
    Does this happen for every connection request, or if you throttle the
    connection requests to a couple per second, does it work better?
    The only thing I'm hinting at, is that the DBMS sometimes can't
    handle lots of near-simultaneous connection requests. If it happens
    with each request, every time, then nevermind me....
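
    A minimal sketch of what throttling could look like on the client side, assuming the failures are transient and a later attempt can succeed. The class name, method name, attempt count and delays are all made up for illustration and are not taken from the application discussed in this thread.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: retries an action, sleeping a little longer after each failure.
final class ThrottledConnect {
    private ThrottledConnect() {}

    static <T> T withRetry(Callable<T> attempt, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.call();
            } catch (Exception e) {
                last = e;
                if (i < maxAttempts) {
                    // Linear backoff: wait delayMillis, then 2 * delayMillis, and so on.
                    Thread.sleep(delayMillis * i);
                }
            }
        }
        // Every attempt failed; rethrow the most recent failure.
        throw last;
    }
}
```

    The Callable would wrap the actual DriverManager.getConnection(url, user, password) call. If the error still shows up on every attempt, throttling is probably not the answer.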
  • 2. Re: JDBC Connection Reset when using many processes on 64 bit system
    717782 Newbie
    I was recently struggling with this exact same problem. I opened a ticket with Oracle and this is what they told me.

    java.security.SecureRandom is a standard API provided by Sun. Among the various methods offered by this class is void nextBytes(byte[]), which is used for generating random bytes. Oracle 11g JDBC drivers use this API to generate a random number during login. Users on Linux have been encountering SQLException("Io exception: Connection reset").

    The problem is twofold:

    1. The JVM tries to list all the files in /tmp (or the alternate tmp directory set by -Djava.io.tmpdir) when SecureRandom.nextBytes(byte[]) is invoked. If the number of files is large, the method takes a long time to respond and hence causes the server to time out.

    2. The method void nextBytes(byte[]) uses /dev/random on Linux, and on some machines which lack random number generating hardware the operation slows down to the extent of bringing the whole login process to a halt. Ultimately the user encounters SQLException("Io exception: Connection reset").

    Users upgrading to 11g can encounter this issue if the underlying OS is Linux running on faulty hardware.

    Cause
    The cause of this has not yet been determined exactly. It could either be a problem in your hardware or the fact that for some reason the software cannot read from /dev/random.


    Solution
    Change the setup for your application so that the following parameter is added to the java command:

    -Djava.security.egd=file:/dev/../dev/urandom



    We made this change in our java.security file and it has gotten rid of the error.
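
    A quick way to check whether a given machine might be affected is to time a single SecureRandom.nextBytes call in a fresh JVM, with and without the parameter above. This is only a rough sketch: the class and method names are made up, and a slow result would be consistent with the entropy explanation without proving it.

```java
import java.security.SecureRandom;

// Hypothetical diagnostic; run it once per JVM start, since seeding happens on first use.
final class SecureRandomTiming {
    private SecureRandomTiming() {}

    /** Returns the wall-clock time in milliseconds to create and use a SecureRandom. */
    static long timeNextBytesMillis(int numBytes) {
        long start = System.nanoTime();
        SecureRandom random = new SecureRandom();
        byte[] buffer = new byte[numBytes];
        random.nextBytes(buffer); // may wait here if the entropy source is slow
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        // Prints null when the property was not set on the command line.
        System.out.println("java.security.egd = " + System.getProperty("java.security.egd"));
        System.out.println("nextBytes took " + timeNextBytesMillis(16) + " ms");
    }
}
```

    Starting all processes at once and comparing the timings against a single process should show whether they are competing for the same entropy source.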
  • 3. Re: JDBC Connection Reset when using many processes on 64 bit system
    509850 Newbie
    Thanks for the random hint.

    For me, this worked fine:

    System.setProperty("java.security.egd", "file:///dev/urandom");  // the 3 '/' are important to make it a URL


    Also I tried:

    rm /dev/random
    ln -s /dev/urandom /dev/random

    ...which worked, but the change is lost after every reboot.
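
    To see whether the entropy pool is actually running low while the processes start, the kernel's estimate can be read from /proc/sys/kernel/random/entropy_avail. A minimal sketch, assuming a Linux system that exposes this file; the class and method names are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical helper that reads the kernel's entropy estimate on Linux.
final class EntropyCheck {
    static final String ENTROPY_FILE = "/proc/sys/kernel/random/entropy_avail";

    private EntropyCheck() {}

    /** Returns the entropy estimate in bits, or -1 if the file cannot be read or parsed. */
    static int availableEntropyBits(String path) {
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            String line = reader.readLine();
            return line == null ? -1 : Integer.parseInt(line.trim());
        } catch (IOException | NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("entropy_avail = " + availableEntropyBits(ENTROPY_FILE));
    }
}
```

    A value that drops towards zero while the server processes are logging in would fit the /dev/random explanation.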

    Here is also some good reading:
    Link: [http://www.usn-it.de/index.php/2009/02/20/oracle-11g-jdbc-driver-hangs-blocked-by-devrandom-entropy-pool-empty/]
