3 Replies Latest reply: Sep 29, 2009 5:10 AM by 509850

    JDBC Connection Reset when using many processes on 64 bit system

    716218

      Hi,

      we've had an annoying JDBC connection problem since we migrated our Java server to a 64-bit operating system. Here is our environment.

      Database Machine:
      Oracle 10g
      Linux 32 Bit (but same problem on 64 Bit)

      Application Servers Machine:
      JDBC driver 11.1.0.6
      SUN Java 1.6.0_06 64bit
      Linux 64 bit (SLES 10 SP2)

      We have 6 different Java server processes (all running the same code), each of which creates some connections to the same database (running on different hardware). All 6 Java server processes start at the same time (via scripts).

      Everything was fine until we migrated the application server machine from 32-bit Linux to 64-bit Linux. Since that day, about half (give or take one) of our application server processes can no longer connect to the database. The affected application server processes produce the following stack trace:

      java.sql.SQLRecoverableException: I/O Exception: Connection reset
      at oracle.jdbc.driver.SQLStateMapping.newSQLException(SQLStateMapping.java:281)
      at oracle.jdbc.driver.DatabaseError.newSQLException(DatabaseError.java:118)
      at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:224)
      at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:296)
      at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:611)
      at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:455)
      at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:494)
      at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:199)
      at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:30)
      at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:503)
      at java.sql.DriverManager.getConnection(DriverManager.java:582)
      at java.sql.DriverManager.getConnection(DriverManager.java:154)
      at com.aaaa.utils.db.DbConnectionPool.<init>(DbConnectionPool.java:130)
      ...

      It looks like a network problem with the system, but all other network traffic between the two machines works without problems.

      - We use the thin JDBC driver (no OCI)
      - No firewalls are active on either system
      - Both systems are in the same subnet connected to the same switch
      - The DNS configuration on both systems is OK (forward and reverse)
      - We've found the same problem on different application-server/database-server pairs with 64 bit application server hardware - but not all of our 64 bit server systems have this problem.
      - When running the application server process and the database on the same system (connecting via localhost), the problem no longer appears.
      - The same database machine connected from a 32 bit application server (with 6 different java processes starting at the same time) works without a problem.

      We've tried a lot of things to isolate the problem - but with no success.

      - Same problem with SUN Java 1.6.0_06 32 bit (on 64 bit Linux)
      - Same problem with SUN Java 1.6.0_15 (32 and 64 bit)
      - Played with some JDBC connection properties (oracle.jdbc.TcpNoDelay, oracle.jdbc.ReadTimeout, oracle.net.CONNECT_TIMEOUT, oracle.net.disableOob, oracle.jdbc.RetainV9LongBindBehavior, oracle.jdbc.StreamChunkSize) without a positive result.
      - We've updated the Linux network driver
      - We've switched to a completely different NIC
      - We've tried another 64-bit Linux distribution
      - We've increased the PROCESSES parameter in the init.ora
      - We've tried the JDBC driver 11.1.0.6
      - We've tried the _g version of the JDBC driver, but the debugging output simply tells us "Connection Reset" without a hint why.
      - We've tried a more complex JDBC connect string:

      "jdbc:oracle:thin:@(DESCRIPTION=" +
      "(ADDRESS_LIST=" +
      "(ADDRESS=(PROTOCOL=TCP)" +
      "(HOST=host)" +
      "(PORT=port)" +
      ")" +
      ")" +
      "(CONNECT_DATA=" +
      "(SERVICE_NAME=sid)" +
      "(SERVER=DEDICATED)" +
      ")" +
      ")"

      None of these things helped us isolate the problem.

      When we start our application server processes with a long pause (>1 min) between each process start, the problem does not occur. When we start only one application server with the same number of connections as the 6 different application server processes, everything works fine.
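For reference, the long connect descriptor quoted above can be assembled by a small helper. This is only a minimal sketch: the class name `ThinUrl` and the sample host, port, and service name are hypothetical and not taken from the original application.

```java
/** Builds a thin-driver URL in the long descriptor form quoted above. */
final class ThinUrl {
    private ThinUrl() {}

    // host, port and serviceName are placeholders supplied by the caller.
    static String describe(String host, int port, String serviceName) {
        return "jdbc:oracle:thin:@(DESCRIPTION="
                + "(ADDRESS_LIST="
                + "(ADDRESS=(PROTOCOL=TCP)"
                + "(HOST=" + host + ")"
                + "(PORT=" + port + ")"
                + ")"
                + ")"
                + "(CONNECT_DATA="
                + "(SERVICE_NAME=" + serviceName + ")"
                + "(SERVER=DEDICATED)"
                + ")"
                + ")";
    }

    public static void main(String[] args) {
        // Hypothetical values, for illustration only.
        System.out.println(describe("dbhost", 1521, "orcl"));
    }
}
```

The resulting string would be passed to `DriverManager.getConnection` in the same way as the short form of the URL.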

      We have absolutely no idea why
      - this only occurs on 64-bit Linux
      - it is independent of whether it's a 32-bit or 64-bit JVM
      - it does not occur on all 64-bit application server machine / database machine pairs
      - it never occurs on the same 64-bit app server hardware when using 32-bit Linux
      - there is no problem when using the Oracle JDBC 10g driver (10.xxx) (but because of other issues, we need to use the JDBC 11g driver)

      Does anybody have an idea what our problem is?

      Thanks in advance,

      greetings

        • 1. Re: JDBC Connection Reset when using many processes on 64 bit system
          Joe Weinstein-Oracle
          Does this happen for every connection request, or if you throttle the
          connection requests to a couple per second, does it work better?
          The only thing I'm hinting at, is that the DBMS sometimes can't
          handle lots of near-simultaneous connection requests. If it happens
          with each request, every time, then nevermind me....
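One way to try the throttling suggested here is to open the pool's connections one at a time with a fixed pause in between. The following is a minimal sketch, not the original application's pool code: the class name `ThrottledOpener`, the 500 ms pause, and the stand-in opener are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

/** Opens resources one at a time with a fixed pause between attempts. */
final class ThrottledOpener {
    private ThrottledOpener() {}

    static <T> List<T> openAll(Callable<T> opener, int count, long pauseMillis)
            throws Exception {
        List<T> opened = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                Thread.sleep(pauseMillis); // spread the logons out over time
            }
            opened.add(opener.call());
        }
        return opened;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in opener; a real pool would call something like
        // DriverManager.getConnection(url, user, password) here.
        final int[] n = {0};
        List<String> conns = openAll(() -> "connection-" + (++n[0]), 3, 500);
        System.out.println(conns);
    }
}
```

If the resets disappear at a couple of logons per second, that points toward something in the logon path being slow under simultaneous load rather than toward the network.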
          • 2. Re: JDBC Connection Reset when using many processes on 64 bit system
            717782
            I was recently struggling with this exact same problem. I opened a ticket with Oracle and this is what they told me.

            java.security.SecureRandom is a standard API provided by Sun. Among the various methods offered by this class is void nextBytes(byte[]), which is used for generating random bytes. Oracle 11g JDBC drivers use this API to generate a random number during login. Users on Linux have been encountering SQLException("Io exception: Connection reset").

            The problem is twofold:

            1. The JVM tries to list all the files in /tmp (or the alternate tmp directory set by -Djava.io.tmpdir) when SecureRandom.nextBytes(byte[]) is invoked. If the number of files is large, the method takes a long time to respond and hence causes the server to time out.

            2. The method void nextBytes(byte[]) uses /dev/random on Linux, and on some machines which lack random number generating hardware the operation slows down to the extent of bringing the whole login process to a halt. Ultimately the user encounters SQLException("Io exception: Connection reset").

            Users upgrading to 11g can encounter this issue if the underlying OS is Linux running on faulty hardware.
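A quick way to check whether a given machine is affected is to time SecureRandom.nextBytes outside of JDBC. The sketch below assumes nothing about the driver internals: the class name `EntropyProbe` is hypothetical, and the 16-byte request size is arbitrary because the thread does not say how many bytes the driver asks for. Whether the call blocks at all depends on the JVM version and its configured entropy source.

```java
import java.security.SecureRandom;

/** Times how long a fresh SecureRandom takes to produce random bytes. */
final class EntropyProbe {
    private EntropyProbe() {}

    /** Returns the milliseconds needed to seed a new generator and draw numBytes. */
    static long timeNextBytesMillis(int numBytes) {
        long start = System.nanoTime();
        SecureRandom random = new SecureRandom();
        random.nextBytes(new byte[numBytes]); // the first call also seeds the generator
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        // Repeat a few times; a drained entropy pool tends to show up as long stalls.
        for (int i = 1; i <= 5; i++) {
            System.out.println("attempt " + i + ": " + timeNextBytesMillis(16) + " ms");
        }
    }
}
```

Running several copies of this at the same moment, as the six server processes do, may reproduce the stall more readily than a single run.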

            Cause
            The cause of this has not yet been determined exactly. It could either be a problem in your hardware or the fact that for some reason the software cannot read from /dev/random.


            Solution
            Change the setup for your application so that you add the following parameter to the java command:

            -Djava.security.egd=file:/dev/../dev/urandom



            We made this change in our java.security file and it has gotten rid of the error.
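The command-line flag sets the java.security.egd system property, while the JDK's java.security file uses the key securerandom.source for the same purpose. To confirm which values a running JVM actually sees, something like the following sketch can be used; the class name `EntropySourceInfo` is hypothetical.

```java
import java.security.Security;

/** Reports the entropy source settings visible to the running JVM. */
final class EntropySourceInfo {
    private EntropySourceInfo() {}

    static String describe() {
        // Set via -Djava.security.egd=... on the command line (null if absent).
        String egd = System.getProperty("java.security.egd");
        // Set via securerandom.source=... in the java.security file.
        String source = Security.getProperty("securerandom.source");
        return "java.security.egd=" + egd + ", securerandom.source=" + source;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Printing this once at server startup makes it easy to see whether the change reached every one of the processes.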
            • 3. Re: JDBC Connection Reset when using many processes on 64 bit system
              509850
              Thanks for the random hint.

              For me, this worked fine:

              System.setProperty("java.security.egd", "file:///dev/urandom");  // the three '/' are needed to make it a URL
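A slightly fuller sketch of the same idea follows. It rests on an assumption not stated in the thread: that the JVM reads this property when it first initializes its seed source, so the call has to run before anything uses SecureRandom or opens a JDBC connection. The class name `UrandomBootstrap` is hypothetical.

```java
import java.security.SecureRandom;

/** Sets the entropy source early, unless the command line already chose one. */
final class UrandomBootstrap {
    static final String EGD_PROPERTY = "java.security.egd";
    static final String URANDOM_URL = "file:///dev/urandom";

    private UrandomBootstrap() {}

    /** Returns the effective value of the property after applying the default. */
    static String ensureEntropySource() {
        if (System.getProperty(EGD_PROPERTY) == null) {
            System.setProperty(EGD_PROPERTY, URANDOM_URL);
        }
        return System.getProperty(EGD_PROPERTY);
    }

    public static void main(String[] args) {
        // Assumption: this must happen before the first SecureRandom or JDBC use.
        String source = ensureEntropySource();
        byte[] sample = new byte[8];
        new SecureRandom().nextBytes(sample);
        System.out.println("entropy source: " + source + ", drew " + sample.length + " bytes");
    }
}
```

Leaving an existing value untouched means a -Djava.security.egd flag on the command line still takes precedence over the in-code default.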


              Also I tried:

              rm /dev/random
              ln -s /dev/urandom /dev/random

              ...which worked but it is gone after every reboot.

              Here is also a good read:
              Link: [http://www.usn-it.de/index.php/2009/02/20/oracle-11g-jdbc-driver-hangs-blocked-by-devrandom-entropy-pool-empty/]