9 Replies Latest reply on May 9, 2008 2:31 AM by 843859

    Intermittent SQL Connection problem - Unknown Host

    843859

      Hi All,
      I have a bit of a puzzler problem that you might be able to give me some hints to help debug.

      Some background first - we have 2 distinct hosted sites, each site is protected by firewalls etc. but they are connected via a dedicated secure gigabit link. Each site has 2 application servers that are running Apache2 which are connected to tomcat 5.5 through mod_jk and multiple workers.

      The application servers in turn connect to Oracle RAC clusters (one in each hosted site) through oracle drivers using a TNS entry - the TNS entry has load balanced entries to access the multiple RAC VIPS.

      The application uses Oracle's JDBC connections to connect to the database. The system is designed so that the front-end application can connect to any application server (which use session beans to maintain sessions) in a round-robin configuration.

      The problem we are having is an intermittent one where we lose connectivity and get a Java exception (see below). The apache/tomcat/oracle configuration is identicial across all the systems so we have eliminated a configuration problem (we hope!). I've looked around on the forums and I can find other similar problems - but they always have an associated Oracle error - this one doesn't.

      I'm out of ideas of what to check - or what other debug we can enable to get to the bottom of this problem. Any ideas most welcome.

      Exception -
      SEVERE: action: An Unknown Exception occurred
      org.apache.tomcat.dbcp.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Io exception: Unknown host specified )
      at org.apache.tomcat.dbcp.dbcp.BasicDataSource.createDataSource(BasicDataSource.java:855)
      at org.apache.tomcat.dbcp.dbcp.BasicDataSource.getConnection(BasicDataSource.java:540)
      at uk.ac.hefce.sql.QueryUtil.getConnection(QueryUtil.java:51)
      at uk.ac.hefce.cms.actions.HefceObjUtil.getConnection(HefceObjUtil.java:46)
      at uk.ac.hefce.cms.actions.InitialiseAction.execute(InitialiseAction.java:36)
      at uk.ac.hefce.cms.actions.LoginAction.execute(LoginAction.java:29)
      at org.apache.struts.chain.commands.servlet.ExecuteAction.execute(ExecuteAction.java:53)
      at org.apache.struts.chain.commands.AbstractExecuteAction.execute(AbstractExecuteAction.java:64)
      at org.apache.struts.chain.commands.ActionCommandBase.execute(ActionCommandBase.java:48)
      at org.apache.commons.chain.impl.ChainBase.execute(ChainBase.java:190)
      at org.apache.commons.chain.generic.LookupCommand.execute(LookupCommand.java:304)
      at org.apache.commons.chain.impl.ChainBase.execute(ChainBase.java:190)
      at org.apache.struts.chain.ComposableRequestProcessor.process(ComposableRequestProcessor.java:280)
      at org.apache.struts.action.ActionServlet.process(ActionServlet.java:1858)
      at org.apache.struts.action.ActionServlet.doPost(ActionServlet.java:459)
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:709)
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
      at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
      at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
      at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
      at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
      at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
      at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
      at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:199)
      at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:282)
      at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:767)
      at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:697)
      at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:889)
      at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
      at java.lang.Thread.run(Unknown Source)
      Caused by: java.sql.SQLException: Io exception: Unknown host specified
      at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)
      at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:146)
      at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:255)
      at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:387)
      at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:420)
      at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:165)
      at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:35)
      at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:801)
      at org.apache.tomcat.dbcp.dbcp.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:37)
      at org.apache.tomcat.dbcp.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:290)
      at org.apache.tomcat.dbcp.dbcp.BasicDataSource.validateConnectionFactory(BasicDataSource.java:877)
      at org.apache.tomcat.dbcp.dbcp.BasicDataSource.createDataSource(BasicDataSource.java:851)
      ... 31 more

      The configuration of the context resource is:
      <Resource
      name="jdbc/tqicms" type="javax.sql.DataSource"
      username="<USERNAME>"
      password="<PASSWORD>"
      driverClassName="oracle.jdbc.driver.OracleDriver"
      url="jdbc:oracle:oci:@<TNSNAME>"
      maxActive="8"
      maxIdle="10"
      maxWait="5000"
      validationQuery="select 1 from dual"
      removeAbandoned="true"
      removeAbandonedTimeout="60"/>

      The configuration of the TNS entry is
      <TNSNAME> =
      (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = <ip address2>)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = <ip address1>)(PORT = 1521))
      (LOAD_BALANCE = yes)
      (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = <servicename>)
      )
      )

      Thanks
      Neil

      Message was edited by:
      ncatton

        • 1. Re: Intermittent SQL Connection problem - Unknown Host
          dcminter
          Sounds like you may have intermittently flakey nameservers. Try putting the name into the hosts file so that it will resolve regardless. If the problem goes away QED.

          (edit)
          • 2. Re: Intermittent SQL Connection problem - Unknown Host
            843859
            we've checked the network layer and it seems to be ok - all the servers in the entire infrastructure are included in every /etc/hosts file, we also use the IP address in the TNS connection.

            Regards
            Neil
            • 3. Re: Intermittent SQL Connection problem - Unknown Host
              dcminter
              Ok. Why are you bringing the server up at this point? Is it in response to anything? Because that's the only clue I see at the moment; the app can't resolve the database server name for some reason, and it's while the pool is being populated initially.
              • 4. Re: Intermittent SQL Connection problem - Unknown Host
                843859
                Hi, we had that thought as well - but all the components are up and running and it's when the web application tries to connect it fails. But if we then hit login again immediatly it works ok. We are currently getting the development team to add more debug into the application - so far we have narrowed it down to something in this piece of code...

                public Connection getConnection() throws IOException, SQLException {
                          Connection conn=null;
                          Context ctx = null;
                          try {
                               ctx = new InitialContext();
                               Context envCtx = (Context) ctx.lookup("java:comp/env");
                               DataSource ds=(DataSource)envCtx.lookup("jdbc/<NAME>");
                               conn=ds.getConnection();
                               conn.setAutoCommit( false );
                          } catch (NamingException e) {
                               e.printStackTrace();
                               throw new FailedConnectionException();
                          }
                          return conn;
                     }

                The dev team are now putting debug statements into each call to see if we can get more info.

                Regards
                Neil
                • 5. Re: Intermittent SQL Connection problem - Unknown Host
                  dcminter
                  I wouldn't bother. That's just acquiring a connection from the pool. The problem is happening when the pool tries to establish a connection to the database.

                  All I can think of is that for some reason your first attempt to resolve the name is failing and that it then succeeds thereafter - I couldn't give you a plausible explanation for that, but it's what you're describing.

                  Either way, this isn't a bug in that block of code - it's maybe an app server configuration issue (write some code to try and do this from a stand-alone application numerous times to see what happens there) or, more likely, some sort of system configuration issue to do with name resolution.
                  • 6. Re: Intermittent SQL Connection problem - Unknown Host
                    dcminter
                    Additional thought - use the fully qualified host name of the server you're trying to connect to, and also maybe try using the IP if that doesn't work. I'd be VERY surprised if using the IP resulted in the same error, but either way it's indicative.
                    • 7. Re: Intermittent SQL Connection problem - Unknown Host
                      843859
                      bit more info - the debug is showing it is hanging on the conn=ds.getConnection();

                      We have tried the tns with both a fully qualified hostname and the IP addresses of the cluster vips.

                      When it hangs it consistently takes 30 mins before it timesout and then it returns successfully and continues through the rest of the java code - but the browser has given up by then.

                      Our plan is to now remove tns by having the connection string in the code.

                      any other thoughts / ideas most welcome.
                      • 8. Re: Intermittent SQL Connection problem - Unknown Host
                        dcminter
                        bit more info - the debug is showing it is hanging on
                        the conn=ds.getConnection();
                        That's because that's the point at which it needs to materialize a connection from the pool. So that's the point it's connecting.

                        (edit) Incidentally, you didn't need additional debugging to determine this - you already know from the stack trace:
                        uk.ac.hefce.sql.QueryUtil.getConnection(QueryUtil.java:51)
                        When you used the IP address instead of the name, did you get EXACTLY the same error message? If not post it please.
                        • 9. Re: Intermittent SQL Connection problem - Unknown Host
                          843859
                          you may try this way, do not use IP address directly in jdbc url, try to use host name, modify your hosts file on client machine.