11 Replies Latest reply on Oct 1, 2012 11:31 AM by Udo

    Connection Pool Problem

    915154
      I'm running APEX Listenerversion 1.1.3.243.11.40 on WebLogic 11g. The database is Oracle 11.2.0.3

      I've got two managed instances in a cluster and the listener is deployed to the cluster. I've individually configured both APEX listeners (by going directly to each managed instance) to point to the same database.

      I've configured the connection Pool to as follows on both managed instances:

      <entry key="apex.jdbc.DriverType">thin</entry>
      <entry key="apex.jdbc.InitialLimit">10</entry>
      <entry key="apex.jdbc.MinLimit">10</entry>
      <entry key="apex.jdbc.MaxLimit">20</entry>

      When the server starts up I see the initial 10 connection no problems. After a while I see the following error message multiple times.

      --- 5/15/12 11:44 PM ( 8 Hours 26 Minutes 44 Seconds Ago ) --- Unable to start the Universal Connection Pool: oracle.ucp.UniversalConnectionPoolException: Cannot get Connection from Datasource

      netstat shows connections dropped below the min limit, sometimes to zero connections.

      Has anybody seen this before? Also, is there a way to turn on detailed debugging or trace for the listener.I've turned on Show debug messages on console from listenerAdmin but that is really not detailed enough to see what is ultimately causing the Connection Pool problem.
        • 1. Re: Connection Pool Problem
          915154
          On more thing I forgot to include. I also see the following messages as well after the connection pool issue:

          Datasource
          --- 5/15/12 3:24 PM ( 6 Hours 14 Minutes 19 Seconds Ago ) ---
          HTTP Status Code: 404 Requested url http://hqw0cdi0rxxxx01.dev.xxxxx:7077/bcvs/wwv_flow.accept is not found.
          --- 5/15/12 1:28 PM ( 8 Hours 10 Minutes 31 Seconds Ago ) ---
          HTTP Status Code: 404 Requested url http://hqw0cdi0rxxx04.dev.xxxx:7077/bcvs/wwv_flow.show is not found.


          Now why would wwv_flow related items not be available. Could this be related to VPD and not setting the context of the user within the connection pool?

          Edited by: rowdycholin on May 16, 2012 6:47 AM
          • 2. Re: Connection Pool Problem
            Udo
            Hello,

            I'll try to answer you going backwards through your post...
            Also, is there a way to turn on detailed debugging or trace for the listener.I've turned on Show debug messages on console from listenerAdmin but that is really not detailed enough to see what is ultimately causing the Connection Pool problem.
            It could be interesting to see a bit of the stack trace you get there anyway. As far as I know that's the most you can get from APEX Listener. If you don't see anything in there, you could start tracing the database it's connecting to.
            netstat shows connections dropped below the min limit, sometimes to zero connections.
            This seems to be an interesting perception: The connections have dropped... Do you have any timeout in your network that could cause these connections to be closed without notice? Do you see anything that could relate to this in your database alert log?
            When the server starts up I see the initial 10 connection no problems. After a while I see the following error message multiple times.
            How much time is "after a while"?
            I've configured the connection Pool to as follows on both managed instances:


            <entry key="apex.jdbc.DriverType">thin</entry>
            <entry key="apex.jdbc.InitialLimit">10</entry>
            <entry key="apex.jdbc.MinLimit">10</entry>
            <entry key="apex.jdbc.MaxLimit">20</entry>
            How about the other jdbc parameters? (InactivityTimeout, AbandonedConnectionTimeout)?
            I've got two managed instances in a cluster and the listener is deployed to the cluster. I've individually configured both APEX listeners (by going directly to each managed instance) to point to the same database.
            So you've got a clustered app server pointing to a non-clustered database?
            Did you try to run another APEX Listener standalone and see if its connection pool becomes empty as well?
            I'm running APEX Listenerversion 1.1.3.243.11.40 on WebLogic 11g. The database is Oracle 11.2.0.3.
            What OS are you running your WLS on?

            You could try upgrading the JDBC and UCP drivers used in APEX Listener to 11.2.0.3 as APEX Listener shipped with 11.2.0.2. Though not necessarily needed for the connections to a 11.2.0.3 database there have been several connection related bug fixes in these libraries. Perhaps one of these is causing your problem. I posted a short description on how to patch these libraries into the current APEX Listener release here: {message:id=10201844}

            -Udo
            • 3. Re: Connection Pool Problem
              915154
              Hopefully, I've answered your questions in no particular order.

              OS on WebLogic is Redhat Enterprise 5.6 (kernel 2.6.18-274.12.1.el5)

              Yesterday I tried deploying to only one server in the cluster (not to cluster) and shutdown the other server to see if this resolved the errors. It did not.

              I don't believe we have anything in the network that would drop the connection like that. It's not going through a load balancer or anything like that, at least that I know of. I'm doing a test as we speak to ssh from the app server to the DB server to see if I get knocked off. I'd rather sqlplus but the thick client is not installed on the app server. I am also doing a test of sqlplus from another server that is on the same network to the DB server to see if that gets knocked off.

              Inactivity timeout is 1800 and Abandon Connections is 900. I'm assuming that if inactiviity is the issue (I don't think it is), it would only bring it down the Minimum and not down to zero connections.


              I think I've seen the steps on patching. It doesn't look too hard.


              Here is the stack trace from WebLogic. I cut out a lot of the rest of it. It seems to go on forever.

              <May 15, 2012 11:44:22 PM EDT> <Error> <oracle.dbtools.apex.OWA> <BEA-000000> <null
              java.sql.SQLException: Unable to start the Universal Connection Pool: oracle.ucp.UniversalConnectionPoolException: Cannot get Connection from Datasource
              at oracle.ucp.util.UCPErrorHandler.newSQLException(UCPErrorHandler.java:541)
              at oracle.ucp.jdbc.PoolDataSourceImpl.throwSQLException(PoolDataSourceImpl.java:588)
              at oracle.ucp.jdbc.PoolDataSourceImpl.startPool(PoolDataSourceImpl.java:277)
              at oracle.ucp.jdbc.PoolDataSourceImpl.getConnection(PoolDataSourceImpl.java:647)
              at oracle.ucp.jdbc.PoolDataSourceImpl.getConnection(PoolDataSourceImpl.java:614)
              at oracle.ucp.jdbc.PoolDataSourceImpl.getConnection(PoolDataSourceImpl.java:608)
              at oracle.dbtools.apex.Connections.getConnection(Connections.java:116)
              at oracle.dbtools.apex.ModApexContext.getConnection(ModApexContext.java:310)
              at oracle.dbtools.apex.Procedure.getProcedure(Procedure.java:43)
              at oracle.dbtools.apex.OWA.validateProcedure(OWA.java:317)
              at oracle.dbtools.apex.security.Security.isValidRequest(Security.java:186)
              at oracle.dbtools.apex.ModApex.validateRequest(ModApex.java:287)
              at oracle.dbtools.apex.ModApex.doGet(ModApex.java:73)
              at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
              at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
              at oracle.dbtools.rt.web.HttpEndpointBase.defaultServlet(HttpEndpointBase.java:133)
              at oracle.dbtools.rt.web.HttpEndpointBase.service(HttpEndpointBase.java:90)
              at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
              at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227)
              at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125)
              at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:300)
              at weblogic.servlet.internal.TailFilter.doFilter(TailFilter.java:26)
              at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:56)
              at oracle.security.wls.filter.SSOSessionSynchronizationFilter.doFilter(SSOSessionSynchronizationFilter.java:276)
              at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:56)
              at oracle.security.jps.ee.http.JpsAbsFilter$1.run(JpsAbsFilter.java:111)
              at oracle.security.jps.util.JpsSubject.doAsPrivileged(JpsSubject.java:313)
              at oracle.security.jps.ee.util.JpsPlatformUtil.runJaasMode(JpsPlatformUtil.java:413)
              at oracle.security.jps.ee.http.JpsAbsFilter.runJaasMode(JpsAbsFilter.java:94)
              at oracle.security.jps.ee.http.JpsAbsFilter.doFilter(JpsAbsFilter.java:161)
              at oracle.security.jps.ee.http.JpsFilter.doFilter(JpsFilter.java:71)
              at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:56)
              at oracle.dms.servlet.DMSServletFilter.doFilter(DMSServletFilter.java:136)
              at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:56)
              at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.wrapRun(Unknown Source)
              at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(Unknown Source)
              at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
              at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:120)
              at weblogic.servlet.internal.WebAppServletContext.securedExecute(Unknown Source)
              at weblogic.servlet.internal.WebAppServletContext.execute(Unknown Source)

              ***********************************************************************
              ***********************************************************************


              Here is what we see in the alert log on the database.


              ***********************************************************************

              Fatal NI connect error 12637, connecting to:
              (LOCAL=NO)

              VERSION INFORMATION:
              TNS for Solaris: Version 11.2.0.3.0 - Production
              opidcl aborting process unknown ospid (24311) as a result of ORA-28
              Tue May 15 15:25:09 2012


              ***********************************************************************

              Fatal NI connect error 12637, connecting to:
              (LOCAL=NO)

              VERSION INFORMATION:
              TNS for Solaris: Version 11.2.0.3.0 - Production
              Oracle Bequeath NT Protocol Adapter for Solaris: Version 11.2.0.3.0 - Production
              TCP/IP NT Protocol Adapter for Solaris: Version 11.2.0.3.0 - Production
              Time: 15-MAY-2012 15:25:09
              Tracing not turned on.
              Tns error struct:
              ns main err code: 12637

              TNS-12637: Packet receive failed
              ns secondary err code: 12532
              nt main err code: 0
              nt secondary err code: 0
              nt OS err code: 0



              ***********************************************************************

              Fatal NI connect error 12637, connecting to:
              (LOCAL=NO)

              VERSION INFORMATION:
              TNS for Solaris: Version 11.2.0.3.0 - Production
              Oracle Bequeath NT Protocol Adapter for Solaris: Version 11.2.0.3.0 - Production
              TCP/IP NT Protocol Adapter for Solaris: Version 11.2.0.3.0 - Production
              Time: 06-MAY-2012 20:35:49
              Tracing not turned on.
              Tns error struct:
              ns main err code: 12637

              TNS-12637: Packet receive failed
              ns secondary err code: 12532
              nt main err code: 0
              nt secondary err code: 0
              nt OS err code: 0
              opiodr aborting process unknown ospid (21804) as a result of ORA-609
              "alert_bcvscdei.log" 10614 lines, 405318 characters
              Tue May 15 23:44:36 2012


              ***********************************************************************

              Fatal NI connect error 12637, connecting to:
              (LOCAL=NO)

              VERSION INFORMATION:
              TNS for Solaris: Version 11.2.0.3.0 - Production
              Oracle Bequeath NT Protocol Adapter for Solaris: Version 11.2.0.3.0 - Production
              TCP/IP NT Protocol Adapter for Solaris: Version 11.2.0.3.0 - Production
              Time: 15-MAY-2012 23:44:36
              Tracing not turned on.
              Tns error struct:
              ns main err code: 12637

              TNS-12637: Packet receive failed
              ns secondary err code: 12532
              nt main err code: 0
              nt secondary err code: 0
              nt OS err code: 0
              opiodr aborting process unknown ospid (17096) as a result of ORA-609
              • 4. Re: Connection Pool Problem
                Udo
                Hopefully, I've answered your questions in no particular order.
                ;)
                OS on WebLogic is Redhat Enterprise 5.6 (kernel 2.6.18-274.12.1.el5)
                So at least one of the bugs mentioned in the release notes for the JDBC 11.2.0.3 release does not apply (9373409 SUDDENLY JDBC THIN GOT CONNECTION ABORT ERROR ONLY WINDOWS 2008 OR VISTA ENV).
                There's still hope one of the other fixes might help in your case...
                Yesterday I tried deploying to only one server in the cluster (not to cluster) and shutdown the other server to see if this resolved the errors. It did not.

                I don't believe we have anything in the network that would drop the connection like that.
                Hmm, so it seems it'ss not the WLS clustering itself. Any load balancer or something like that on the way to the database? How is your database listener configured?
                Inactivity timeout is 1800 and Abandon Connections is 900. I'm assuming that if inactiviity is the issue (I don't think it is), it would only bring it down the Minimum and not down to zero connections.
                I agree that inactivity shouldn't be the cause. I just wanted to be sure none of them was disabled...
                I think I've seen the steps on patching. It doesn't look too hard.
                It isn't. Though I still think it's likely not to solve the issue, there is a chance and you get it for a manageable effort. It seems to be the best shot for the moment...
                TNS-12637: Packet receive failed
                So, if neither the database, nor the listener or any component in between had been restarted, I'm not sure how to deal with that one. Perhaps SQL*NET log could give some additonal information on the cause of the termination, but as long as your database seems healthy in other situations, it's unlikely that you'll find the actual cause there.
                Looking at the alert block a second time gives me a second insight:
                ns secondary err code: 12532
                ... would be related to ORA-12532: TNS:invalid argument, which is usually caused by a network issue - either some flushed network interface, changed ip address, newly applied firewall rule, ... Could it be there is some regular routine flushing up ip tables or something similar?

                -Udo
                • 5. Re: Connection Pool Problem
                  915154
                  Udo,

                  If I have the following parameters in my apex-config.xml file I want to make sure my assumptions about how they interact at certain events/times:

                  <entry key="apex.jdbc.DriverType">thin</entry>
                  <entry key="apex.jdbc.InitialLimit">10</entry>
                  <entry key="apex.jdbc.MinLimit">10</entry>
                  <entry key="apex.jdbc.MaxLimit">20</entry>
                  <entry key="apex.jdbc.InactivityTimeout">1800</entry>
                  <entry key="apex.jdbc.AbandonedConnectionTimeout">900</entry>

                  I'm assuming that at startup it will create 10 connections. If nothing happens with any of the connections for 1800 seconds, what happens? Does it start closing the connections and rebuilding new ones to keep it at the minimum of 10 connections. I should never see it go below 10 connection, unless it just closed one and it's creating a new one.

                  Is that a correction understanding of how these work.


                  As second case, after initial startup and 10 connections are created there is activity on all connections. I'm assuming it will create new connections as needed to service the requests and it will create as many as it needs to service the requests until it reaches a max of 20 connections.

                  At that time, if they all go inactive for 1800 seconds, it will start tearing down connections until there is a min of 10 connections.

                  Is that correct?

                  Also, I wrote a simple java socket program to connect from one server to another and not pass traffic (idle) and the connection was fine. It lasted several hours with not being reset of broken. So I think the network is not the issue.

                  I will be testing the apex.war file today with the new odbc6.jar and ucp.jar files. Hope that resolves the issue.

                  Thanks,

                  Jack
                  • 6. Re: Connection Pool Problem
                    Udo
                    Hi Jack,
                    <entry key="apex.jdbc.DriverType">thin</entry>
                    <entry key="apex.jdbc.InitialLimit">10</entry>
                    <entry key="apex.jdbc.MinLimit">10</entry>
                    <entry key="apex.jdbc.MaxLimit">20</entry>
                    <entry key="apex.jdbc.InactivityTimeout">1800</entry>
                    <entry key="apex.jdbc.AbandonedConnectionTimeout">900</entry>
                    I'm assuming that at startup it will create 10 connections.
                    Right.
                    If nothing happens with any of the connections for 1800 seconds, what happens?
                    If you have more than 10 connections in your pool (your configuration allows up to 20) the pool size will be decreased to 10 connections starting with the first connection to hit that timeout.
                    Does it start closing the connections and rebuilding new ones to keep it at the minimum of 10 connections.
                    Unless the reuse counter hasn't been hit and you didn't have more than 10 active connections at the same time (i.e., there have never been more than 10 connections in the pool) the pool manager won't close any connection.
                    I should never see it go below 10 connection, unless it just closed one and it's creating a new one.
                    Right. Closing and recreating will happen upon reaching the MaxConnectionReuseCount .
                    As second case, after initial startup and 10 connections are created there is activity on all connections. I'm assuming it will create new connections as needed to service the requests and it will create as many as it needs to service the requests until it reaches a max of 20 connections.
                    Right.
                    At that time, if they all go inactive for 1800 seconds, it will start tearing down connections until there is a min of 10 connections.
                    Basically right. To be precise: It's not necessary that all of them have to be inactive for 1800 seconds. It would be sufficient that less than 10 connections are used at the same time to get the other ones timed out. The pool manager doesn't use round robin to take connections from pool, but more a "most recently used" algorithm.

                    Note that Kris recently published a blog post on that topic: [url http://krisrice.blogspot.de/2012/05/apex-listener-jdbcsecurity-setup.html]Apex Listener jdbc/security setup
                    Also, I wrote a simple java socket program to connect from one server to another and not pass traffic (idle) and the connection was fine. It lasted several hours with not being reset of broken. So I think the network is not the issue.
                    Does it connect to the database listener on the same machine your APEX Listener is connected to?
                    In addition to the network components in between, it could also be your database listener has an inactivity timeout set. You should find information about this in sqlnet.ora (usually in +$ORACLE_HOME/network/admin+ of your database instance) and information about such timeouts being effective in the client log files for sqlnet.

                    -Udo
                    • 7. Re: Connection Pool Problem
                      915154
                      Looks like the issue was caused by the Oracle login security banner. We have removed the banner as a test and it resolved the issue. There is a patch for the ojdbc6.jar (packet.class) file that hopefully allows us to put the banner back in place (since it's a policy). Will try to post later on what I found.
                      • 8. Re: Connection Pool Problem
                        Udo
                        Looks like the issue was caused by the Oracle login security banner. We have removed the banner as a test and it resolved the issue. There is a patch for the ojdbc6.jar (packet.class) file that hopefully allows us to put the banner back in place (since it's a policy).
                        Sounds really interesting. Do you have a MOS-number for the patch?
                        Will try to post later on what I found.
                        Thanks!

                        -Udo
                        • 9. Re: Connection Pool Problem
                          915154
                          Patch 12582023

                          For APEX, we are not using a JDBC connection Pool within WebLogic, which is what the original bug was submitted for. Hopefully, after patching the ojdbc6.jar file within the apex.war file, it will resolve the issue though.

                          Edited by: rowdycholin on Jun 1, 2012 4:58 AM
                          • 10. Re: Connection Pool Problem
                            robertcope
                            Can you tell me if upgrading the JDBC driver did, in fact, fix the issue? Thanks,

                            robert
                            • 11. Re: Connection Pool Problem
                              Udo
                              Hello Robert, and welcome to the OTN forums!

                              I'm not sure if the OP will read your post, as the thread hasn't been updated for quite a while.
                              But you could simply try to access the patch referenced and see if it helps you solve your problem. If it doesn't, you could open a new thread here (and/or an official SR) and describe your issue as detailed as possible.

                              -Udo