8 Replies Latest reply on Sep 22, 2010 10:01 PM by 786911

    Weblogic proxy plugin closes keep-alive connections to clients randomly

    770277
      In short, we have the following architecture:

      clients ---> wl proxy plugin 1 ----> weblogic 1
      clients ---> wl proxy plugin 2 ----> weblogic 2

      Because of application- and installation-specific requirements, we are not using failover: each WL proxy always forwards requests to a single WebLogic instance (simple configuration).

      The application is based on the TR-069 protocol (SOAP over HTTP), so it relies heavily on persistent TCP connections (Connection: keep-alive). TCP persistence has to work correctly so that TR-069 messages are exchanged in the required order; otherwise we get an error at the application layer.

      Every now and then we've noticed application errors which suggest a problem with the TCP connection between the client and the WebLogic server. After sniffing the traffic, we saw that the WebLogic proxy plug-in (Apache) decides, randomly or for some reason we do not know, to close the TCP connection to the client, even though the application on WebLogic did not request it.

      As a result, the client opens a new connection to the server with a new TR-069 session, and it gets bounced because it already has a session open on the WebLogic server.

      We've sniffed and traced everything we could and searched for patterns in time, but we cannot find the reason why the proxy plug-in decides to close the connection to the client (not the one to the WebLogic server).


      Trace (replaced sensitive information):

      .........
      Thu Apr 29 15:05:50 2010 <958012725463463784> URL::parseHeaders: CompleteStatusLine set to [HTTP/1.1 200 OK]
      Thu Apr 29 15:05:50 2010 <958012725463463784> URL::parseHeaders: StatusLine set to [200 OK]
      Thu Apr 29 15:05:50 2010 <958012725463463784> parsed all headers OK
      Thu Apr 29 15:05:50 2010 <958012725463463784> sendResponse() : r->status = '200'
      Thu Apr 29 15:05:50 2010 <958012725463463784> canRecycle: conn=1 status=200 isKA=1 clen=545 isCTE=0
      Thu Apr 29 15:05:50 2010 <958012725463463784> closeConn: pooling for '$IP$/$PORT$'
      Thu Apr 29 15:05:50 2010 <958012725463463784> request [$URL$] processed successfully..................

      !!!! At this point it closes the TCP connection and inserts a "Connection: close" HTTP header !!!!
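
      When going through captured responses, it can help to flag automatically which ones announce a close. The following is only a minimal sketch: the function name `server_will_close` is hypothetical, and it assumes the raw response header block has already been extracted from the capture as text. It applies the usual HTTP rule that an HTTP/1.1 connection is persistent unless `Connection: close` is sent, while HTTP/1.0 is persistent only with `Connection: keep-alive`.

```python
def server_will_close(raw_headers: str) -> bool:
    """Return True if the response header block signals a connection close."""
    lines = raw_headers.replace("\r\n", "\n").split("\n")
    # The protocol version is the first token of the status line.
    version = lines[0].split(" ", 1)[0].upper()
    tokens = set()
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header block
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "connection":
            # The Connection header carries a comma-separated token list.
            tokens.update(t.strip().lower() for t in value.split(","))
    if "close" in tokens:
        return True
    if version == "HTTP/1.0":
        # HTTP/1.0 connections persist only when keep-alive is requested.
        return "keep-alive" not in tokens
    return False
```

      Running this over every response in a capture narrows the search to the exact requests after which the close is announced.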


      WL proxy plugin conf params are:

      WebLogicCluster $IP$:$PORT$
      DynamicServerList OFF
      KeepAliveTimeout 90
      MaxKeepAliveRequests 0
      KeepAliveSecs 55

      Apache worker configuration is:

      <IfModule mpm_worker_module>
      PidFile var/run/httpd-worker.pid
      LockFile var/run/accept-worker.lock
      StartServers 2
      MinSpareThreads 25
      MaxSpareThreads 75
      ThreadLimit 200
      ThreadsPerChild 200
      MaxClients 2000
      MaxRequestsPerChild 0
      AcceptMutex pthread
      </IfModule>


      Why does the WebLogic proxy plug-in ignore the keep-alive directives and decide by itself to close the connection to the client?


      Any help?
        • 1. Re: Weblogic proxy plugin closes keep-alive connections to clients randomly
          René van Wijk
          According to the documentation, KeepAliveSecs is the length of time after which an inactive connection between the plug-in and WebLogic Server is closed. So it does not apply to the connection between the client and the Apache web server.

          For the time-out between the client and Apache you can use
          KeepAliveTimeout 15 (number of seconds to wait for the next request from the same client on the same connection), which is set in the httpd.conf file.

          See also this link http://httpd.apache.org/docs/2.0/mod/core.html and look for the KeepAliveTimeout directive
          • 2. Re: Weblogic proxy plugin closes keep-alive connections to clients randomly
            770277
            KeepAliveTimeout 90
            That parameter is not the problem; it is set to 90 seconds. Notice that the proxy plug-in closes the TCP connection right after it forwards the response from the server to the client, so no timeout was involved.

            What we've noticed is that the proxy plug-in closes TCP connections when there is significant load; it seems to have some logic for saving resources. What interests me is whether there is some hidden configuration parameter which controls this threshold, e.g. the maximum number of parallel connections. It seems that it closes TCP connections when some threshold is reached.

            We've found the following available parameters on the proxy module (direct debug from Apache):

            MaxSkips - Defines MaxSkips
            MaxSkipTime - Defines MaxSkipTime
            EnforceBasicConstraints - Whether basic constraints checking is enforced

            We did not set them (they are not documented), so they are at their defaults.

            Can someone explain these three parameters? What does MaxSkipTime mean, and can these parameters help with this issue?
            • 3. Re: Weblogic proxy plugin closes keep-alive connections to clients randomly
              René van Wijk
              If a WebLogic Server instance listed in either the WebLogicCluster parameter or a dynamic cluster list returned from WebLogic Server fails, the failed server is marked as "bad" and the plug-in attempts to connect to the next server in the list.

              MaxSkipTime sets the amount of time after which the plug-in will retry the server marked as "bad." The plug-in attempts to connect to a new server in the list each time a unique request is received (that is, a request without a cookie).

              Note: The MaxSkips parameter has been deprecated in favor of the MaxSkipTime parameter.

              See also here: http://download-llnw.oracle.com/docs/cd/E13222_01/wls/docs81/plugins/plugin_params.html


              You said the problem arises under significant load. It may be wise to tune the number of file descriptors on your operating system. HTTP connections are nothing more than TCP sockets on the operating system. All modern operating systems treat sockets as a specialized form of file access and use data structures called file descriptors to track open sockets and files for an operating system process. To control resource usage for processes on the machine, the operating system restricts the number of open file descriptors per process. You should be aware that all TCP connections that have been gracefully closed by an application will go into what is known as the TIME_WAIT state before being discarded by the operating system.

              On most Unix systems you can use netstat -a | grep TIME_WAIT | wc -l to determine the number of sockets in the TIME_WAIT state. You have to check with your system administrator how to tune the tcp_time_wait_interval. On Solaris you can use: /usr/sbin/ndd -set /dev/tcp tcp_time_wait_interval 60000
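
              To see all connection states at once rather than only TIME_WAIT, the netstat output can be tallied per state. The following is a rough sketch, not a definitive tool: the function name `count_tcp_states` is hypothetical, and it assumes the TCP state is the last whitespace-separated field of each connection line, which holds for plain `netstat -a` / `netstat -an` output on common systems but may not hold with extra flags.

```python
from collections import Counter

# State names as printed by common netstat variants (spellings differ by OS).
TCP_STATES = {
    "ESTABLISHED", "TIME_WAIT", "CLOSE_WAIT", "LAST_ACK", "LISTEN",
    "CLOSING", "CLOSED", "SYN_SENT", "SYN_RECV", "SYN_RCVD",
    "FIN_WAIT1", "FIN_WAIT2", "FIN_WAIT_1", "FIN_WAIT_2",
}


def count_tcp_states(netstat_output: str) -> Counter:
    """Count lines per TCP state, assuming the state is the last field."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Lines that do not end in a known state (headers, Unix sockets)
        # are ignored.
        if fields and fields[-1] in TCP_STATES:
            counts[fields[-1]] += 1
    return counts
```

              Feeding it the captured text of `netstat -an` gives a per-state tally that can be compared with the file descriptor limit of the process.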
              • 4. Re: Weblogic proxy plugin closes keep-alive connections to clients randomly
                770277
                The limit on Apache file descriptors is set to 16384. The number of TIME_WAIT connections is around 1000, and the total number of connections is around 2000. So this limit is not the problem.
                Thu Apr 29 15:05:50 2010 <958012725463463784> sendResponse() : r->status = '200'
                Thu Apr 29 15:05:50 2010 <958012725463463784> canRecycle: conn=1 status=200 isKA=1 clen=545 isCTE=0
                Thu Apr 29 15:05:50 2010 <958012725463463784> closeConn: pooling for '$IP$/$PORT$'
                If you recall the debug log, the proxy plug-in seems to decide by itself to close the connection, which is why I believe there is some threshold inside the WebLogic proxy plug-in.
                • 5. Re: Weblogic proxy plugin closes keep-alive connections to clients randomly
                  René van Wijk
                  Just a guess, based on the Apache documentation on the worker MPM (http://httpd.apache.org/docs/2.0/mod/worker.html):

                  Could you place the ThreadLimit directive before the other directives in the worker section (according to the documentation it should come before them)? Maybe the default of 64 is not overridden the way you have configured it.

                  I think the following part of the documentation may also be relevant to your Apache server configuration:

                  In addition to the set of active child processes, there may be additional child processes which are terminating but where at least one server thread is still handling an existing client connection. Up to MaxClients terminating processes may be present, though the actual number can be expected to be much smaller. This behavior can be avoided by disabling the termination of individual child processes, which is achieved by the following:
                  * set the value of MaxRequestsPerChild to zero
                  * set the value of MaxSpareThreads to the same value as MaxClients
                  • 6. Re: Weblogic proxy plugin closes keep-alive connections to clients randomly
                    771808
                    What is the value of the accept backlog on the WebLogic side? By default, the accept backlog is 50.

                    If this issue is observed during load testing, try increasing the accept backlog and check the performance.

                    For the accept backlog, refer to the following link:

                    http://download.oracle.com/docs/cd/E13222_01/wls/docs90/perform/WLSTuning.html
                    • 7. Re: Weblogic proxy plugin closes keep-alive connections to clients randomly
                      770277
                      Hi,

                      We've located the misconfigured parameter: it was MaxSpareThreads. In order not to close active TCP connections, it should match the value of the MaxClients parameter.

                      What happened is that after significant load, one Apache process has many idle threads (threads that have already done their job), and when that number exceeds MaxSpareThreads, the process begins to close the TCP connections in all its active threads and finally exits. I believe this is some kind of anti-fragmentation tuning parameter.

                      Apache worker doc says:
                      In addition to the set of active child processes, there may be additional child processes which are terminating but where at least one server thread is
                      still handling an existing client connection. Up to MaxClients terminating processes may be present, though the actual number can be expected to be
                      much smaller. This behavior can be avoided by disabling the termination of individual child processes, which is achieved by the following:

                      * set the value of MaxRequestsPerChild to zero
                      * set the value of MaxSpareThreads to the same value as MaxClients
                      So now we have the following:

                      ....
                      MaxSpareThreads 2000
                      MaxClients 2000
                      ....

                      Rene, thnx for the hint, you were close :)
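
                      The two conditions quoted from the worker documentation can be checked mechanically against a configuration block. This is a minimal sketch under stated assumptions: the function names `parse_directives` and `worker_recycling_warnings` are hypothetical, only simple one-value directives are parsed, and a directive that is absent is not checked because its default depends on the Apache version.

```python
def parse_directives(conf_text):
    """Collect simple 'Name value' directives; names are lower-cased."""
    values = {}
    for line in conf_text.splitlines():
        parts = line.split()
        # Skip comments, section tags such as <IfModule ...>, and any line
        # that is not exactly a name followed by one value.
        if len(parts) == 2 and not parts[0].startswith(("#", "<")):
            values[parts[0].lower()] = parts[1]
    return values


def worker_recycling_warnings(conf_text):
    """Flag settings that allow worker child processes to be terminated."""
    d = parse_directives(conf_text)
    warnings = []
    if "maxsparethreads" in d and "maxclients" in d:
        if int(d["maxsparethreads"]) < int(d["maxclients"]):
            warnings.append("MaxSpareThreads is lower than MaxClients")
    if "maxrequestsperchild" in d and int(d["maxrequestsperchild"]) != 0:
        warnings.append("MaxRequestsPerChild is not zero")
    return warnings
```

                      With the values from the first post (MaxSpareThreads 75, MaxClients 2000) this reports the MaxSpareThreads mismatch; with both set to 2000 it reports nothing.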
                      • 8. Re: Weblogic proxy plugin closes keep-alive connections to clients randomly
                        786911
                        Hi

                        We are experiencing a similar problem. However, my guess is that the problem lies with the mod_wl plug-in. It looks like the problem has nothing to do with the "MaxSpareThreads" and "MaxClients" directives on the Apache side. We made these changes and still see the problem.
                        Moreover, when we check the number of connections (max clients), we never see more than 100 at any time.


                        Also, I realized you did not mention the mod_wl plug-in version you are using. In any case, I am trying an alternative to the plug-in. That should confirm whether the plug-in is the culprit.