I am running JSP on Oracle 11g, WebLogic 10.3.4. I have 2 managed servers and an Oracle admin server installed.
I am encountering an intermittent error: the log files of the 2 managed servers and the admin server show java.net.SocketException: Software caused connection abort: socket write error. The application can run for 2 days without showing this error, or the error can show up a few times in a day. The server load is similar every day.
When this error occurs, the server stops accepting connections and the application becomes inaccessible. Even when I try to access the application through localhost, the JSP pages return an HTTP 503 status, although I can still access the static HTML page. I cannot access the WebLogic admin console page either. The admin server log shows that the managed servers are disconnected from the admin server and vice versa.
Sometimes the application recovers on its own and becomes accessible again; otherwise I need to restart the whole server, because restarting the application's service alone does not help.
The FTP connections that the application holds open are closed as well.
I am able to ping the server and telnet to the server port. The event log does not seem to contain any relevant information. We ran Wireshark to look at the packet traffic, and it appears that the application port is sending a RST, ACK packet to the load balancer.
Any help will be greatly appreciated. Should you need more info, feel free to ask me.
I think what you are observing is expected behavior in the following situation:
When a full page request (such as a reload or a full page load) occurs, any POST request that is running in the background is interrupted. If the socket for that request is still active when the full page request occurs, the server's write to it fails and this exception is raised.
I think there should be no loss of functionality as a result of this exception. It should only produce a log entry.
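To illustrate the mechanism described above, here is a minimal sketch in plain Java sockets (not WebLogic code). It assumes a loopback connection where the client aborts by closing with SO_LINGER set to 0, which makes the close send a RST; the server side then keeps writing until the write fails. The class and method names (AbortedWriteDemo, writeToAbortedPeer) are made up for this example, and the exact exception message depends on the operating system and JVM.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class, not part of WebLogic or the original application.
final class AbortedWriteDemo {

    /**
     * Writes to a peer that has already aborted the connection.
     * Returns the IOException raised by the write, or null if none was observed.
     */
    static IOException writeToAbortedPeer() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {

            // With linger time 0, close() discards pending data and sends RST.
            client.setSoLinger(true, 0);
            client.close();

            OutputStream out = accepted.getOutputStream();
            byte[] chunk = new byte[8192];
            // Keep writing until the reset is noticed by the server-side socket.
            for (int attempt = 0; attempt < 100; attempt++) {
                try {
                    out.write(chunk);
                    out.flush();
                } catch (IOException writeFailure) {
                    return writeFailure;
                }
                Thread.sleep(10);
            }
            return null;
        } catch (IOException setupFailure) {
            throw new UncheckedIOException(setupFailure);
        } catch (InterruptedException interrupted) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        IOException failure = writeToAbortedPeer();
        System.out.println(failure == null ? "no write error observed" : failure.toString());
    }
}
```

In this sketch only the one aborted connection is affected; the listening socket keeps working, which is the "no loss of functionality" case.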
Additionally, you may consider running the WLS process with the following parameters to use the JDK HTTP handler like:
When the mentioned exception occurs, the WebLogic server stops responding, so the application is not accessible. Restarting the WebLogic service does not help. Only a restart of the server makes the application accessible again. This problem might occur a few times in a day or once every two weeks.
It seems that pageContext.getOut().flush() caused the socket write error, that is, the error happens while flushing. It looks like the stream has already been closed. We have pageContext.getOut().flush() in the doEndTag method of a Tag class, as we are using JSTL tags. We called pageContext.getOut().clearBuffer() manually and the application came alive again, including the admin console page, which we were then able to access. Just wondering, what could cause the buffer to be unable to flush or the stream to be closed? Any comments from experts would be welcome.
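As a rough sketch of handling the failing flush defensively: the original doEndTag code is not shown, so the helper below is hypothetical (SafeFlush and flushQuietly are names invented for this example). It relies only on the fact that the JspWriter returned by pageContext.getOut() is a java.io.Writer, and it treats an IOException from flush() as "the client went away" rather than letting it propagate out of the tag. This would only keep the exception out of the tag handler; it is not a confirmed fix for the server becoming unresponsive.

```java
import java.io.IOException;
import java.io.Writer;

// Hypothetical helper, not part of the JSP API or the original application.
final class SafeFlush {

    /**
     * Flushes the writer and reports whether the flush succeeded.
     * An IOException (for example a socket write error after the client
     * disconnected) is logged and swallowed instead of being rethrown.
     */
    static boolean flushQuietly(Writer out) {
        try {
            out.flush();
            return true;
        } catch (IOException e) {
            // Assumption: losing the response for a disconnected client is acceptable.
            System.err.println("Flush failed, client probably disconnected: " + e);
            return false;
        }
    }
}
```

Inside a tag handler this could be called as SafeFlush.flushQuietly(pageContext.getOut()) in place of the direct flush() call.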