WebLogic Server diagnoses a thread as stuck if it has been working continually (not idle) for a set period of time. You can tune how WebLogic Server detects stuck threads.
You can tune a server’s thread detection behavior by changing the length of time before a thread is diagnosed as stuck, and by changing the frequency with which the server checks for stuck threads.
These correspond to the two parameters below. Both can be configured in the Admin Console under Servers -> Configuration -> Tuning.
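1) Stuck Thread Max Time: the length of time a thread must be continually working before the server diagnoses it as stuck.
2) Stuck Thread Timer Interval: how frequently the server checks for stuck threads.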
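How the two settings interact can be illustrated with a small model. This is only a sketch in plain Python, not WebLogic code: the function names, the thread names and the data layout are hypothetical, the 600 and 60 second values are example values, and the exact comparison the server uses internally is an assumption.

```python
def find_stuck_threads(busy_since, now, stuck_thread_max_time=600):
    """Return the names of threads busy for at least stuck_thread_max_time seconds.

    busy_since maps a thread name to the time (in seconds) at which it started
    its current request, or to None if the thread is idle.
    """
    return sorted(
        name
        for name, started in busy_since.items()
        if started is not None and now - started >= stuck_thread_max_time
    )


def first_detection_time(busy_since, stuck_thread_max_time=600,
                         stuck_thread_timer_interval=60, horizon=3600):
    """Run the check once per timer interval; return the first time it reports a stuck thread."""
    for now in range(stuck_thread_timer_interval, horizon + 1, stuck_thread_timer_interval):
        if find_stuck_threads(busy_since, now, stuck_thread_max_time):
            return now
    return None  # nothing was reported within the simulated horizon


# Hypothetical execute threads: one busy since t=30s, one idle.
threads = {"ExecuteThread-0": 30, "ExecuteThread-1": None}
print(first_detection_time(threads))
```

In this model the busy thread crosses the 600-second limit at t=630 but is first reported at t=660, the next scheduled check. A shorter timer interval narrows that gap at the cost of more frequent checks.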
Although you can change the criteria WebLogic Server uses to determine whether a thread is stuck, you cannot change the default behavior of setting the “warning” and “critical” health states when all threads in a particular execute queue become stuck.
The "Stuck Thread Count" parameter, which can be configured in the Console (Servers -> Configuration -> Overload -> Stuck Thread Count), transitions the server to the FAILED state once the number of stuck threads reaches the configured value.
You can also use the "OverloadProtectionMBean" for tuning. In WebLogic Server 9.x and later, it is recommended that you use the "ServerFailureTriggerMBean" in the "OverloadProtectionMBean". The ServerFailureTriggerMBean transitions the server to a FAILED state after the specified number of stuck threads is detected. The OverloadProtectionMBean has options to suspend or shut down a failed server.
WebLogic Server checks for stuck threads periodically. If all application threads are stuck, a server instance marks itself failed and, if configured to do so, exits. You can configure Node Manager or a third-party high-availability solution to restart the server instance for automatic failure recovery.
You can configure these actions to occur when not all threads are stuck, but the number of stuck threads has exceeded a configured threshold:
- Shut down the Work Manager if it has stuck threads. A Work Manager that is shut down will refuse new work and reject existing work in the queue by sending a rejection message. In a cluster, clustered clients will fail over to another cluster member.
- Shut down the application if there are stuck threads in the application. The application is shut down by bringing it into ADMIN mode. All Work Managers belonging to the application are shut down, and behave as described above.
- Mark the server instance as failed and shut it down if there are stuck threads in the server. In a cluster, clustered clients that are connected or attempting to connect will fail over to another cluster member.
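The Work Manager behavior in the first item can be sketched as a toy model. This is plain Python rather than WebLogic code, and it only mirrors the description above: the class `WorkManagerModel`, its methods and the request names are hypothetical, and treating a threshold of zero as "never shut down" is an assumption made for the model.

```python
class WorkManagerModel:
    """Toy model of a Work Manager that shuts down at a stuck-thread threshold."""

    def __init__(self, name, stuck_thread_count):
        self.name = name
        self.stuck_thread_count = stuck_thread_count  # threshold; 0 disables it in this model
        self.queue = []        # accepted work waiting to run
        self.rejected = []     # work that received a rejection message
        self.shut_down = False

    def submit(self, request):
        """Accept new work unless the Work Manager has been shut down."""
        if self.shut_down:
            self.rejected.append(request)
            return False
        self.queue.append(request)
        return True

    def report_stuck_threads(self, stuck):
        """Shut down once the reported number of stuck threads reaches the threshold."""
        if self.shut_down or self.stuck_thread_count == 0:
            return
        if stuck >= self.stuck_thread_count:
            self.shut_down = True
            # Work already waiting in the queue is rejected as well.
            self.rejected.extend(self.queue)
            self.queue.clear()


wm = WorkManagerModel("example-wm", stuck_thread_count=2)
wm.submit("request-a")
wm.report_stuck_threads(1)   # below the threshold: keeps running
wm.submit("request-b")
wm.report_stuck_threads(2)   # threshold reached: queued work is rejected
accepted = wm.submit("request-c")
print(wm.shut_down, accepted, wm.rejected)
```

In a cluster, the rejections in this model correspond to the point at which clustered clients would fail over to another member.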
You can configure the "ServerFailureTriggerMBean" in the "OverloadProtectionMBean".
Below is the documentation link for the "ServerFailureTriggerMBean" methods.
The getStuckThreadCount() method returns the configured stuck thread limit; the server transitions to the FAILED state once the number of stuck threads reaches that limit.
int getStuckThreadCount() - The number of stuck threads after which the server is transitioned into FAILED state. There are options in OverloadProtectionMBean to suspend and shut down a FAILED server. By default, the server continues to run in FAILED state. If the StuckThreadCount value is set to zero, the server never transitions into the FAILED state, irrespective of the number of stuck threads.
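The rules in this description can be written out as a small decision function. This is a sketch in plain Python, not a call into WebLogic: the function name and the returned state labels are hypothetical, and the failure action strings ("no-action", "force-shutdown", "admin-state") are used as labels that should be checked against the OverloadProtectionMBean documentation.

```python
def server_state_after_check(stuck_threads, stuck_thread_count, failure_action="no-action"):
    """Model the server state after a stuck-thread check.

    stuck_threads      -- number of threads currently diagnosed as stuck
    stuck_thread_count -- configured threshold; 0 means the trigger never fires
    failure_action     -- what to do once the server is FAILED (assumed labels)
    """
    if stuck_thread_count == 0 or stuck_threads < stuck_thread_count:
        return "RUNNING"
    # Threshold reached: the server is marked FAILED, then the action applies.
    if failure_action == "force-shutdown":
        return "SHUTDOWN"
    if failure_action == "admin-state":
        return "ADMIN"  # suspended
    return "FAILED"     # default: keeps running in FAILED state


for stuck in (19, 20):
    print(stuck, server_state_after_check(stuck, 20, "force-shutdown"))
print(server_state_after_check(500, 0, "force-shutdown"))
```

With an example threshold of 20, the model stays RUNNING at 19 stuck threads and applies the failure action at 20. With a threshold of zero it stays RUNNING regardless of the count.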
Below is the documentation link for the "OverloadProtectionMBean" methods.
In the Admin Console, you can set the "Failure Action" under Servers -> Configuration -> Overload to force a shutdown of the managed server once the server is in the FAILED state.
The same setting is exposed on the OverloadProtectionMBean through the getFailureAction method.
String getFailureAction() - Enable automatic force shutdown of the server on failed state. The server self-health monitoring detects fatal failures and marks the server as failed. The server can be restarted using Node Manager or an HA agent.
If you start the managed servers using Node Manager, you can enable "Auto Kill if Failed" and "Auto Restart" in the Admin Console, under Servers -> Configuration -> Health Monitoring. Node Manager will take care of restarting the managed server if you enable "Auto Restart".
You can also configure the "Stuck Thread Count" and set the "Failure Action" to "Force Immediate shutdown of the Server" from the Admin Console under Servers -> Configuration -> Overload. This shuts down the server when the stuck thread count is reached. However, there is no configuration setting that releases threads once they are stuck.
For example, if you set the Stuck Thread Count to 20, the server transitions to the FAILED state once the number of stuck threads reaches 20. If you have also enabled the Failure Action, the server self-health monitoring detects the fatal failure and marks the server as failed, and the configured action is applied.
You can then restart the servers, depending on how you start them (custom scripts or Node Manager or startup scripts,...).
For more details on this, please refer to the link below: