We are seeing occasional instances of stuck threads, where the code is in Oracle thin XA driver code waiting for a socket read for a response from the database. Usually only one thread at a time. The connection never terminates and we have to shut down WebLogic to clear it. The DBAs tell us that all connections to the 11g RAC cluster members from our hosts are idle, with no hung or long-running transactions.
This happens every few days across two WebLogic 10.3.5 clustered instances in our QA environment, usually not the same time on two servers (one had four incidents, the other had only one, in the last 15 or so days). Sometimes it has resulted in a hung server, as the driver is holding a lock that blocks other threads (today we had numerous threads block in a TX rollback).
I'm guessing our WebLogic instance somehow is not getting the connection close from the remote host. BTW, there is no firewall between us and the DB. I have not got any strong suspect for the cause (although we are running Linux on a VMWare 4.0 VM, which always worries me).
Ordinarily I'd either ask the application to set a query timeout, or set the query timeout parameter for the connection pools as a workaround (we are using a MultiDatasource), so the transaction would at least abort and the application can handle it. However, while the Oracle driver does support java.sql.Statement.setQueryTimeout(), some brief investigation on my part leads me to believe the timeout is implemented on the server side and not the client side - so that if it is indeed the case that WebLogic does not see the close on the connection, it also would never see the timeout.
1. Is my suspicion about server-side implementation of timeout for Oracle 11g correct?
2. If so, is there some property I can set for the driver that will implement a socket timeout? Is this "safe"?
Thanks for any help!
[ Oracle WebLogic 10.3.5, HotSpot 64 bit JVM 1.6.0_29 on RHEL 5.6 on VMWare; Oracle Thin XA driver bundled with WebLogic; Oracle RAC 11g (three-node cluster)]
Edited by: SteveElkind on Dec 3, 2012 5:25 AM
Hi, yes, there is a property you can give the driver to make all socket read calls with a timeout.
oracle.net.READ_TIMEOUT. The units are in milliseconds, so choose some number high
enough that no legitimate read, such as a wait for the DBMS to do an update or commit,
etc, would take longer.
Ignore the post by Mukesh. There is nothing relevant there.
My investigations were heading in this direction (e.g., http://docs.oracle.com/cd/B28359_01/java.111/b31224/apxtblsh.htm#CHDBBDDA). However, for WebLogic, is it as simple as adding the following property in the Datasource Connection properties edit box in the WebLogic console?
or do I have to edit the connection URL?
I've tried this, and the Datasource restarts OK after the change, but I have no way right now to check whether it actually "works".
(we have some long-running queries; five minutes seems to be a "safe" limita for now)