We recently released a project to production on GlassFish. As load and usage increased, GlassFish started to show abnormal behaviour.
It locks up in transactions. From our analysis this is not always related to particular loads; for example, it can run without problems for a week and then block twice in one day. The only fix is a restart, which loses whatever was executing…
When these locks occur, most of the time there isn't any database lock.
The impact is now huge, with no solution in sight…
We are using EJB 3.0 with JPA and TopLink, on these GlassFish versions:
- Sun Java System Application Server 9.1_02 (build b04-fcs) and Sun Java System Application Server 9.1 (build b58g-fcs)
We have tried almost everything: simplifying some of our code, reducing transactions, installing different GlassFish versions, and following other tips from these posts, all without success.
From two thread dumps we noticed the following pattern:
at java.lang.Thread.sleep(Native Method)
The stack shows a deferred lock, which means two threads tried to build the same set of objects from the database at the same time, and one is waiting for the other to complete. This should not be a deadlock, as the other thread should complete. Please include the full stack dump from all blocked threads; it takes at least two threads to have a deadlock. Are the threads deadlocking, or just taking a long time?
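If it helps with collecting that information, here is a minimal sketch that uses the standard java.lang.management API to dump every thread's full stack and to ask the JVM whether it sees a deadlock. The class and method names (ThreadDumpSketch, findDeadlocked, dumpAllThreads) are made up for illustration. It has to run inside the server JVM, and the JVM check only covers monitor and java.util.concurrent locks, so it may well report nothing for threads that are waiting inside TopLink's own lock manager; the full dump is the useful part in that case.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {

    /** Returns the ids of threads the JVM itself reports as deadlocked (empty if none). */
    public static long[] findDeadlocked() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is detected
        return ids == null ? new long[0] : ids;
    }

    /** Builds a text dump of every live thread with its complete stack trace. */
    public static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip locked-monitor/synchronizer details, which not every JVM supports
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            // Print frames ourselves; ThreadInfo.toString() truncates long stacks
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("JVM-detected deadlocked threads: " + findDeadlocked().length);
        System.out.print(dumpAllThreads());
    }
}
```

Taking two or three dumps a few seconds apart shows whether the blocked threads ever move, which is the difference between a real deadlock and something that is just slow. From outside the JVM, jstack with the server's process id produces a comparable dump.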
If the issue is with deferred locks, you can avoid them by using lazy relationships: make sure all of your relationships are marked as lazy. Join fetching can also require deferred locks, so check whether you are using it, and perhaps switch to batch reading.
There is also a back door to disable deferred locks that can be used to determine the cause.
DeferredLockManager.SHOULD_USE_DEFERRED_LOCKS = false;
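The flag is a static field, so it has to be set once, before the persistence unit is first used (for example from a startup hook). Below is a rough sketch that sets it through reflection so the application does not need a compile-time dependency on an internal TopLink class. This is a swapped-in technique, not the direct assignment shown above. The helper name DeferredLockSwitch is hypothetical, and the fully qualified class name is my assumption for TopLink Essentials; check it against the jar you actually deploy.

```java
import java.lang.reflect.Field;

public class DeferredLockSwitch {

    // ASSUMPTION: package name for TopLink Essentials; verify against your TopLink jar.
    public static final String ASSUMED_CLASS =
            "oracle.toplink.essentials.internal.helper.DeferredLockManager";

    /**
     * Tries to set the static boolean SHOULD_USE_DEFERRED_LOCKS to false on the named class.
     * Returns true if the field was found and set, false otherwise.
     */
    public static boolean disableDeferredLocks(String className) {
        try {
            Field flag = Class.forName(className).getField("SHOULD_USE_DEFERRED_LOCKS");
            flag.setBoolean(null, false); // static field, so no instance is needed
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            // Class or field missing, or field not a writable static boolean
            return false;
        }
    }

    public static void main(String[] args) {
        boolean applied = disableDeferredLocks(ASSUMED_CLASS);
        System.out.println("Deferred locks disabled: " + applied);
    }
}
```

Treat this as a diagnostic switch only: if the hangs stop with deferred locks off, that points at the cache locking rather than at the database.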
If the deadlock is occurring in the cache, as with deferred locks, a possible solution is to disable the shared cache by setting the persistence.xml property "toplink.cache.shared.default" to "false".
You can also configure the TopLink cache isolation using DatabaseLogin.setCacheTransactionIsolation(); try CONCURRENT_READ_WRITE.
I would also recommend trying to upgrade to EclipseLink or TopLink 11.
We ran into a similar situation in TopLink 10g. We had a callback into TopLink for one of our custom fields, and in the custom field we were trying to run a TopLink query before a uow.commit. We removed the query from the callback and everything worked. We also had the cache usage set to readcacheonly, which was very bad.