We have a distributed JMS queue in our application with the redelivery limit set to 14 and a Redelivery Delay Override of 86400000 ms (one day). That means a message in this queue is retried once per day for up to 14 days and then moved to the error destination. Our application reads a JMS property to retrieve the current redelivery count of each message in the queue and logs it in a table.
We noticed the following behavior for a few messages in this JMS queue:
Suppose a message M1 is put on this queue today, say 1/5/2013. Ideally this message should reach the error destination queue after 14 days, i.e. on 1/20/2013, and only if it is never successfully delivered. On 1/7/2013 the redelivery count is '2' and on 1/8/2013 it is '3', which is correct. But on 1/9/2013 it is still '3', and on, say, 1/15/2013 it has a redelivery count of '6', which is not correct.
So the question is why a few messages in a JMS queue are not retried on particular days. As a result, a message that is supposed to reach the error destination on 1/20/2013 will probably reach it on some later date. Is there any reason for this kind of behavior for JMS messages? Can anyone help me understand this?
Edited by: 980116 on Jan 5, 2013 10:58 PM
I don't recall that there are any known problems with long term timers.
* Retry your testing with, say, 10-second timers instead of 1-day timers, to see if you can recreate the problem with a quick reproducer.
* Regardless, note that the redelivery count cannot be exact. There are conditions where a redelivery can occur even if the application never got the message, which would cause the message to appear to "disappear" for a day. Also, shutting down your server for long periods would throw off your current algorithm. In addition, a large backlog could throw it off as well: if, say, a message becomes visible but it still takes a long time for your system to retrieve and process it (due to a long wait in the queue and/or a lengthy message processing step), then the number of tries per day is necessarily going to be less than 1.
* I'm not sure, but I think it might help your design to use a more frequent retry, plus some very simple application logic that checks the age of a message and reacts to 14-day-old messages, rather than introducing a dependency on the delivery count. Also, you might want to consider reducing the "MessagesMaximum" configuration setting on your connection factories to one (the default is 10). This will reduce or eliminate the chance that a message that is already in the pipeline of an asynchronous consumer/MDB, but has not yet been seen by the application, is silently forced to redeliver on an error.
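
The age check suggested in the last point could look roughly like the sketch below. It is only an illustration, not tested against your setup: the class name MessageAgePolicy and its methods are hypothetical, and it assumes the send time is available as epoch milliseconds, for example from the standard Message.getJMSTimestamp() call.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper: decides by message age instead of by delivery count. */
final class MessageAgePolicy {
    // Assumed cutoff, matching the 14-day window described in the question.
    static final Duration MAX_AGE = Duration.ofDays(14);

    private MessageAgePolicy() {}

    /**
     * Age of a message at time 'now', given its send time in epoch milliseconds
     * (assumption: the value comes from something like Message.getJMSTimestamp()).
     */
    static Duration age(long sentEpochMillis, Instant now) {
        return Duration.between(Instant.ofEpochMilli(sentEpochMillis), now);
    }

    /** True once the message is at least MAX_AGE old, however often it was redelivered. */
    static boolean isExpired(long sentEpochMillis, Instant now) {
        return age(sentEpochMillis, now).compareTo(MAX_AGE) >= 0;
    }
}
```

A consumer could call MessageAgePolicy.isExpired(timestamp, Instant.now()) on each delivery and route the message to its own error handling when the result is true. Because the decision depends only on the send time, skipped retries, server downtime, or a backlog would not shift the 14-day cutoff.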