6 Replies Latest reply on Jul 26, 2010 3:11 PM by 805009

    OpenMQ stops delivering messages

    807581
      Hello,

      As the subject says, our asynchronous OpenMQ MessageConsumer clients (i.e. they receive messages via the 'onMessage(...)' method) stop receiving messages, although there are still messages in the queue matching the consumers' message selectors.

      Our situation is that we have quite a few consumers (tens to hundreds) on multiple queues, differentiated by message selectors. They "bridge" messages to external systems and can be started and stopped independently.
      Originally, each consumer used its own JMS connection to the broker, but we had problems with this approach: it seemed to consume a lot of memory, and establishing a connection often took quite some time. We therefore changed the model so that a few (one to twenty) consumers are collocated in the same process and share a single JMS connection, with each using its own JMS session. This seems to be where the "non-delivery" problems started.
      Sometimes, when a consumer is stopped and then restarted (i.e. it stops and re-creates its JMS session), it simply won't consume any messages. Restarting the whole process (i.e. using a new JMS connection) helps; restarting only the consumer (i.e. using a new JMS session) does not.

      I remember reading that one should not register an asynchronous consumer on an already started connection. On the other hand, I cannot find anything definitely saying that one may not do so. So the question is: could this be the source of our problems? If it is, how should we solve it? Using one connection per consumer doesn't seem to work so well, as mentioned above. Any ideas on how to diagnose why this happens?

      kind regards,

      Messi
        • 1. Re: OpenMQ stops delivering messages
          805009
          A few questions:

          When you see the situation you describe, where a MessageListener is not receiving messages even though there are messages on the queue that match its message selector, are there other MessageListeners on the same queue which are still receiving messages?

          Do your message selectors on a given queue "overlap"? That is, for a given message, is there only one consumer whose message selector will match that message, or might there be more than one?

          Nigel

          P.S. I'm not aware of any problems caused by starting a consumer on an already-started connection.
          • 2. Re: OpenMQ stops delivering messages
            807581
            Hello Nigel,

            The message selectors are not overlapping, i.e. the various consumers are disjoint and yes, other consumers of the same and other queues still receive messages (just verified that from our logs).

            The problem doesn't occur that often, and I have difficulty reproducing it locally. I'm 99% sure I reproduced it locally once with OpenMQ 4.3 but not with 4.4, so we upgraded, but the problem recurred on our production machine.

            It seems that if I start another process with its own connection, which unconditionally consumes the messages from the queue but rolls back afterwards, the messages get "released" to their "correct" consumers - we saw this when looking into the queue with our admin tool.

            The code is more or less verified to work, as the exact same code works well with SonicMQ in production environments (we're in the process of migrating to OpenMQ).

            Forgot to mention in my first post: It occurs both with OpenMQ 4.3 and 4.4, we don't have any MQ clustering.

            kind regards,

            Messi
            • 3. Re: OpenMQ stops delivering messages
              807581
              Hello again, Nigel,

              I'm currently trying to reproduce the behavior locally but keep failing: whenever I run the test (50 disjoint consumers on a queue, 1000 messages each, gradually produced) I get an exception on the broker, which seems to block the clients. This is not what happens on the server, as no broker exception occurs there; I just thought you might want to know. Should I file an issue? I haven't yet because I just noticed that this error was produced (locally) using Java runtime version 1.6.0_21-ea. I'll try changing that.

              OpenMQ Version: 4.4 Update 1 (Build 7-b)

              The broker log shows the following exception:
              java.util.ConcurrentModificationException
                   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
                   at java.util.HashMap$ValueIterator.next(HashMap.java:822)
                   at com.sun.messaging.jmq.util.lists.WeakValueHashMap$ValueIterator.hasNext(WeakValueHashMap.java:357)
                   at com.sun.messaging.jmq.util.lists.NFLPriorityFifoSet.addAllOrdered(NFLPriorityFifoSet.java:798)
                   at com.sun.messaging.jmq.util.lists.NFLPriorityFifoSet.addAllOrdered(NFLPriorityFifoSet.java:677)
                   at com.sun.messaging.jmq.util.lists.NFLPriorityFifoSet$FilterSet.addAllOrdered(NFLPriorityFifoSet.java:310)
                   at com.sun.messaging.jmq.jmsserver.core.Consumer.destroyConsumer(Consumer.java:496)
                   at com.sun.messaging.jmq.jmsserver.core.Session.detatchConsumer(Session.java:982)
                   at com.sun.messaging.jmq.jmsserver.core.Session.detatchConsumer(Session.java:838)
                   at com.sun.messaging.jmq.jmsserver.data.handlers.ConsumerHandler.destroyConsumer(ConsumerHandler.java:577)
                   at com.sun.messaging.jmq.jmsserver.data.handlers.ConsumerHandler.handle(ConsumerHandler.java:422)
                   at com.sun.messaging.jmq.jmsserver.data.PacketRouter.handleMessage(PacketRouter.java:181)
                   at com.sun.messaging.jmq.jmsserver.service.imq.IMQIPConnection.readData(IMQIPConnection.java:1355)
                   at com.sun.messaging.jmq.jmsserver.service.imq.IMQIPConnection.process(IMQIPConnection.java:542)
                   at com.sun.messaging.jmq.jmsserver.service.imq.OperationRunnable.process(OperationRunnable.java:170)
                   at com.sun.messaging.jmq.jmsserver.util.pool.BasicRunnable.run(BasicRunnable.java:493)
                   at java.lang.Thread.run(Thread.java:619)
              [21/Jul/2010:16:45:15 UTC] ERROR [B3100]: Unexpected internal broker error: [Uncaught Exception]:
              com.sun.messaging.jmq.jmsserver.util.BrokerException: [B4117]: Unexpected broker exception: [Unexpected Error processing message]
                   at com.sun.messaging.jmq.jmsserver.data.PacketRouter.handleMessage(PacketRouter.java:205)
                   at com.sun.messaging.jmq.jmsserver.service.imq.IMQIPConnection.readData(IMQIPConnection.java:1355)
                   at com.sun.messaging.jmq.jmsserver.service.imq.IMQIPConnection.process(IMQIPConnection.java:542)
                   at com.sun.messaging.jmq.jmsserver.service.imq.OperationRunnable.process(OperationRunnable.java:170)
                   at com.sun.messaging.jmq.jmsserver.util.pool.BasicRunnable.run(BasicRunnable.java:493)
                   at java.lang.Thread.run(Thread.java:619)
              Caused by: java.util.ConcurrentModificationException
                   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
                   at java.util.HashMap$ValueIterator.next(HashMap.java:822)
                   at com.sun.messaging.jmq.util.lists.WeakValueHashMap$ValueIterator.hasNext(WeakValueHashMap.java:357)
                   at com.sun.messaging.jmq.util.lists.NFLPriorityFifoSet.addAllOrdered(NFLPriorityFifoSet.java:798)
                   at com.sun.messaging.jmq.util.lists.NFLPriorityFifoSet.addAllOrdered(NFLPriorityFifoSet.java:677)
                   at com.sun.messaging.jmq.util.lists.NFLPriorityFifoSet$FilterSet.addAllOrdered(NFLPriorityFifoSet.java:310)
                   at com.sun.messaging.jmq.jmsserver.core.Consumer.destroyConsumer(Consumer.java:496)
                   at com.sun.messaging.jmq.jmsserver.core.Session.detatchConsumer(Session.java:982)
                   at com.sun.messaging.jmq.jmsserver.core.Session.detatchConsumer(Session.java:838)
                   at com.sun.messaging.jmq.jmsserver.data.handlers.ConsumerHandler.destroyConsumer(ConsumerHandler.java:577)
                   at com.sun.messaging.jmq.jmsserver.data.handlers.ConsumerHandler.handle(ConsumerHandler.java:422)
                   at com.sun.messaging.jmq.jmsserver.data.PacketRouter.handleMessage(PacketRouter.java:181)
                   ... 5 more


              The client threads block on:
              java.lang.Thread.State: TIMED_WAITING (on object monitor)
                   at java.lang.Object.wait(Native Method)
                   at com.sun.messaging.jmq.jmsclient.AckQueue.dequeueWait(AckQueue.java:148)
                   - locked <0xb48822d0> (a com.sun.messaging.jmq.jmsclient.AckQueue)
                   at com.sun.messaging.jmq.jmsclient.ProtocolHandler.writePacketWithAck(ProtocolHandler.java:627)
                   at com.sun.messaging.jmq.jmsclient.ProtocolHandler.writePacketWithAck(ProtocolHandler.java:575)
                   at com.sun.messaging.jmq.jmsclient.ProtocolHandler.writePacketWithReply(ProtocolHandler.java:430)
                   at com.sun.messaging.jmq.jmsclient.ProtocolHandler.createMessageProducer(ProtocolHandler.java:1266)
                   at com.sun.messaging.jmq.jmsclient.ProtocolHandler.createMessageProducer(ProtocolHandler.java:1234)
                   at com.sun.messaging.jmq.jmsclient.MessageProducerImpl.<init>(MessageProducerImpl.java:113)
                   at com.sun.messaging.jmq.jmsclient.QueueSenderImpl.<init>(QueueSenderImpl.java:64)
                   at com.sun.messaging.jmq.jmsclient.UnifiedSessionImpl.createSender(UnifiedSessionImpl.java:164)
                   at com.sun.messaging.jmq.jmsclient.UnifiedSessionImpl.createProducer(UnifiedSessionImpl.java:516)
              [... customer code ... ]
                   at com.sun.messaging.jmq.jmsclient.MessageConsumerImpl.deliverAndAcknowledge(MessageConsumerImpl.java:338)
                   at com.sun.messaging.jmq.jmsclient.MessageConsumerImpl.onMessage(MessageConsumerImpl.java:273)
                   at com.sun.messaging.jmq.jmsclient.SessionReader.deliver(SessionReader.java:113)
                   at com.sun.messaging.jmq.jmsclient.ConsumerReader.run(ConsumerReader.java:186)
                   at java.lang.Thread.run(Thread.java:619)

              kind regards,

              Messi
              • 4. Re: OpenMQ stops delivering messages
                805009
                I asked my previous questions to find out whether you might be seeing an artefact of MQ flow control, in which messages are by default pre-fetched to consumers in blocks of up to 1000 messages. Where there are multiple consumers on the same queue (and their message selectors are not disjoint), this can cause one consumer to be idle whilst another is still consuming messages. It sounds as if this is not the case here, but it's worth being aware of the possible effects of pre-fetching when investigating this issue.
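
                To illustrate the pre-fetching effect, here is a toy model, not OpenMQ's actual dispatch logic. The class name `PrefetchSketch`, the method `distribute` and the strict consumer-by-consumer hand-out are all assumptions made for illustration; only the block size of 1000 comes from the description above.

```java
class PrefetchSketch {
    /**
     * Toy model (hypothetical): the broker hands each consumer in turn a
     * block of up to prefetchLimit messages until the queue is empty.
     * Returns how many messages each consumer ends up holding.
     */
    static int[] distribute(int queued, int consumers, int prefetchLimit) {
        int[] held = new int[consumers];
        int remaining = queued;
        for (int c = 0; c < consumers && remaining > 0; c++) {
            int block = Math.min(prefetchLimit, remaining);
            held[c] = block;      // this consumer now "owns" the whole block
            remaining -= block;   // nothing from the block is left for others
        }
        return held;
    }

    public static void main(String[] args) {
        // 800 queued messages, two consumers with overlapping selectors,
        // pre-fetch block size of 1000 as described above.
        int[] held = distribute(800, 2, 1000);
        System.out.println("consumer 0 holds " + held[0]
                + ", consumer 1 holds " + held[1]);
    }
}
```

                In this model the first consumer is handed all 800 messages and the second sits idle until more arrive, even though it is perfectly healthy. With disjoint selectors, as in your case, each consumer only competes with itself, so this effect should not explain the stall.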

                We did have a bug that sounds a bit like this recently: https://mq.dev.java.net/issues/show_bug.cgi?id=50 I don't know whether this is the same bug, but it has now been fixed in promoted 4.5 build 11. The fact that we were able to fix it was due in no small part to a lot of work by the submitter in creating a reproducible test case...

                ...which brings us to the java.util.ConcurrentModificationException. Hmm. I think at least one such bug (in this part of the code) may have been fixed in 4.4u2. Can you see if it occurs with 4.4u2, which is the latest stable release? (See https://mq.dev.java.net/downloads.html ).
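
                For anyone unfamiliar with this exception, the sketch below shows the general fail-fast mechanism in `java.util.HashMap` that the top frames of the broker trace point at. It is not the broker's code: the class `CmeSketch` and its methods are hypothetical, and it triggers the check from a single thread, whereas in a broker a second thread modifying the map is the more likely cause.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class CmeSketch {
    private static Map<String, Integer> sampleMap() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    /** Returns true if changing the map during iteration raised the exception. */
    static boolean modifyWhileIterating() {
        Map<String, Integer> map = sampleMap();
        try {
            for (Integer ignored : map.values()) {
                // Structural change behind the iterator's back: the next
                // call to next() notices the changed modification count.
                map.remove("b");
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    /** Removes the value 2 through the iterator itself; returns the new size. */
    static int removeViaIterator() {
        Map<String, Integer> map = sampleMap();
        Iterator<Integer> it = map.values().iterator();
        while (it.hasNext()) {
            if (it.next() == 2) {
                it.remove(); // the iterator keeps its own bookkeeping in sync
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("exception raised: " + modifyWhileIterating());
        System.out.println("size after safe removal: " + removeViaIterator());
    }
}
```

                Removing through the iterator avoids the exception in single-threaded code; when several threads touch the same map, synchronization or a concurrent collection is needed instead.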

                You wouldn't want to use it for production, but you could also try the latest build of MQ 4.5 (from https://mq.dev.java.net/4.5.html#download ).

                Nigel
                • 5. Re: OpenMQ stops delivering messages
                  807581
                  Hello Nigel,

                  I'm retesting with the stable JDK 6 and OpenMQ 4.4. I also read the issue you mentioned (as well as the linked ones) and found Glassfish issue 4222 (https://glassfish.dev.java.net/issues/show_bug.cgi?id=4222), which looks a lot like our problem. The difference is that we use event-driven consumers, which probably wouldn't get the message if 'receive()' blocked, correct?
                  That issue is still open. Do you know whether it is Glassfish- or MQ-related, or do you have any further details?

                  Thanks for all the help today!

                  kind regards,

                  Messi
                  • 6. Re: OpenMQ stops delivering messages
                    805009
                    A couple of updates:

                    The ConcurrentModificationException bug I had in mind was http://bugs.sun.com/view_bug.do?bug_id=6788876 , which has been fixed in 4.4u2p1 and 4.5, but not in 4.4u2 as I suggested before.

                    I took a look at Glassfish issue 4222 (https://glassfish.dev.java.net/issues/show_bug.cgi?id=4222), which you referred to. This bug is still open and a quick test suggests it remains to be fixed, so I've raised its priority to make sure it gets looked at.