We have a distributed topic. The message producer is a multi-threaded/connection producer (OSB in our case).
On the other side of the DT there are single connection consumers.
Our problem is that the consumers sometimes recieve the messages in different order:
Consumer A recieves message N and then N+1 (correct behaviour).
Consumer B recieves message N+1 and then N (correct behaviour).
This does not happen all the time. Most of the time the order is correct (99.9..%).
We also noticed that the message timestamp for message N and N+1 is the same.
This creates a problem for our system since both consumers must get the messages in the same order.
We cannot use UOO.
Any idea if this is a correct behavior for DT?
Any idea how to make the messages get to all consumers in the same order without using UOO?
Hmm. You may have run into some new Bug given since "message timestamp for message N and N+1 is the same". First, let's dot our I's and cross our Ts to make sure that the system is setup in a way that will give you good ordering in the first place:
- Always use the same producer instance for each sequential send, and ensure that the producer's connection factory has "load balance" set to false. Note that OSB may be using a pool of producers implicitly for your sending app underneath-the-covers, which could lead to out-of-order, as different producers can load balance to different servers in your cluster. (I'm not sure how to check for this - if you provide a code snippet, I may be able to tell if this is happening.)
- Ensure that you only ever have a single thread processing the subscription - if you're using MDBs to receive from the subscription, then you need to ensure the MDB is setup to only startup a single thread (use a thread pool of size and/or set max-beans-in-free-pool to one)
- To account for out-of-order after app message processing failures (redelivery) with async consumers or MDBs, you need to (A) never configure redelivery delays, and (B) ensure the connection factory Maximum Messages setting is tuned down to 1 (it's default is 10).
I don't know how to do the above with OSB, which tends to layer it's own configuration on top of WL JMS configuration.
Assuming that you've assured all of the above, then you may have run into some sort of bug and I recommend filing a request with Oracle Support. In the mean-time, you might want to explore alternatives:
- Are you certain that you can't use UOO? It would be interesting to know why. This is a widely used feature (even by OSB itself), and it may be that you can enable it without any code change, plus, even if a code change is required, such changes tend to be isolated and small. In your use case, it looks like you may be able to configure a default UOO on the distributed topic itself (http://docs.oracle.com/cd/E11035_01/wls100/wlsmbeanref/core/index.html) - no code change needed - so that every destination member of your distributed destination will simply ensure that each new message is given the same UOO (a UOO that's particular to that member). OSB may provide some equivalent knob. Alternatively, you can code up your app so that each of your producers sets a useful UOO on each message.
- I assume you're using a replicated distributed topic (RDT). If there actually is a bug, then it would likely have something to do with the RDT's internal forwarders. If your consumers are MDBs or the SOA RA Adapter, then a simple alternative may be to use a Partitioned Distributed Topic (PDT) instead -- PDTs have no forwarders, and the MDB and the SOA RA Adapter can work with them transparently. If your consumers are neither of these, then working with a PDT will likely require that you use extensions (an advanced path that you may not want to take the time to follow).