
    Feedback on use of incubator command pattern


      We are currently prototyping some different solutions using the Coherence Incubator (namely the Command Pattern) and are looking for some feedback as to the viability of, and potential improvements to, the solution.

      h3. Summary of Prototype
      The prototype does the following (I have a nice sequence diagram for this but don't see a way to attach it :():

      + A client (e.g. through Coherence*Extend) calls a local API to save a "message" for a particular account (e.g. account id = 1234). This calls NamedCache.put and inserts an entry into the cache.
      + A BackingMapListener is configured for the cache into which the client indirectly inserts. In the prototype this is a Spring bean that extends AbstractMultiplexingBackingMapListener, which is fully "loaded" with all the required dependencies for the processing of the message (services, etc.).
      + The listener then registers a new context (using ContextManager) using a "grouping" id based on the sequence/ordering requirements. For example, say that each message against an account needs to be processed in order. The context would get instantiated with name = "1234", so that subsequent requests for account 1234 will get queued against the context with the same name whilst the previous request(s) are still processing. Messages for other accounts would register a different context name so they will get simultaneously processed.

      NB: The functionality of this listener can be paralleled to the sample in CommandPatternExample for one submission. I am not entirely clear where command submissions typically "tie in", but I am planning to kick them off from a backing map listener. I briefly explored using the 'com.oracle.coherence.common.events.dispatching.listeners.DelegatingBackingMapListener' to dispatch the commands, but am not entirely clear how this would tie in. As I understand it the delegating backing map listener is used within the 'live objects' context and dispatches entries that implement LifecycleAwareEntry, but I am not sure how we would create the "custom contexts" we require (i.e. the identifier is not the key of the cache entry but rather a subset of it - e.g. account id versus account message id).

      + A command is then created to process the account message, which is composed of:
      - the Account which needs to be processed (the value seen by the backing map listener contains the Account itself)
      - any components that are required during processing (services, DAOs, etc. - a service might itself be injected with DAOs, etc.)
      + The newly instantiated command is then submitted to the CommandSubmitter for the appropriate contextIdentifier (the one registered for "1234" in our example).

      From some basic tests, the prototype is behaving as I desire - i.e. it queues and "synchronizes" the commands for the same context and also simultaneously processes commands assigned to different contexts asynchronously. That's great.
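
      To make the flow above concrete, here is a rough sketch of what the prototype listener does. This is only a sketch: the incubator class and package names are written from memory and may differ between releases, and AccountMessage and ProcessMessageCommand are hypothetical application types for the purposes of this post.

      {code}
      import com.oracle.coherence.common.backingmaplisteners.AbstractMultiplexingBackingMapListener;
      import com.oracle.coherence.common.backingmaplisteners.Cause;
      import com.oracle.coherence.common.identifiers.Identifier;
      import com.oracle.coherence.patterns.command.ContextsManager;
      import com.oracle.coherence.patterns.command.DefaultCommandSubmitter;
      import com.oracle.coherence.patterns.command.DefaultContextsManager;
      import com.oracle.coherence.patterns.command.GenericContext;
      import com.tangosol.net.BackingMapManagerContext;
      import com.tangosol.util.MapEvent;

      public class AccountMessageListener extends AbstractMultiplexingBackingMapListener {

          public AccountMessageListener(BackingMapManagerContext context) {
              super(context);
          }

          public void onBackingMapEvent(MapEvent mapEvent, Cause cause) {
              // AccountMessage is our (hypothetical) cache value; it knows its account id
              AccountMessage message = (AccountMessage) mapEvent.getNewValue();

              // one context per account, so messages for the same account are processed in order
              String contextName = String.valueOf(message.getAccountId());
              GenericContext context = new GenericContext();
              context.setValue(contextName);

              // NB: we still need to confirm how repeated registration of the same name behaves
              ContextsManager contextsManager = DefaultContextsManager.getInstance();
              Identifier contextId = contextsManager.registerContext(contextName, context);

              // a new, lightweight command per message; commands for the same context queue up
              DefaultCommandSubmitter.getInstance().submitCommand(contextId, new ProcessMessageCommand(message));
          }
      }
      {code}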

      However, there are a number of things I am exploring for the actual implementation. I believe most of these are typical concerns so I wonder if Oracle or anyone can provide some feedback from past experience/proposed recommendations:

      h3. Questions

      h4. 1. Grid/server-side Business Logic Deployment

      One of the things that has occurred to us is that ideally we would like to store the business processing logic (i.e. the heart of the processing within the command) either inside the grid or within a Coherence node (i.e. made available through the classpath at node startup).

      In our case we have a few different "processing models", but ideally the processor/command will simply determine the appropriate control flow (i.e. within the command - or maybe the appropriate lifecycle if we end up using that) and associated business logic based on the attributes of the object to be processed. I am not sure if our use case is typical, but to be clear we have a fair bit of business logic to be performed within the 'command', each in separate modules. In implementation, most modules will be interacting with the grid for lookups, etc., but ideally that will be abstracted from the Processor/Command, which will only know that it is using an 'accountService', for example.

      Currently the business logic is "loaded" into the listener and "passed on" to the command through composition. Ideally we want the command to be lightweight, and the various "processing models" would either:

      a) be deployed to each node and somehow made "available" to the command during execution. We would need to work out how this would become available to the execution environment; perhaps each 'Context' would wrap the processing details. However, even this is a bit too granular, as a single processing model will likely apply to many contexts.

      b) Perhaps the business logic/processing components are deployed to the cache itself. Then, within the command, attributes on the object would be consulted to determine which processing model to "apply", and a simple lookup could return the appropriate control flow/processor(s) (see the sketch after this list).

      c) Perhaps the different logic/flow is embedded in a different "lifecycle" for the event processing, and the appropriate lifecycle is detected by the listener and appropriately applied. Even with such a model we'd still like the various processing for each phase to be maintained on the server if possible.
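
      To make option (b) a little more concrete, the kind of thin command I have in mind is sketched below. The 'processing-models' cache name, ProcessingModel and AccountMessage are hypothetical, and I'm assuming the incubator's Command/ExecutionEnvironment interfaces; the looked-up cache would also need to run on a different service from the one executing the command.

      {code}
      import java.io.Serializable;
      import com.oracle.coherence.patterns.command.Command;
      import com.oracle.coherence.patterns.command.ExecutionEnvironment;
      import com.tangosol.net.CacheFactory;
      import com.tangosol.net.NamedCache;

      public class ProcessMessageCommand implements Command, Serializable {

          private final AccountMessage message;   // the only state the command carries

          public ProcessMessageCommand(AccountMessage message) {
              this.message = message;
          }

          public void execute(ExecutionEnvironment environment) {
              // "processing-models" would be a replicated cache, updated centrally on upgrade
              NamedCache models = CacheFactory.getCache("processing-models");
              ProcessingModel model = (ProcessingModel) models.get(message.getProcessingModelKey());

              // the model encapsulates the control flow; the command just delegates to it
              model.process(message);
          }
      }
      {code}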

      Has anyone else done something like this, and/or are there any thoughts about deploying the business logic to the grid this way? I see advantages/disadvantages with the different solutions, and some of them seem better for upgrades. For example, if you upgrade the processing logic whilst requests are still coming in (clearly you would attempt to avoid this) and it is embedded into each node, what would happen when a request comes to a node that has already been upgraded? Say one of the business logic modules performs a query against the cache which needs to consult another node (e.g. assuming you're using partitioned data), that node has not received the upgrade, and there's a conflict. In that regard, perhaps deploying the different processing logic to a replicated cache makes more sense, because once updated it should get pushed immediately to all nodes?

      Are these known concerns? I'm new to grid-side processing concepts, so just correct me if there's an obvious issue with this.

      h4. 2. Cleanup/Management of contexts
      One thing I noticed in my prototype is that the contexts I create don't really go away. We are envisioning creating many contexts per day (let's just say a few hundred million to be safe)

      so ...

      a) how do people normally remove the contexts? Does the command framework sort this out behind the scenes? I can see the 'stop' method on the CommandExecutor removing the context, but from a quick follow-through the only scenario which seems to potentially call this is if the context version number has changed. Is there some way to change the version when we submit additional commands to the same context?

      b) Is there an issue with creating this many contexts? As mentioned earlier, to reduce overhead ideally the context will not be too heavy, but any thoughts on our intended usage? We could use something like a hashing scheme to "bucket" the requests onto a fixed set of contexts to reduce the total number of contexts if required (see the sketch below), but this is not ideal.
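
      For illustration, the bucketing we have in mind is nothing more complicated than this (the bucket count is arbitrary):

      {code}
      // map an unbounded set of account ids onto a fixed number of context names;
      // accounts sharing a bucket get serialized unnecessarily, but the total
      // number of contexts stays bounded
      private static final int CONTEXT_BUCKETS = 1024;

      private String contextNameFor(long accountId) {
          return "account-bucket-" + (Math.abs(accountId) % CONTEXT_BUCKETS);
      }
      {code}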

      h4. 3. Creation of a new Command every time
      In our scenario, each command needs to act upon a given object (e.g. one account). As I see it, this requires us to create a new Command for each message, because I do not see a way to 'pass in' the object to the execute method. Setting it on the context does not work either, because we need to queue a few requests against each given context; I played with wrapping the object with GenericContext and setting the value, but in reality we're submitting commands whilst others are currently being processed, so I don't see how this could work.

      Any thoughts on this? Do you agree we'll have to create a new command for every message to be processed? We'll likely have millions of commands per day, so this will make a difference for us (although if we eliminate the logic from Q#1, or the dependencies are singletons, it's not a big deal).

      h4. 4. Concurrency guarantees with the commandpattern

      I also want to confirm my understanding of concurrency controls around the command pattern. Unlike an entry processor, which controls updates to the entry upon which it was invoked, the command pattern only guarantees concurrency against processing occurring within the context of the currently operating command. Commands submitted to the same context will be processed synchronously, but any entries which may have had a listener that spawned the command submission are in no way guarded. This latter point is pretty obvious I believe, since there's no real link, but I just want to make sure my assumptions are correct.

      NB: in the scenario I am describing we do NOT need to update the original cache entry into which the account message was submitted. Instead other caches will be updated with results from additional processing logic so this is not that much of an issue for us.

      h4. 5. Confirmation of concerns with "straight" entry processor
      If we were to use a "straight" entry processor (versus command pattern which uses entry processor) which gets kicked off from a threadpool on a backing map listener (for example on insert or update), is it true that if a node were to go down, we would have issues with failover? NB: The reason we would kick off the entry processor from a threadpool would be to "simulate" asynchronous processing. As I see it, if we kicked off a thread on the listener and returned back to the client, nothing would "re-submit" the request if a node goes down. Is that correct?

      ALTERNATIVELY, as I understand it, with an entry processor invoked from a client, it is the client-side Coherence jar that receives the exception when a node goes down mid-process, and the Coherence jar takes care of "re-sending" the request to another node. So if the threadpool is managed by the client and the client kicks off an invoke in one of the threads, then I believe the client WILL re-submit the entry processor request if the node goes down - through the Coherence jar/Extend. I'm not sure of the details, but my point is that the client application does not have to provide any code for the "failover"; the Coherence client jar performs this.
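
      For contrast, this is roughly what I mean by the client-driven variant (ProcessMessageProcessor is a hypothetical EntryProcessor and the cache name is made up). Because the invoke goes through the Coherence client, my understanding is that the client jar handles re-submission if the owning node dies mid-execution:

      {code}
      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;

      import com.tangosol.net.CacheFactory;
      import com.tangosol.net.NamedCache;

      public class ClientSideSubmitter {

          private final ExecutorService pool = Executors.newFixedThreadPool(4);
          private final NamedCache accountMessages = CacheFactory.getCache("account-messages");

          public void submitAsync(final Object messageKey) {
              pool.submit(new Runnable() {
                  public void run() {
                      // the invoke travels via the Coherence client, which re-submits the
                      // EntryProcessor to the new owner of the key if a node fails mid-execution
                      accountMessages.invoke(messageKey, new ProcessMessageProcessor());
                  }
              });
          }
      }
      {code}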

      h4. 6. Lifecycle
      I have not explored the "lifecycle" functionality available within the Incubator, but as I understand it the main thing it could offer is that if we have many phases of processing (as we do in most of our use cases), the processing can be managed with the different lifecycles. NB: To be clear, I am referring to 'live objects' with their own series of processing steps - I'm not 100% sure if Lifecycle directly relates to 'live objects'. If a node goes down while in the midst of processing 200,000 commands, the entire processing doesn't need to start over; each request would need to go back to the previously completed phase of the lifecycle, but may well avoid duplicated processing. All processing will need to be idempotent regardless, but lifecycles could avoid re-processing what was already complete.

      Is this correct?
      Other benefits?
      (e.g. configurable processing logic as alluded to in Q#1).
      Thanks very much

      Edited by: 822486 on 21-Dec-2010 16:23

      Edited by: 822486 on 21-Dec-2010 16:59
        • 1. Re: Feedback on use of incubator command pattern
          Hi User 822486,

          When delving into a detailed prototype like the one you have below, it's often useful to understand the use cases and business requirements before jumping into a solution. I think it may be best for you to reach out to the Coherence organization within Oracle to further discuss these questions in detail so we can better guide you in the different ways to solve problems with Coherence and the Incubator. I'll do my best to comment on your prototype and address the questions that you currently have:

          {quote}
          NB: The functionality of this listener can be paralleled to the sample in CommandPatternExample for one submission. I am not entirely clear where command submissions typically "tie in", but I am planning to kick them off from a backing map listener. I briefly explored using the 'com.oracle.coherence.common.events.dispatching.listeners.DelegatingBackingMapListener' to dispatch the commands, but am not entirely clear how this would tie in. As I understand it the delegating backing map listener is used within the 'live objects' context and dispatches entries that implement LifecycleAwareEntry, but I am not sure how we would create the "custom contexts" we require (i.e. the identifier is not the key of the cache entry but rather a subset of it - e.g. account id versus account message id).
          {quote}

          Command submissions are just that, submissions to the command pattern for execution, and they can be triggered from anywhere since they run asynchronously. The DelegatingBackingMapListener and the associated eventing model provide you with the foundations for building an Event Driven Architecture on top of Coherence. It's used by both the Push Replication Pattern and the Messaging Pattern, which you could use as references if you wanted to go down the path of using the eventing model as well. It really comes down to your use case (which I don't have a lot of details on at the moment). An Entry that is a LifecycleAwareEntry can basically take action when its state is changed (an event occurs). As a completely bogus example, you could have an AccountMessageDispatcher object in a cache with a DelegatingBackingMapListener configured, and you could submit EntryProcessors to this dispatcher giving it a set of messages to perform for a set of accounts. Every time it is updated, the Dispatcher could then submit commands for execution. In essence it's formalizing an approach to responding to events on entries - or server-side event-driven programming.
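
          Purely to illustrate the shape of that bogus example, a rough sketch (AccountMessageDispatcher, AccountMessage and addPendingMessages are made up for the example):

          {code}
          import java.util.List;

          import com.tangosol.util.InvocableMap;
          import com.tangosol.util.processor.AbstractProcessor;

          public class EnqueueMessagesProcessor extends AbstractProcessor {

              private final List messages;    // the messages to hand to the dispatcher

              public EnqueueMessagesProcessor(List messages) {
                  this.messages = messages;
              }

              public Object process(InvocableMap.Entry entry) {
                  AccountMessageDispatcher dispatcher = (AccountMessageDispatcher) entry.getValue();
                  dispatcher.addPendingMessages(messages);

                  // updating the entry is what fires the backing map event; the configured
                  // DelegatingBackingMapListener can then react and submit commands for execution
                  entry.setValue(dispatcher);
                  return null;
              }
          }
          {code}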

          h2. Grid/server-side business logic deployment

          Have you looked at the processing pattern at all? It's a framework for building compute grids on top of Coherence and may have more plumbing in place for you to achieve what you're looking for. I think it may be best for us to discuss your use case in more detail to understand the pros and cons of each approach before commenting further on a solution for you.

          h2. Cleanup and Management of contexts

          Contexts are marker interfaces so they can be incredibly lightweight which should allow you to create as many of them as you need. The biggest concern is ensuring that you have enough processing power in your grid to handle the volume of work you want to manage. This should be a simple matter of figuring out your load and sizing your cluster appropriately. The initial design of the command pattern was to have a set of well established contexts that would be used repeatedly. Given that the Command Pattern is primarily an example, you could extend the DefaultContextsManager to have an unregisterContext method.

          h2. Creation of new command every time

          I'm a little confused by your requirement here. Are you saying that you have a set of pre-defined operations that you want to apply to an account, for example incrementAccountBalanceBy1? If so, I don't understand why you couldn't submit the same command instance to a context multiple times. While I wouldn't recommend using statics, you could have a CommandFactory that returns the same command each time you call getCommand, once it has been instantiated. Usually, however, we expect that you'll have some additional data unique to each message that the command must execute. This could be handled by having a setter on your command for these properties.
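
          For example, something along these lines (a sketch only, assuming the command pattern's Command/ExecutionEnvironment interfaces; AccountMessage is your application type and the incubator imports are omitted):

          {code}
          public class ApplyMessageCommand implements Command, Serializable {

              private AccountMessage message;

              // per-message state is set before each submission
              public void setMessage(AccountMessage message) {
                  this.message = message;
              }

              public void execute(ExecutionEnvironment environment) {
                  // business processing for 'message' goes here
              }
          }
          {code}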

          h2. Concurrency Guarantees

          The Command Pattern guarantees that, for a given context, commands are processed synchronously in the order they are received. If you have multiple submitters sending commands to the same context, then the order in which the commands are processed will be based on the order in which they arrive at the node where the Context resides. A context is the control point that gives commands their ordering.

          h2. Confirmation of concerns with "straight" entry processor

          I'm not sure if I follow your question here. EntryProcessors are guaranteed to execute, even in the failure scenario (this is why they're backed up and why they must be idempotent). If you're referring to a backing map listener handling your processing rather than submitting commands, then it's a matter of whether you're processing the events asynchronously or not. If you are synchronously processing things and your node dies while the BML is executing, you're right - a node failure at that point will result in "nothing happening" and the client will re-try. If, however, you're asynchronously handling the events from your BML, then you could lose state. This is why we use entries the way we do in the common event layer: we persist state on an entry that we can't lose when a node fails. This allows us to asynchronously process the data after the node has been updated.

          h2. Lifecycle

          With respect to lifecycle, if you're referring to LifecycleAwareEntry - this is a way of designating that an Entry in the cache can process events when it is modified/mutated. This may be better discussed by phone or in person.
          • 2. Re: Feedback on use of incubator command pattern
            Hi Noah,

            Thanks for the quick response.

            I agree it would be easier to discuss in person and get more details around our particular use cases. We were actually fortunate enough to have Oracle visit recently during high-level design discussions for our project, and as we understood it we should filter all questions through this forum, hence the post. I'm more than happy to engage in comms through other mediums. I actually generalized the scenario and provided a hypothetical processing scenario (accounts), but the challenges are accurate. I didn't lay these out in detail in my post, so I will provide that context if we continue discussions in this forum, or otherwise perhaps via email or another medium. We are located in Australia (Sydney), so I'm not sure if that changes any potential in-person meetings. Actually, though, I do think most of the questions we have (if I can clearly communicate them :)) may well apply to other teams, so they might prove useful on this forum.

            h4. Grid/server-side business logic deployment
            No, I haven't really looked at the processing pattern, but I will. I guess generally this question is around the management/storage of grid-side processing logic. Clearly Coherence promotes (and rightly so! :)) grid-side processing. As such I was wondering if there is much in place to store the processing logic in the grid as well, rather than having to "send" the logic to the grid each time. I will explore the processing pattern and see if this will help. We do have some very explicit ordering requirements so I'm not sure if that will work, but I'll check it out. I have ideas on how to handle this anyway (mentioned in my initial post) but didn't want to re-invent the wheel, and wondered if there are any recommended/in-built solutions to house the logic, because it seems like it would be a common concern in a grid-side processing model.

            h4. re: Command Submissions.
            Thanks for that. I will explore the LifecycleAwareEntry and/or perhaps use the DelegatingBackingMapListener. In reality the simple listener (extending AbstractMultiplexingBackingMapListener) met our requirements but I wasn't sure if there was a recommended way to dispatch commands. I'll probably start simple and evolve to a more managed model (e.g. Lifecycle) if required.

            h4. Cleanup and Management of contexts
            Great, thanks. I thought we might need to manually clean up the contexts but just wanted to confirm. We'll look at extending or potentially having a nightly job clean up the contexts. Good to know they're incredibly light :) and we'll plan to just clean them up. I accidentally overstated the quantity anyway... it should only be a few million per day (not a few hundred million :)).

            h4. Creation of new command every time
            Sorry, I must not have been clear. It's not a requirement per se, but rather I just want to ensure we are not unnecessarily creating commands, particularly given the volumes coming through our system. Put another way: if we need a command to operate on a different instance of an object, we will need a new instance of a command for each "processing" of the object. Compare this to a singleton "service" class that takes an object as a method parameter and performs the processing, requiring only one service. If our commands are lightweight it is a negligible concern, and I think we can get them lightweight (i.e. composed only of the object instance on which they operate, for the most part), so perhaps we let this question go. NB: I don't think we'll have the option of "batch processing" as you alluded to [e.g. processing a list of accounts] due to ordering and attempts at perceived "real-time" processing, but I appreciate the thought and will ponder that for a bit.

            h4. Concurrency Guarantees
            Great thanks.

            h4. Confirmation of concerns with "straight" entry processor
            Great, thanks. The scenario I was questioning was not an "issue" with an entry processor per se - rather a combination of Coherence features to avoid, really: asynchronously handling events from a BML without using commands. As I see it, this is an advantage of using the command pattern - it allows you to guarantee execution once the command is "submitted", whereas the state might be lost with a "straight" entry processor if you've already returned to the client.
            To be clear (hopefully :)), if the events are not asynchronously dispatched from a BML, but rather from the client itself (through a threadpool on the client) - then I believe the entry processing is guaranteed - because the client will effectively re-launch the processor.
            In reality, I am planning to use the command pattern so it's not an issue but I wanted to make sure my understanding was correct.

            Putting it in code (total pseudocode just conceptually showing the "problematic" scenario) - having this in a backing map listener could result in the inability of the entry processor to complete:

            {code}
            private TaskExecutor taskExecutor;   // e.g. a Spring TaskExecutor, injected

            public void onBackingMapEvent(final MapEvent mapEvent, Cause cause) {
                taskExecutor.execute(new Runnable() {
                    public void run() {
                        // fire-and-forget: nothing re-submits this if the owning node dies mid-invoke
                        CacheFactory.getCache("cache_name")
                                    .invoke(mapEvent.getKey(), new SomeEntryProcessor());   // placeholder processor
                    }
                });
            }
            {code}

            If the node goes down during processing of the EntryProcessor, the client has already received its response and nothing will "re-dispatch" the processor - correct?

            Edited by: 822486 on 22-Dec-2010 14:28

            Edited by: 822486 on 22-Dec-2010 15:21
            • 3. Re: Feedback on use of incubator command pattern
              Hey User 822486,

              I do think there would be value in having some follow-up conversations with you to further understand your use cases and how they apply to the incubator patterns. In particular, I'd like to evaluate in more detail whether the command pattern, the processing pattern, or some hybrid approach combining elements of both would be the right solution to your problem set. It may be beneficial to communicate via email to see if we might find a good time to discuss your requirements in more detail.

              h2. Grid/server-side business logic deployment

              The processing pattern is a compute grid API built on top of Coherence. It's a framework much like the messaging and push replication patterns, and was the natural evolution of the command/functor pattern. It takes grid-side processing to the next level. An alternative approach might be to build an EDA to solve your problems. This can be achieved by using lifecycle-aware objects that respond to updates in the grid itself. The messaging pattern uses this as the foundation for passing messages around the system. This is where understanding your requirements can help me to guide you to the best solution.

              h2. re: Command Submissions

              I am a fan of starting simple and adding complexity as needed. It sounds like you're on the right track.

              h2. Creation of a new command every time

              Even without batch processing, the reality is that you're serializing each EntryProcessor/command to the server for processing anyway, which will result in a new instance, unless you follow some of the EDA practices I've mentioned above where you place an account in a queue to be processed server-side. Even then, the best practice for doing this would be to use an EntryProcessor to place it on the queue on the server. This is where the rubber meets the road and the devil is in the details. As mentioned above, understanding your use case would be valuable.

              h2. Confirmation of concerns with "straight" entry processor

              The approach you've written below is actually dangerous for a few reasons. Yes, if the node goes down the retry logic for the invoke would not run, but you've got a second problem on your hands as well. If the cache that you're trying to run that invocable against happens to be running on the same service as your BackingMapListener, you could wind up running out of threads to process requests. In essence, you shouldn't make calls through the NamedCache API on the server unless you're certain the caches are running on different services, since requests wind up running on service threads (and these calls are not re-entrant).

              There are ways of building asynchronous, event-driven processing models on top of Coherence that have ordering guarantees. The Processing Pattern is one, the Command Pattern is another, and what we've done with the Messaging Pattern is a third. Finally, the eventing model we've got in coherence-common is a good foundation for building these solutions out. It may be helpful for you to look not only at the DelegatingBackingMapListener but also at the events packages within common. The Messaging Pattern was built on top of these.
              • 4. Re: Feedback on use of incubator command pattern
                Thanks again Noah.

                I'm actually not sure of the email address/details, but I imagine my manager might know. He's off for another week, so I'll check with him when he's back.

                I'm definitely keen to receive advice, so if a custom EDA solution is the way to go I'll take a look at the events package and DelegatingBackingMapListener in more detail and get your advice once I summarize our requirements in more detail in email. My current solution of submitting commands from the BML is working, but I did see that I would probably need to extend the solution to provide error handling when exceptions are raised after the command is submitted and being executed. I imagine this would be easy to tailor in a custom solution (although I didn't think it was going to be too hard with the command pattern either, but I haven't gotten to trialling a solution for that yet).

                And yes - very aware of that re-entrancy scenario! ;) Currently we are making NamedCache calls within our command, and it works because the commands end up on the DistributedCacheForCommandPatternDistributedCommands service, as I understand it (we're using ManagementStrategy.DISTRIBUTED). Prior to that I was using two different services when I was trialling straight entry processors from the client, but was aware of the risks. I did write directly to the backing map for some writes, which worked, but I quickly saw we needed filtered queries, so I opted for different services for the cache on which the BML was listening and the cache to which I was writing. I put some comments in the config about the risks of changing the service name, as the effects of that are not obvious, but clearly that's not ideal.

                Hopefully chat soon via email.

                btw - happy New Year! Well... technically tomorrow for you, if my assumption that you're in the States is correct. :)