4 Replies Latest reply: May 12, 2014 10:55 AM by Paula B-Oracle

    Spawning subprocesses in a Replication Manager environment

    949491

      Hi there,

       

      We've built a transactional DB application as a few multi-threaded read/write processes, spawned by a “watcher” process. The watcher coordinates recovery and runs failchk, process restarts, etc., as per the Reference Guide (p. 146, related processes). It's working really well, and we would now like to turn the system into an HA Replication Manager application.

       

      A couple of our processes, which open their database environment read/write, occasionally spawn short-lived subprocesses. These subprocesses call some external services and do not touch the Berkeley DB environment in any way. I'm not sure whether this will still be possible under Replication Manager, though. The Reference Guide states on p. 208 (in programming considerations): 'It is not supported for a process running Replication Manager to spawn a subprocess'.


      Do all processes that update the environment in a Replication Manager application “run” Replication Manager and, by implication, cannot spawn subprocesses (even if these subprocesses don't try to open the environment)? Alternatively, is the first process, the one that calls repmgr_start() and receives the replication notification events, the only one that "runs" Replication Manager?

       

      Hope that makes sense. I'm on db-5.3.21 running Ubuntu 12.04 LTS.

       

      Thanks very much

      Dave

        • 1. Re: Spawning subprocesses in a Replication Manager environment
          Paula B-Oracle

          Do all processes that update the environment in a Replication Manager application “run” Replication Manager and, by implication, cannot spawn subprocesses (even if these subprocesses don't try to open the environment)?

          Yes, any process which updates the environment "runs" Replication Manager and therefore cannot spawn subprocesses, regardless of whether the spawned subprocesses themselves attempt to open the environment.

           

          Alternatively, is the first process, the one that calls repmgr_start() and receives the replication notification events, the only one that "runs" Replication Manager?

          There is the main replication process (aka the listener process) that receives replication notification events and that process should have called repmgr_start() explicitly. Other processes may call repmgr_start() as well, and will be subordinate processes. If one of these other processes updates the environment but did not call repmgr_start() explicitly, we will start Replication Manager in it behind the scenes, so it is still considered to be running Replication Manager.

           

          I'm sorry, but our restriction against spawning a subprocess from a process running Replication Manager needs to stand. We have encountered a variety of issues when combining multi-process programming with multi-threaded programming, and of course, Replication Manager is multi-threaded. We do have an item on our list of future tasks to look into this further and see if we can eliminate or reduce this restriction, but I can't make any commitment about when we might do this or what our conclusion will be.

           

          Paula Bingham

          Oracle

          • 2. Re: Spawning subprocesses in a Replication Manager environment
            949491

            Thanks very much Paula.  I can figure out how to design around the restriction by introducing another layer between the spawning processes.
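
One way to picture the extra layer mentioned above, as a minimal sketch rather than anything confirmed by Oracle in this thread: a small helper process is forked before the parent opens the Berkeley DB environment, never touches Berkeley DB itself, and runs external commands on request. The names `spawn_helper_start`, `spawn_helper_run` and `spawn_helper_stop` are hypothetical, the pipe protocol is an assumption, and so is the idea that forking before the environment is opened satisfies the restriction. Callers must serialize requests if several threads share one helper.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handle: the helper's pid plus the two pipe ends we keep. */
struct spawn_helper {
    pid_t pid;
    int to_helper;    /* commands are written here, one per line */
    int from_helper;  /* exit codes are read back from here */
};

/* Runs in the helper: read a command line, run it, report its exit code.
 * Lines longer than the buffer are not handled in this sketch. */
static void helper_loop(int in_fd, int out_fd)
{
    FILE *in = fdopen(in_fd, "r");
    char line[1024];

    while (in != NULL && fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\n")] = '\0';
        int status = system(line);  /* the fork/exec happens here, not in the DB process */
        int code = (status != -1 && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
        if (write(out_fd, &code, sizeof code) != (ssize_t)sizeof code)
            break;
    }
    _exit(0);
}

/* Call BEFORE opening the Berkeley DB environment. Returns 0 on success. */
int spawn_helper_start(struct spawn_helper *h)
{
    int cmd_pipe[2], res_pipe[2];

    if (pipe(cmd_pipe) != 0)
        return -1;
    if (pipe(res_pipe) != 0) {
        close(cmd_pipe[0]);
        close(cmd_pipe[1]);
        return -1;
    }
    pid_t pid = fork();
    if (pid == 0) {                 /* helper: keep only its own pipe ends */
        close(cmd_pipe[1]);
        close(res_pipe[0]);
        helper_loop(cmd_pipe[0], res_pipe[1]);
    }
    close(cmd_pipe[0]);
    close(res_pipe[1]);
    if (pid < 0) {
        close(cmd_pipe[1]);
        close(res_pipe[0]);
        return -1;
    }
    h->pid = pid;
    h->to_helper = cmd_pipe[1];
    h->from_helper = res_pipe[0];
    return 0;
}

/* Ask the helper to run cmd via the shell; returns its exit code, or -1. */
int spawn_helper_run(struct spawn_helper *h, const char *cmd)
{
    assert(cmd != NULL && strchr(cmd, '\n') == NULL);
    size_t len = strlen(cmd);
    int code;

    if (write(h->to_helper, cmd, len) != (ssize_t)len)
        return -1;
    if (write(h->to_helper, "\n", 1) != 1)
        return -1;
    if (read(h->from_helper, &code, sizeof code) != (ssize_t)sizeof code)
        return -1;
    return code;
}

/* Closing the command pipe gives the helper EOF, so it exits; then reap it. */
void spawn_helper_stop(struct spawn_helper *h)
{
    close(h->to_helper);
    close(h->from_helper);
    waitpid(h->pid, NULL, 0);
}
```

A worker would call `spawn_helper_start()` first, then open its environment, and later call `spawn_helper_run()` wherever it used to fork and exec directly.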

             

            I am struggling a bit to figure out what to do with the watcher, though. The watcher clearly can't hold an open (replication-enabled) environment handle and also spawn the processes that it watches over. That conflicts with the watcher calling failchk() and registering the associated handler, as the reference guides suggest.

             

            Is it possible to do this with Replication Manager? 

            • 3. Re: Spawning subprocesses in a Replication Manager environment
              Paula B-Oracle

              We need to look into this in more detail and it will be several weeks before we can get to it. We will reply with more information about what is possible in Replication Manager when we have done this.

               

              But my initial look at this gave me a few pieces of information which are worth mentioning now:

              1. You should seriously consider upgrading to Berkeley DB 6.0. We did some work in that release that makes the use of failchk with Replication Manager much more reliable.
              2. It may be OK that your watcher process opens an environment handle and calls failchk(), but we cannot be sure about this until we look into it.
              3. It is definitely NOT OK for your watcher process to open/close database handles or to perform any transactions or checkpoints. These are things that cause us to start Replication Manager behind the scenes and would subject your watcher process to the restriction against spawning subprocesses.

               

              Paula Bingham

              Oracle

              • 4. Re: Spawning subprocesses in a Replication Manager environment
                Paula B-Oracle

                I am following up on the investigation I said we would do in my previous reply.

                 

                We have been able to confirm through internal testing that it is possible to create a watcher process that opens and closes environment handles, runs recovery and uses failchk without ever invoking Replication Manager. If you limit your watcher process to these Berkeley DB operations, we can support it spawning subprocesses that themselves use Replication Manager.

                 

                The following operations should only be done in the spawned subprocesses: starting Replication Manager, opening and closing database handles, and performing get/put operations, transactions and checkpoints. Subprocesses that perform any of these operations cannot spawn further subprocesses because they would be running Replication Manager.
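
The division of labour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Oracle's implementation: the `watcher_hooks` structure and every function name in it are hypothetical, and the Berkeley DB calls are represented by function pointers so the control flow can be shown without the library. In a real watcher, the `failchk` hook would open an environment handle, call failchk and close the handle; the `recover` hook would run recovery; and only the `worker` hook, which runs in the child, would start Replication Manager, open databases, or perform transactions and checkpoints.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical hooks; the names are illustrative, not Berkeley DB API. */
struct watcher_hooks {
    int (*failchk)(void);  /* 0 if the environment is still usable */
    int (*recover)(void);  /* 0 if recovery succeeded */
    int (*worker)(void);   /* child body; its return value is the exit status */
};

/* Fork one worker. The watcher itself never opens databases or runs
 * transactions, so it is never "running" Replication Manager. */
pid_t watcher_spawn(const struct watcher_hooks *hk)
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(hk->worker());
    return pid;  /* -1 if fork failed */
}

/* Call after waitpid() has reported that pids[dead] exited.
 * Returns 0 if only that worker was restarted, 1 if all workers were
 * stopped, recovery ran and all were restarted, -1 on failure. */
int watcher_handle_exit(const struct watcher_hooks *hk, pid_t *pids, int n, int dead)
{
    assert(dead >= 0 && dead < n);

    if (hk->failchk() == 0) {
        pids[dead] = watcher_spawn(hk);
        return pids[dead] > 0 ? 0 : -1;
    }

    /* failchk could not clean up: stop the survivors before recovery. */
    for (int i = 0; i < n; i++) {
        if (i != dead && pids[i] > 0) {
            kill(pids[i], SIGKILL);
            waitpid(pids[i], NULL, 0);
        }
    }
    if (hk->recover() != 0)
        return -1;
    for (int i = 0; i < n; i++) {
        pids[i] = watcher_spawn(hk);
        if (pids[i] <= 0)
            return -1;
    }
    return 1;
}

/* Placeholder hooks so the sketch can be exercised without Berkeley DB. */
int stub_ok(void)     { return 0; }
int stub_failed(void) { return 1; }
int stub_worker(void) { pause(); return 0; }  /* idles until signalled */
```

The point of the split is that every operation listed above as reserved for subprocesses lives behind the `worker` hook, while the watcher only spawns, waits, checks and recovers.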

                 

                Paula Bingham

                Oracle