Do all processes that update the environment in a Replication Manager application "run" Replication Manager and, by implication, lose the ability to spawn subprocesses (even if these subprocesses don't try to open the environment)?
Yes, any process which updates the environment "runs" Replication Manager and therefore cannot spawn subprocesses, regardless of whether the spawned subprocesses themselves attempt to open the environment.
Alternatively, is only the first process, the one that calls repmgr_start() and receives the replication notification events, considered to be "running" Replication Manager?
There is the main replication process (aka the listener process) that receives replication notification events and that process should have called repmgr_start() explicitly. Other processes may call repmgr_start() as well, and will be subordinate processes. If one of these other processes updates the environment but did not call repmgr_start() explicitly, we will start Replication Manager in it behind the scenes, so it is still considered to be running Replication Manager.
I'm sorry, but our restriction against spawning a subprocess from a process running Replication Manager needs to stand. We have encountered a variety of issues when combining multi-process programming with multi-threaded programming, and of course, Replication Manager is multi-threaded. We do have an item on our list of future tasks to look into this further and see if we can eliminate or reduce this restriction, but I can't make any commitment about when we might do this or what our conclusion will be.
Thanks very much Paula. I can figure out how to design around the restriction by introducing another layer between the spawning processes.
I am struggling a bit trying to figure out what to do with the watcher, though. It clearly can't have an open (replication-enabled) environment handle and also spawn the processes that it watches over. This would cause an issue with the watcher calling failchk() and registering the associated handler, as the reference guides suggest.
Is it possible to do this with Replication Manager?
We need to look into this in more detail and it will be several weeks before we can get to it. We will reply with more information about what is possible in Replication Manager when we have done this.
But my initial look at this gave me a few pieces of information which are worth mentioning now:
- You should seriously consider upgrading to Berkeley DB 6.0. We did some work in that release that makes the use of failchk with Replication Manager much more reliable.
- It may be OK that your watcher process opens an environment handle and calls failchk(), but we cannot be sure about this until we look into it.
- It is definitely NOT OK for your watcher process to open/close database handles or to perform any transactions or checkpoints. These are things that cause us to start Replication Manager behind the scenes and would subject your watcher process to the restriction against spawning subprocesses.
I am following up on the investigation I said we would do in my previous reply.
We have been able to confirm through internal testing that it is possible to create a watcher process that opens and closes environment handles, runs recovery and uses failchk without ever invoking Replication Manager. If you limit your watcher process to these Berkeley DB operations, we can support it spawning subprocesses that themselves use Replication Manager.
The following operations should only be done in the spawned subprocesses: starting Replication Manager, opening and closing database handles, and performing get/put operations, transactions and checkpoints. Subprocesses that perform any of these operations cannot spawn further subprocesses because they would be running Replication Manager.
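The process layout described above can be sketched in plain POSIX C. This is only an illustration of the structure, not code from Berkeley DB: the names worker_main(), watcher_check_environment(), spawn_worker(), run_watcher() and MAX_WORKERS are hypothetical, and the Berkeley DB calls are shown only as comments marking where they would go. The point is that the watcher is the only process that calls fork(), and the workers, which would run Replication Manager, never spawn anything.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WORKERS 16   /* hypothetical limit for this sketch */

/* Hypothetical worker body. A real worker would start Replication
 * Manager, open database handles and run transactions here. Because
 * it runs Replication Manager, it must not spawn subprocesses. */
static int worker_main(int id)
{
    (void)id;
    return 0;
}

/* Hypothetical placeholder for the watcher's limited Berkeley DB use:
 * open an environment handle, call failchk, run recovery if needed,
 * close the handle. No database handles, transactions or checkpoints,
 * since those would start Replication Manager in the watcher. */
static int watcher_check_environment(void)
{
    return 0;
}

/* Fork one worker. Returns the child's pid, or -1 if fork() failed. */
static pid_t spawn_worker(int id)
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(worker_main(id) == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
    return pid;
}

/* Spawn nworkers workers and wait for all of them. Returns the number
 * of workers that exited abnormally, or -1 on a bad argument, a failed
 * fork(), or a failed environment check. */
int run_watcher(int nworkers)
{
    pid_t pids[MAX_WORKERS];
    int spawned = 0, abnormal = 0, error = 0;
    int i;

    if (nworkers < 0 || nworkers > MAX_WORKERS)
        return -1;

    for (i = 0; i < nworkers; i++) {
        pid_t pid = spawn_worker(i);
        if (pid < 0) {
            error = 1;   /* stop spawning, but still reap the others */
            break;
        }
        pids[spawned++] = pid;
    }
    assert(spawned <= MAX_WORKERS);

    for (i = 0; i < spawned; i++) {
        int status = 0;
        if (waitpid(pids[i], &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
            abnormal++;
            /* A worker died unexpectedly: check the environment. */
            if (watcher_check_environment() != 0)
                error = 1;
        }
    }
    return error ? -1 : abnormal;
}
```

Calling run_watcher(3) from a main() function would spawn three workers and return the number that exited abnormally. If a worker itself needed helper processes, it would have to ask the watcher (or another process that does not run Replication Manager) to spawn them on its behalf.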