2 Replies Latest reply on Nov 12, 2010 1:45 AM by 812732

    how to detect a replication client has quit?

      I am using db-5.0.21 and want to use Replication Manager to manage my replication.

      In my application the set of replication clients changes over time, and I control
      which node (client) should quit. But how do I tell Replication Manager, so that it
      stops trying to connect to a client that is gone?

      How can I stop this message:
      "ex_rep_mgr: connecting to site localhost:6001: Connection refused"

      There is only a function "DB_ENV->repmgr_add_remote_site()" to add a site. Is there
      also a function like "DB_ENV->repmgr_rem_remote_site()" to remove a site?

      From repmgr's point of view, is there a way to detect that a client has quit and
      report this event to the application?

      Edited by: 809729 on Nov 10, 2010 6:38 PM
        • 1. Re: how to detect a replication client has quit?
          Unfortunately, at the moment, the answer to both of these questions is
          no. However, we are planning significant improvements in this area
          for a future release, which will make both of these operations possible.

          For now the only way to make Replication Manager stop trying to
          connect to a removed site is to close the environment handle, and
          reopen and restart (without mentioning the removed site in a call to
          repmgr_add_remote_site() again, of course). You can shut down and
          restart one surviving site at a time, so that the replication group as
          a whole remains available during this process.

          Note too that you can adjust how often repmgr tries to reopen a
          connection by calling DB_ENV->rep_set_timeout() with
          DB_REP_CONNECTION_RETRY, so that you can reduce, if not eliminate,
          the annoyance.

          As for detecting the disappearance of a client, repmgr prints an error
          message when it loses a connection. If you were really determined,
          you could install an error message callback function using
          DB_ENV->set_errcall() and examine the message text. You would
          probably want to check for at least a couple of different possible
          error messages: "can't read from %s", "EOF on connection from %s",
          "socket writing failure".
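
          The message check described above could be sketched roughly as
          follows. This is only an illustration: the helper name
          is_connection_loss(), the callback name on_db_error() and the
          USE_BDB build flag are made up for this example, and the message
          fragments are the ones quoted above, which may differ between
          Berkeley DB releases. The callback part is compiled only when
          USE_BDB is defined, so the text matching can be tried without
          the library installed.

```c
#include <stddef.h>
#include <string.h>

/* Fixed parts of the messages quoted in the reply; the "%s" site name is
 * dropped so strstr() can match the constant text. Assumption: the exact
 * wording may vary between Berkeley DB versions. */
static const char *const conn_loss_fragments[] = {
    "can't read from ",
    "EOF on connection from ",
    "socket writing failure",
};

/* Hypothetical helper: returns 1 if msg looks like a repmgr
 * lost-connection report, 0 otherwise (including for NULL). */
int is_connection_loss(const char *msg)
{
    size_t i;
    size_t n = sizeof(conn_loss_fragments) / sizeof(conn_loss_fragments[0]);

    if (msg == NULL)
        return 0;
    for (i = 0; i < n; i++) {
        if (strstr(msg, conn_loss_fragments[i]) != NULL)
            return 1;
    }
    return 0;
}

#ifdef USE_BDB /* hypothetical flag: define when building against Berkeley DB */
#include <stdio.h>
#include <db.h>

/* Error callback with the signature that DB_ENV->set_errcall() expects. */
void on_db_error(const DB_ENV *dbenv, const char *errpfx, const char *msg)
{
    (void)dbenv;
    if (is_connection_loss(msg))
        fprintf(stderr, "[app] lost connection to a replication site: %s\n", msg);
    else
        fprintf(stderr, "%s: %s\n", errpfx != NULL ? errpfx : "bdb", msg);
}
/* Install it once the environment handle exists:
 *     dbenv->set_errcall(dbenv, on_db_error);
 */
#endif
```

          Matching on message text is fragile, so it is probably wise to
          treat a hit only as a hint and to confirm it against the
          connection status that repmgr itself reports.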

          Also note that DB_ENV->repmgr_site_list() reports the connection
          status of all known remote sites.

          Alan Bram
          • 2. Re: how to detect a replication client has quit?
            Thanks a lot.
            I'll try to work around the problem with your helpful suggestion :)