1 Reply Latest reply: Apr 10, 2014 4:49 PM by Paula B-Oracle

    Berkeley DB C API repmgr_start fails with DB_REP_UNAVAIL

    2605248

      I want to create a two-site replication deployment with Berkeley DB. One site will be the master and the other will be the slave. In the initial development phase I am not deploying the slave site (machine); only the master site exists.

      When I deploy the master site, it comes up smoothly and I am able to do DB operations (put/get/del). But if I restart the application on the master (kill the process and start it again), the db_env->repmgr_start() call fails with "BDB0086 DB_REP_UNAVAIL: Too few remote sites to complete operation".

      I do not understand why this call fails. It happens on every restart.

      Below is my Berkeley DB initialization code. I have followed the repmgr example code.

      All other variables are declared in a header file; those declarations are not shown here.

      #include <db.h>

      DB_ENV *db_env;

      #define DB_ENV_HOME "/opt/bdb-db"

      int main () {

        /* Create the environment handle. */
        ret = db_env_create(&db_env, 0);
        if (ret != 0) {
          return ret;
        }

        db_env->set_errfile(db_env, db_err_fd);
        db_env->set_event_notify(db_env, db_event_callback);
        ret = db_env->repmgr_set_ack_policy(db_env, DB_REPMGR_ACKS_QUORUM);
        if (ret != 0) {
          return ret;
        }
        /* Retransmission request timing (min, max), in microseconds. */
        db_env->rep_set_request(db_env, 20000, 500000);
        db_env->set_lk_detect(db_env, DB_LOCK_DEFAULT);

        /* Identify the local site for the replication manager. */
        ret = db_env->repmgr_site(db_env, ip_addr, LOCAL_PORT, &dbsite, 0);
        if (ret != 0) {
          return ret;
        }
        dbsite->set_config(dbsite, DB_LOCAL_SITE, 1);
        /* Heartbeats: send every 5 s, monitor with a 10 s timeout. */
        db_env->rep_set_timeout(db_env, DB_REP_HEARTBEAT_SEND, 5000000);
        db_env->rep_set_timeout(db_env, DB_REP_HEARTBEAT_MONITOR, 10000000);
        db_env->set_cachesize(db_env, 0, DB_ENV_CACHE_SIZE, 0);
        db_env->set_flags(db_env, DB_TXN_NOSYNC, 1);

        flags = DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_MPOOL |
                DB_INIT_REP | DB_INIT_TXN | DB_RECOVER | DB_THREAD;

        ret = db_env->open(db_env, DB_ENV_HOME, flags, 0);
        if (ret != 0) {
          return ret;
        }

        /* Start the replication manager with 3 threads, as master. */
        ret = db_env->repmgr_start(db_env, 3, DB_REP_MASTER);
        if (ret != 0) {
          return ret;
        }
        ...
        ...
        ...
      }

       

      Can someone please help me understand this behavior?

        • 1. Re: Berkeley DB C API repmgr_start fails with DB_REP_UNAVAIL
          Paula B-Oracle

          What version of Berkeley DB are you using?

           

          I don't see any evidence in your sample code that you are following the procedure for a primordial startup. The very first time that you start your first replication manager site, you need to configure it as DB_GROUP_CREATOR as well as DB_LOCAL_SITE. You need to do this to create the initial version of our internal group membership database. Later startups for this site do not need to configure DB_GROUP_CREATOR, but if it is specified it is ignored. I suspect that your later restarts are returning DB_REP_UNAVAIL because they are looking for our internal group membership database and not finding it.

           

          In your case, you should start again from scratch with an empty environment directory. Add a second call to dbsite->set_config() to also configure DB_GROUP_CREATOR before you call db_env->repmgr_start(). See if this changes the behavior when you later try to restart your master site.

           

          You can find more information about the primordial start procedure in the Berkeley DB Programmer's Reference Guide section "Managing Replication Manager group membership".

           

          Paula Bingham

          Oracle