9 Replies Latest reply: Apr 21, 2011 11:03 AM by 524761

What causes that HA throws error "local site address has not yet been set"

776130 Newbie
HA worked well until we updated Berkeley DB from 5.0 to 5.1.25.

But now we often get the error "local site address has not yet been set".

Can anyone tell us what is happening?
  • 1. Re: What causes that HA throws error "local site address has not yet been set"
    524761 Journeyer
    That error message comes from the DB_ENV->repmgr_get_local_site() method, which is new in 5.1. The application should only call this method after the local site address has been defined, by calling DB_ENV->repmgr_set_local_site() (or the equivalent entry in the DB_CONFIG file).
  • 2. Re: What causes that HA throws error "local site address has not yet been set"
    524761 Journeyer
    However, if you're using the Java API, it makes this call internally whenever the application calls Environment.getConfig() or Environment.setConfig(); so that's a bug in Berkeley DB. Other than the annoyance of the unwanted error message, it is harmless.
  • 3. Re: What causes that HA throws error "local site address has not yet been set"
    776130 Newbie
    Dear Alan,

    The application is written in C++.

    The error prevents our application from running. Do you have any advice?

    Thanks
  • 4. Re: What causes that HA throws error "local site address has not yet been set"
    524761 Journeyer
    Is the application calling DbEnv::repmgr_get_local_site()?
  • 5. Re: What causes that HA throws error "local site address has not yet been set"
    776130 Newbie
    The application never explicitly invokes the method get_local_site.

    But before calling the method set_local_site, the application does the following:

         m_envFlags = m_envFlags |
              DB_CREATE | // Create the environment if it does not exist
              DB_RECOVER | // Run normal recovery.
              DB_INIT_LOCK | // Initialize the locking subsystem
              DB_INIT_LOG | // Initialize the logging subsystem which provides a high-degree of recoverability when application crashes.
              DB_INIT_TXN | // Initialize the transactional subsystem. This also turns on logging.
                             // wyb: recovery requires transaction support, so DB_INIT_TXN is a must
              DB_THREAD | // Cause the environment to be free-threaded //not ok when db->get() , but ok when dbcursor->get()
              DB_INIT_MPOOL; // Initialize the memory pool (in-memory cache)
              
         m_env = new DbEnv(0);
         if (m_env == NULL) {
              m_logger.error("new DbEnv error: %m");
              throw OWException("new DbEnv error");
         }

         if (shm_key == -1) {   
              printf("create shm_key failed!\n");
              exit(1) ;
         }

         m_env->set_errpfx("owbdb");
         m_env->set_errcall(DBEnv::errorHandler); //bdb errors will be sent to the callback function.
         m_env->set_msgcall(DBEnv::msgHandler); //bdb msgs will be sent to the callback function.

         m_env->set_flags(DB_TXN_NOSYNC, 1);

         m_env->set_lg_max(10*1024*1024); //disk log size (default: 10M)

         m_env->set_cachesize(0, m_cacheSize, 0);

         uint32 logSize = 32*1024; //default: 32*1024, ori: m_cacheSize
         m_env->set_lg_bsize(logSize);
         m_env->set_lk_max_lockers(20000);
         m_env->set_lk_max_objects(20000);
         m_env->set_lk_max_locks(20000);

         // set the maximum number of simultaneous transactions
         m_env->set_tx_max(10000);

              //m_env->set_app_private(&m_appData);
              m_env->set_event_notify(DBEnv::eventCallback);

              // ack policy can have a great impact on performance, latency and consistency
              //m_env->repmgr_set_ack_policy(DB_REPMGR_ACKS_ONE_PEER); //ori: DB_REPMGR_ACKS_ALL
              m_env->repmgr_set_ack_policy(DB_REPMGR_ACKS_NONE);
              //m_env->repmgr_set_ack_policy(DB_REPMGR_ACKS_QUORUM);

              // timeout configs
              m_env->rep_set_timeout(DB_REP_ACK_TIMEOUT, 50 * 1000); //50ms
              m_env->rep_set_timeout(DB_REP_CHECKPOINT_DELAY, 0);
              m_env->rep_set_timeout(DB_REP_CONNECTION_RETRY, 30 * 1000 * 1000); // 30 seconds
              m_env->rep_set_timeout(DB_REP_ELECTION_TIMEOUT, 5 * 1000 * 1000); // 5 seconds
              m_env->rep_set_timeout(DB_REP_ELECTION_RETRY, 10 * 1000 * 1000); //10 seconds

              m_env->rep_set_timeout(DB_REP_HEARTBEAT_MONITOR, 80 * 1000 * 1000); //80 seconds
              m_env->rep_set_timeout(DB_REP_HEARTBEAT_SEND, 60 * 1000 * 1000); //60 seconds

              m_env->rep_set_priority(priority);     

              uint32 rep_req_min = 40000;
              uint32 rep_req_max = 1280000;
              uint32 rep_limit_gbytes = 0;
              uint32 rep_limit_bytes = 10 * 1024 * 1024; // 10MB
              m_env->rep_set_request(rep_req_min, rep_req_max);
              m_env->rep_set_limit(rep_limit_gbytes, rep_limit_bytes);

              // set local site
              if ((ret = m_env->repmgr_set_local_site(localIp.c_str(), localPort, 0)) != 0) {
                   string strError = "Could not set bdb local on " + localIp + ":" + toStr(localPort) + " : ";
                   m_env->err(ret, strError.c_str());
                   throw OWException(strError);
              }
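A note on the timeout values in the snippet above: as far as the Berkeley DB documentation describes it, rep_set_timeout() takes an unsigned 32-bit count of microseconds, which is why 50 ms is written as 50 * 1000. A minimal sketch of a conversion helper built on std::chrono follows. The name to_db_timeout is hypothetical and is not part of Berkeley DB.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical helper: convert any std::chrono duration to the unsigned
// 32-bit microsecond count that rep_set_timeout() is documented to take.
template <class Rep, class Period>
std::uint32_t to_db_timeout(std::chrono::duration<Rep, Period> d) {
    const auto us =
        std::chrono::duration_cast<std::chrono::microseconds>(d).count();
    // The value must be non-negative and fit in 32 bits (roughly 71 minutes).
    assert(us >= 0);
    assert(static_cast<unsigned long long>(us) <=
           std::numeric_limits<std::uint32_t>::max());
    return static_cast<std::uint32_t>(us);
}
```

With such a helper, the first timeout call could be written as m_env->rep_set_timeout(DB_REP_ACK_TIMEOUT, to_db_timeout(std::chrono::milliseconds(50))), which keeps the unit visible at the call site.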
  • 6. Re: What causes that HA throws error "local site address has not yet been set"
    524722 Explorer
    It looks like your m_envFlags is missing DB_INIT_REP.

    Sue LoVerso
    Oracle
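To illustrate the kind of check that would catch a missing flag before the environment is opened, here is a minimal sketch of a bitmask test. The k* constants are stand-ins with made-up bit values so that the snippet compiles on its own; real code would pass the DB_INIT_* constants from db_cxx.h instead. The helper name missing_flags is hypothetical.

```cpp
#include <cstdint>

// Stand-in bit values for illustration only. In real code, use the DB_INIT_*
// constants from db_cxx.h; these numbers are NOT the values Berkeley DB uses.
constexpr std::uint32_t kInitLock  = 1u << 0;
constexpr std::uint32_t kInitLog   = 1u << 1;
constexpr std::uint32_t kInitMpool = 1u << 2;
constexpr std::uint32_t kInitTxn   = 1u << 3;
constexpr std::uint32_t kInitRep   = 1u << 4;

// Return the bits of `required` that are absent from `actual` (0 if none).
constexpr std::uint32_t missing_flags(std::uint32_t actual,
                                      std::uint32_t required) {
    return required & ~actual;
}
```

With the real constants, missing_flags(m_envFlags, DB_INIT_REP) would be nonzero for the flags posted in reply 5, unless m_envFlags already carried DB_INIT_REP before that assignment.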
  • 7. Re: What causes that HA throws error "local site address has not yet been set"
    524761 Journeyer
    user13177882 wrote:
    The application never explicitly invokes the method get_local_site.
    In that case it is a mystery how it got invoked, since AFAIK it is never invoked internally (other than by the Java API, as I mentioned earlier).

    Therefore, can you please try running the program in a debugger, with a breakpoint on the repmgr_get_local_site function at the point where that message is produced, and when you hit the breakpoint print a stack trace?
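If attaching a debugger is inconvenient, a rough alternative is to print the call stack from inside the process at the moment the message arrives. The sketch below assumes Linux with glibc, whose <execinfo.h> provides backtrace() and backtrace_symbols_fd(). The function name is hypothetical; the idea is to call it from the error callback registered with set_errcall, passing along the message text.

```cpp
#include <cstring>
#include <execinfo.h>  // backtrace(), backtrace_symbols_fd(): glibc-specific
#include <unistd.h>    // STDERR_FILENO

// Print the current call stack to stderr if `msg` contains the error text
// discussed in this thread. Returns the number of frames printed, or 0 when
// the message does not match.
int dump_stack_if_local_site_error(const char* msg) {
    static const char kNeedle[] = "local site address has not yet been set";
    if (msg == nullptr || std::strstr(msg, kNeedle) == nullptr) {
        return 0;
    }
    void* frames[64];
    const int n = backtrace(frames, 64);
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    return n;
}
```

Symbol names in the output are usually more readable when the program is linked with -rdynamic. A debugger breakpoint remains the more reliable route when it is available.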
  • 8. Re: What causes that HA throws error "local site address has not yet been set"
    776130 Newbie
    I have debugged the source code and adjusted my code:

    m_env->set_event_notify(DBEnv::eventCallback);
    m_env->rep_set_config(DB_REP_CONF_BULK, 1);
    m_env->set_verbose(DB_VERB_REPLICATION, 1);
    // set local site
    m_env->repmgr_set_local_site(localIp.c_str(), localPort, 0);
    // set remote site
    m_env->repmgr_add_remote_site(peerIp.c_str(), peerPort, NULL, 0);

    uint32 rep_req_max = 1280000;
    uint32 rep_limit_gbytes = 0;
    uint32 rep_limit_bytes = 10 * 1024 * 1024; // 10MB
    m_env->rep_set_request(rep_req_min, rep_req_max);
    // timeout configs
    m_env->rep_set_priority(priority);
    m_env->rep_set_nsites(2);


    setEnvParameters();
    m_envFlags = m_envFlags |DB_INIT_REP;
    m_env->repmgr_set_ack_policy(DB_REPMGR_ACKS_ALL);
    m_env->rep_set_timeout(DB_REP_ACK_TIMEOUT, 50 * 1000); //50ms
    // m_env->rep_set_timeout(DB_REP_CHECKPOINT_DELAY, 0);
    // m_env->rep_set_timeout(DB_REP_CONNECTION_RETRY, 30 * 1000 * 1000); // 30 seconds
    // m_env->rep_set_timeout(DB_REP_ELECTION_TIMEOUT, 5 * 1000 * 1000); // 5 seconds
    // m_env->rep_set_timeout(DB_REP_ELECTION_RETRY, 10 * 1000 * 1000); //10 seconds
    //m_env->rep_set_priority(priority);
    m_env->rep_set_limit(rep_limit_gbytes, rep_limit_bytes);

    m_logger.info("bdb environment, db home dir: %s, m_envFlags: %x ", m_homeDir.c_str(), m_envFlags);
    int ret = m_env->open(m_homeDir.c_str(), m_envFlags, 0);
    m_logger.info("open db env(%s) return %s", m_homeDir.c_str(), (ret==0 ? "ok":toStr(ret).c_str()) );
    if ((ret = m_env->repmgr_start(3, startPolicy)) != 0) {
        string strError = "bdb repmgr_start failed: " + toStr(db_strerror(ret));
        m_logger.error(strError.c_str());
        throw OWException(strError);
    }

    When the application invokes methods such as m_env->rep_set_config(DB_REP_CONF_BULK, 1), m_env->repmgr_set_ack_policy(DB_REPMGR_ACKS_ALL), m_env->rep_set_priority(priority), m_env->rep_set_timeout(DB_REP_ACK_TIMEOUT, 50 * 1000) and so on, it crashes with SIGSEGV.
  • 9. Re: What causes that HA throws error "local site address has not yet been set"
    524761 Journeyer
    Where did the SEGV occur? Can you provide a stack trace, please?

    What do you mean by "and so on"? Which one resulted in the SEGV?
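One low-tech way to answer the "which one" question without a debugger is to log the name of each configuration call just before making it, so the last line printed identifies the call that crashed. The following is a minimal sketch of such a wrapper; traced and g_last_step are hypothetical names and nothing here is part of Berkeley DB.

```cpp
#include <cstdio>
#include <string>
#include <utility>

// Name of the most recently started step (hypothetical helper state).
static std::string g_last_step;

// Log `name` to stderr before running `fn`, then return whatever `fn` returns.
template <typename Fn>
auto traced(const char* name, Fn&& fn) -> decltype(std::forward<Fn>(fn)()) {
    g_last_step = name;
    std::fprintf(stderr, "calling %s\n", name);
    std::fflush(stderr);  // make sure the line is out before a possible crash
    return std::forward<Fn>(fn)();
}
```

A call from the post above could then be wrapped as traced("rep_set_priority", [&] { return m_env->rep_set_priority(priority); }). A stack trace from a debugger or core file is still the more informative answer.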
