3 Replies Latest reply: Jul 31, 2013 5:44 PM by Paula B-Oracle

    Got err message: Environment not configured as replication master or client

    4246bac3-3a09-4031-8891-51bb8a451ffc

      Hi BDB experts,

       

      I am writing a DB HA application based on BDB version 4.6.21. Two daemons run on two machines: one is the master, which reads and writes the db, and the other is a backup, which only reads the db. I often get the error messages below when the master is writing to the db while the backup is syncing the db from the master (i.e. just after the backup starts up):

      "Got err message: Environment not configured as replication master or client

      DB_ENV->rep_process_message: Invalid argument". This message is printed many times, and then the message "DB_ENV->rep_process_message: DB_REP_NOTPERM: Permanent log record not written" is printed many times.

       

      I found that if the master is not writing to the db, the backup app does not get any error messages. The errors only happen when the master is writing to the db while the backup is syncing at the same time. I am sure I have set one environment as DB_REP_CLIENT and the other as DB_REP_MASTER. Could you help me with this issue?

       

      Brs,

      Min

        • 1. Re: Got err message: Environment not configured as replication master or client
          Paula B-Oracle

          Are you using Replication Manager? My discussion below assumes you are not, but it is important for me to know if you are.

           

          I would expect to see the "Environment not configured as replication master or client" error in a situation where you call rep_process_message() before you call rep_start() or before your call to rep_start() has finished.

           

          There are several ways you can get the "DB_REP_NOTPERM" error, but it generally means that a client has accepted an incoming permanent log record (e.g. a commit) but has not yet been able to apply it to the client database. This is usually not an error but merely an indication that log records are arriving at a client out of order (which our processing eventually resolves automatically).
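
As a rough illustration of the point above, a receive loop could treat DB_REP_NOTPERM as "accepted but not yet applied" rather than as a failure. The sketch below is only an assumption about how an application might sort return values from rep_process_message. The `msg_action` type and `classify_rep_ret` helper are hypothetical names, and the numeric values in the `#ifndef` block are placeholders so the sketch compiles without db.h (a real application takes the constants from db.h).

```c
#ifndef DB_REP_NOTPERM
/* Placeholder values ONLY so this sketch compiles without db.h.
 * They are NOT the real values; a real build gets them from db.h. */
#define DB_REP_ISPERM  (-1001)
#define DB_REP_NOTPERM (-1002)
#endif

/* Hypothetical classification used by this sketch only. */
typedef enum {
    MSG_OK,        /* message handled, nothing more to do            */
    MSG_DEFERRED,  /* permanent record accepted but not yet applied  */
    MSG_OTHER      /* any other code: needs its own handling         */
} msg_action;

/* Map a return value from rep_process_message to an action. */
msg_action classify_rep_ret(int ret)
{
    switch (ret) {
    case 0:
    case DB_REP_ISPERM:
        return MSG_OK;
    case DB_REP_NOTPERM:
        /* Usually not an error: records arrived out of order and
         * are expected to be applied later without app action. */
        return MSG_DEFERRED;
    default:
        /* Other replication codes and real errors are not covered here. */
        return MSG_OTHER;
    }
}
```

A receive thread could call `classify_rep_ret(ret)` after each message and keep looping on `MSG_DEFERRED` instead of logging it as a failure.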

           

          I'm not 100% clear from your description where you are seeing these errors. On which site are you seeing the "Environment not configured..." errors? Are you seeing the NOTPERM errors on the same site?

           

          It would be helpful if you can provide more detail about the replication calls you are making in your startup sequence on each of your sites. Do you start up the sites sequentially (if so, in which order?) or at the same time? Please include information about which calls you are making in which threads if you are using the Base Replication API.

           

          Paula Bingham

          Oracle

          • 2. Re: Got err message: Environment not configured as replication master or client
            4246bac3-3a09-4031-8891-51bb8a451ffc

            1) No. I am using the base API, not Replication Manager.

            2) Do you mean the error message implies that rep_process_message() was called before rep_start() was called, or before the call to rep_start() had finished?

            That may happen in my code. I create a thread that connects to the master (a TCP server) and then loops, receiving messages and calling rep_process_message. Just after pthread_create, I sleep 2 seconds and then call rep_start().

            I remember that when I once called rep_start first and then created that thread, there was also some issue. What is the right logic?

            3) So usually for "DB_REP_NOTPERM" the app needn't do anything, since it will eventually be resolved automatically, right?

            4) These errors are only seen on the backup site, and yes, the NOTPERM errors appear on the same site, after dozens of "Environment not configured as replication master or client DB_ENV->rep_process_message: Invalid argument" messages.

            5) In this test case, I start the master site first and let it loop putting records into the db. Then I start the client site. There are only two sites in my HA environment.

            base api call sequence on master:

            (in main thread)

            dbenv->rep_set_nsites

            dbenv->rep_set_priority

            dbenv->rep_set_transport

            create tcp listen socket

            dbenv->rep_start

            pthread_create(master_rep_thread)

            (in master_rep_thread)

            it is a while loop that checks:

            if (client connected)

              call dbenv->rep_process_message;

            else

              accept connection;

             

             

            base api call sequence on client:

            dbenv->rep_set_nsites

            dbenv->rep_set_priority

            dbenv->rep_set_transport

            pthread_create(client_rep_thread)

            dbenv->rep_start

            (in client_rep_thread)

            it is a while loop that checks:

            if (master connected)

              call dbenv->rep_process_message;

            else

              connect master;

             

            Thanks,

            Min

            • 3. Re: Got err message: Environment not configured as replication master or client
              Paula B-Oracle

              2) Do you mean the error message implies that rep_process_message() was called before rep_start() was called, or before the call to rep_start() had finished?

              That may happen in my code. I create a thread that connects to the master (a TCP server) and then loops, receiving messages and calling rep_process_message. Just after pthread_create, I sleep 2 seconds and then call rep_start().

              I remember that when I once called rep_start first and then created that thread, there was also some issue. What is the right logic?

               

              The rep_start method creates the internal replication context and structures and needs to be called before rep_process_message.
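
One way to enforce that ordering without relying on a sleep is to have the receive thread wait on a condition variable that the main thread signals only after rep_start has returned. The following is a minimal sketch under stated assumptions, not the poster's actual code: `fake_rep_start` and `fake_rep_process_message` are hypothetical stand-ins for the real DB_ENV calls (so the sketch compiles without db.h), and `run_client_startup`, `early_calls` and `processed` are names invented for illustration.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for DB_ENV->rep_start and
 * DB_ENV->rep_process_message; they only record call ordering. */
static bool rep_started = false; /* true once the stand-in rep_start ran  */
static int  early_calls = 0;     /* messages seen before rep_start ran    */
static int  processed   = 0;     /* messages seen after rep_start ran     */

static int fake_rep_start(void) { rep_started = true; return 0; }

static int fake_rep_process_message(void)
{
    if (!rep_started) { early_calls++; return -1; } /* placeholder error */
    processed++;
    return 0;
}

/* Gate that the receive thread waits on before processing messages. */
static pthread_mutex_t start_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  start_cv  = PTHREAD_COND_INITIALIZER;
static bool start_done = false;

static void *client_rep_thread(void *arg)
{
    (void)arg;
    /* Connecting to the master could happen here, before the wait. */
    pthread_mutex_lock(&start_mtx);
    while (!start_done)                 /* loop guards against spurious wakeups */
        pthread_cond_wait(&start_cv, &start_mtx);
    pthread_mutex_unlock(&start_mtx);

    for (int i = 0; i < 3; i++)         /* stands in for the recv loop */
        fake_rep_process_message();
    return NULL;
}

/* Returns 0 when rep_start succeeded and no message was processed early. */
int run_client_startup(void)
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, client_rep_thread, NULL) != 0)
        return -1;

    int ret = fake_rep_start();         /* the real rep_start call goes here */

    pthread_mutex_lock(&start_mtx);
    start_done = true;                  /* open the gate only after it returns */
    pthread_cond_broadcast(&start_cv);
    pthread_mutex_unlock(&start_mtx);

    pthread_join(tid, NULL);
    return ret != 0 ? ret : early_calls;
}
```

With this arrangement the thread can be created either before or after the rep_start call, since the gate rather than the creation order decides when message processing begins. A real application would also want to stop the thread instead of releasing it if rep_start fails.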

               

              3) So usually for "DB_REP_NOTPERM" the app needn't do anything, since it will eventually be resolved automatically, right?

               

              Yes, that's usually the case. However, if some DB_REP_NOTPERM messages are outstanding and a major issue then occurs (a site goes down, or communications are slow or down), they will not be resolved until that issue is fixed.

               

              Paula Bingham

              Oracle