6 Replies Latest reply: Jan 13, 2014 9:09 AM by 981760

    Redo Transport problem with SYNC option

    981760

      Hello Experts,

       

      I have an unusual issue with redo transport.

       

      I am trying to use Data Guard redo transport to feed Oracle Streams.

       

      With ASYNC redo transport my environment works fine, but when I change it to SYNC I see an error from LGWR.

       

      Basically, LGWR does not write to the standby redo logs, and the LNS log file shows the error below.

       

      Redo shipping client performing standby login
      *** 2014-01-06 10:14:39.091 68920 kcrr.c
      Logged on to standby successfully
      Client logon and security negotiation successful!
      *** 2014-01-06 10:14:39.091 61296 kcrr.c
      LNSb: connect status = 0
      *** 2014-01-06 10:14:39.111
      ksedmp: internal or fatal error
      ORA-00600: internal error code, arguments: [kcrrnsfwa.15], [4294967104], [512],
      [], [], [], [], []
      ----- Call Stack Trace -----
      calling              call     entry                argument values in hex
      location             type     point                (? means dubious value)
      -------------------- -------- -------------------- ----------------------------
      ksedmp()+728         CALL     ksedst()             000000017 ?
                                                         FFFFFFFF7FFFCEBC ?
                                                         000000000 ?

       

       

      My understanding is that with SYNC, LGWR ships the redo through LNS, and RFS at the standby site receives it and writes it to the standby redo log file.

       

      My standby redo log files are the same size as the online redo log files, and I have also transferred the same password file (orapwd).

       

      This makes me suspect the network, but when I contacted the network team they said everything looks good.

       

      Another strange thing: I have the same setup working on a different server. It is only on this server that SYNC does not work. I also tried different versions of Oracle on the same box, and none of them could write to the standby log files.

       

      Any suggestions on this issue are highly appreciated.

       

      Thanks,

      Krishna

        • 1. Re: Redo Transport problem with SYNC option
          896971

          Always remember to please provide your Operating System version and your Oracle version. Are you using file-system or ASM on your servers? Provide any other errors as well and let us know what server they are from (primary or standby).

           

          Run the following on all servers and report back what they return:

          select name,open_mode,database_role,protection_mode from v$database;
          SELECT dest_name,destination,recovery_mode,PROTECTION_MODE FROM v$archive_dest_status;
          SHOW PARAMETER LOG_ARCHIVE_DEST_

          In particular I am interested in what the NET_TIMEOUT is set to.

           

          981760 wrote:

           

          With ASYNC redo transport my environment works fine, but when I change it to SYNC I see an error from LGWR.

           

          It sounds like your primary may be sending redo and then waiting for a response from the standby. When the primary does not get the response in time, it may be timing out and moving on.

           

          Regarding the network: there is a difference between bandwidth (how fat your pipe is) and latency (how fast the water moves through the pipe). Check for latency issues using ping, traceroute (tracert on Windows), and tcpdump. It could also be I/O issues local to the standby; check that with a utility like iostat. An easy way is to run the same tests on all servers and compare.


          Read about the "log transport steps" here: http://www.dba-oracle.com/t_lns_wait_on_sendreq.htm
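To make the bandwidth versus latency point concrete, here is a rough sketch of how round-trip time alone caps the rate of synchronous log writes. It assumes, as a simplification, that each write must wait for one network round trip plus the standby's local write before the next can start. The function name and the sample figures are hypothetical, chosen only for illustration.

```python
def max_sync_log_writes_per_second(round_trip_ms: float,
                                   standby_write_ms: float = 0.0) -> float:
    """Rough upper bound on serialized synchronous log writes per second.

    Simplifying assumption: every write waits for one network round trip
    plus the standby's local write time before the next write begins.
    """
    per_write_ms = round_trip_ms + standby_write_ms
    if per_write_ms <= 0:
        raise ValueError("total time per write must be positive")
    return 1000.0 / per_write_ms


if __name__ == "__main__":
    # Hypothetical round-trip times, not measurements from this thread.
    for rtt_ms in (0.5, 4.0, 20.0):
        ceiling = max_sync_log_writes_per_second(rtt_ms)
        print(f"rtt={rtt_ms} ms -> at most {ceiling:.0f} writes/s")
```

No amount of extra bandwidth raises this ceiling; only lower latency or faster standby I/O does.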

          • 2. Re: Redo Transport problem with SYNC option
            981760

            Hello,

             

            Thank you for your response and valuable time.

             

            Here are the details.

             

            Operating System Version : Solaris 10 on both primary and secondary

            File system or ASM : File system

            Oracle Version : Primary DB - Oracle 10.2.0.5
                             Secondary DB - Oracle 10.2.0.5


            Primary Details :


            SQL> select name,open_mode,database_role,protection_mode from v$database
              2  ;

            NAME      OPEN_MODE  DATABASE_ROLE    PROTECTION_MODE
            --------- ---------- ---------------- --------------------
            MXACPNUC  READ WRITE PRIMARY          MAXIMUM PERFORMANCE


            SQL> SELECT dest_name,destination,recovery_mode,PROTECTION_MODE FROM v$archive_dest_status;

            DEST_NAME
            --------------------------------------------------------------------------------
            DESTINATION
            --------------------------------------------------------------------------------
            RECOVERY_MODE           PROTECTION_MODE
            ----------------------- --------------------
            LOG_ARCHIVE_DEST_1
            /usr/local/oracle/admin/mxacpnuc/arch
            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_2
            strm10r
            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_3

            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_4

            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_5

            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_6

            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_7

            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_8

            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_9

            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_10

            IDLE                    MAXIMUM PERFORMANCE


            10 rows selected.

             

             

            SQL> show parameter log_Archive_dest_;

            NAME                                 TYPE        VALUE
            ------------------------------------ ----------- ------------------------------
            log_archive_dest_1                   string      LOCATION=/usr/local/oracle/adm
                                                             in/mxacpnuc/arch
            log_archive_dest_10                  string
            log_archive_dest_2                   string      SERVICE=strm10r LGWR SYNC
                                                             VALID_FOR=(ONLINE_LOGFILE,PRIM
                                                             ARY_ROLE)
                                                             DB_UNIQUE_NAME=strm10r
            log_archive_dest_3                   string
            log_archive_dest_4                   string
            log_archive_dest_5                   string
            log_archive_dest_6                   string
            log_archive_dest_7                   string
            log_archive_dest_8                   string
            log_archive_dest_9                   string
            log_archive_dest_state_1             string      enable
            log_archive_dest_state_10            string      enable
            log_archive_dest_state_2             string      DEFER
            log_archive_dest_state_3             string      DEFER
            log_archive_dest_state_4             string      enable
            log_archive_dest_state_5             string      enable
            log_archive_dest_state_6             string      enable
            log_archive_dest_state_7             string      enable
            log_archive_dest_state_8             string      enable
            log_archive_dest_state_9             string      enable


            SQL> select net_timeout from v$archive_dest;

            NET_TIMEOUT
            -----------
                      0
                    180
                      0
                      0
                      0
                      0
                      0
                      0
                      0
                      0

            10 rows selected.


            ====================================================================
            Secondary :

            SQL> select name,open_mode,database_role,protection_mode from v$database
              2  ;

            NAME      OPEN_MODE  DATABASE_ROLE    PROTECTION_MODE
            --------- ---------- ---------------- --------------------
            STRM10R   READ WRITE PRIMARY          MAXIMUM PERFORMANCE


            SQL> SELECT dest_name,destination,recovery_mode,PROTECTION_MODE FROM v$archive_dest_status
              2  ;

            DEST_NAME
            --------------------------------------------------------------------------------
            DESTINATION
            --------------------------------------------------------------------------------
            RECOVERY_MODE           PROTECTION_MODE
            ----------------------- --------------------
            LOG_ARCHIVE_DEST_1
            /usr/local/oracle/admin/strm10r/arch
            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_2

            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_3

            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_4

            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_5

            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_6

            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_7

            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_8

            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_9

            IDLE                    MAXIMUM PERFORMANCE

            LOG_ARCHIVE_DEST_10

            IDLE                    MAXIMUM PERFORMANCE


            10 rows selected.

            SQL> show parameter log_Archive_dest_;

            NAME                                 TYPE        VALUE
            ------------------------------------ ----------- ------------------------------
            log_archive_dest_1                   string      LOCATION=/usr/local/oracle/adm
                                                             in/strm10r/arch
            log_archive_dest_10                  string
            log_archive_dest_2                   string
            log_archive_dest_3                   string
            log_archive_dest_4                   string
            log_archive_dest_5                   string
            log_archive_dest_6                   string
            log_archive_dest_7                   string
            log_archive_dest_8                   string
            log_archive_dest_9                   string
            log_archive_dest_state_1             string      enable
            log_archive_dest_state_10            string      enable
            log_archive_dest_state_2             string      enable
            log_archive_dest_state_3             string      ENABLE
            log_archive_dest_state_4             string      enable
            log_archive_dest_state_5             string      enable
            log_archive_dest_state_6             string      enable
            log_archive_dest_state_7             string      enable
            log_archive_dest_state_8             string      enable
            log_archive_dest_state_9             string      enable

            SQL> select net_timeout from v$archive_dest;

            NET_TIMEOUT
            -----------
                      0
                      0
                      0
                      0
                      0
                      0
                      0
                      0
                      0
                      0

            10 rows selected.


            ==============================================================================

            NET_TIMEOUT is set to 180 seconds on the primary.

            Note: You might be confused to see both databases showing the PRIMARY database role. That is because we are using the Data Guard redo transport method only to ship the logs; what we actually want to build is Oracle Streams. So we did not create a standby database from the primary control file. That said, I have the same setup working on a different server using the Data Guard redo transport method.


            It sounds like your primary may be sending redo and then waiting for a response from the standby. When the primary does not get the response in time, it may be timing out and moving on.


            Exactly. LGWR on the primary is not able to write or send the redo to the secondary. As soon as I change LOG_ARCHIVE_DEST to SYNC, I see the error below in the LNS log file on the primary:

             

             

            Redo shipping client performing standby login
            *** 2014-01-06 10:14:39.091 68920 kcrr.c
            Logged on to standby successfully
            Client logon and security negotiation successful!
            *** 2014-01-06 10:14:39.091 61296 kcrr.c
            LNSb: connect status = 0
            *** 2014-01-06 10:14:39.111
            ksedmp: internal or fatal error
            ORA-00600: internal error code, arguments: [kcrrnsfwa.15], [4294967104], [512],
            [], [], [], [], []
            ----- Call Stack Trace -----
            calling              call     entry                argument values in hex
            location             type     point                (? means dubious value)
            -------------------- -------- -------------------- ----------------------------
            ksedmp()+728         CALL     ksedst()             000000017 ?
                                                               FFFFFFFF7FFFCEBC ?
                                                               000000000 ?
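One detail that may be worth mentioning in an SR: the second ORA-00600 argument, 4294967104, sits just below 2^32. If it happens to be a 32-bit quantity, it would read as a small negative number when interpreted as signed. The meaning of the arguments to kcrrnsfwa.15 is internal to Oracle and is not stated anywhere in this thread, so the sketch below is only an observation about the number, not a diagnosis. The helper name is hypothetical.

```python
def as_signed_32(value: int) -> int:
    """Reinterpret an unsigned 32-bit integer as two's-complement signed."""
    if not 0 <= value < 2**32:
        raise ValueError("value does not fit in 32 bits")
    return value - 2**32 if value >= 2**31 else value


if __name__ == "__main__":
    # Second and third arguments of the ORA-00600 shown in the trace.
    print(as_signed_32(4294967104))  # -192
    print(as_signed_32(512))         # 512
```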

             

            Regarding the network: there is a difference between bandwidth (how fat your pipe is) and latency (how fast the water moves through the pipe). Check for latency issues using ping, traceroute (tracert on Windows), and tcpdump. It could also be I/O issues local to the standby; check that with a utility like iostat. An easy way is to run the same tests on all servers and compare.

            I see the warning messages below in my LGWR trace file. In my setup that works fine, I do not see these warnings. According to a few posts, though, these warnings can be ignored. Let me know your thoughts on this.


            *** SERVICE NAME:() 2014-01-03 18:14:21.348
            *** SESSION ID:(3506.1) 2014-01-03 18:14:21.348
            Maximum redo generation record size = 156160 bytes
            Maximum redo generation change vector size = 150676 bytes
            *** 2014-01-05 20:35:21.143
            Warning: log write time 6870ms, size 89KB
            *** 2014-01-05 20:47:50.385
            Warning: log write time 990ms, size 73KB
            *** 2014-01-05 20:49:12.388
            Warning: log write time 880ms, size 4KB
            *** 2014-01-05 20:49:14.025
            Warning: log write time 1350ms, size 3KB
            *** 2014-01-05 20:49:14.588
            Warning: log write time 560ms, size 25KB
            *** 2014-01-05 20:49:21.051
            Warning: log write time 600ms, size 123KB
            *** 2014-01-05 20:49:21.839
            Warning: log write time 500ms, size 401KB
            *** 2014-01-05 20:49:22.887
            Warning: log write time 590ms, size 1980KB
            *** 2014-01-05 20:49:24.172
            Warning: log write time 1290ms, size 1426KB
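For comparing this server with the working one, a small script can summarize these warnings instead of reading them by eye. The sketch below parses the "Warning: log write time" lines exactly as they appear above. The function names are hypothetical, and the regular expression assumes the message format does not vary beyond what this excerpt shows.

```python
import re
from typing import Dict, List, Tuple

# Matches lines such as: "Warning: log write time 6870ms, size 89KB"
WARNING_RE = re.compile(r"Warning: log write time (\d+)ms, size (\d+)KB")

SAMPLE = """\
Warning: log write time 6870ms, size 89KB
Warning: log write time 990ms, size 73KB
Warning: log write time 880ms, size 4KB
Warning: log write time 1350ms, size 3KB
Warning: log write time 560ms, size 25KB
Warning: log write time 600ms, size 123KB
Warning: log write time 500ms, size 401KB
Warning: log write time 590ms, size 1980KB
Warning: log write time 1290ms, size 1426KB
"""


def parse_log_write_warnings(trace_text: str) -> List[Tuple[int, int]]:
    """Return (elapsed_ms, size_kb) for every log write warning found."""
    return [(int(ms), int(kb)) for ms, kb in WARNING_RE.findall(trace_text)]


def summarize(warnings: List[Tuple[int, int]]) -> Dict[str, float]:
    """Count the warnings and report the slowest and the average write time."""
    if not warnings:
        return {"count": 0, "max_ms": 0, "mean_ms": 0.0}
    times = [ms for ms, _ in warnings]
    return {
        "count": len(times),
        "max_ms": max(times),
        "mean_ms": round(sum(times) / len(times), 1),
    }


if __name__ == "__main__":
    print(summarize(parse_log_write_warnings(SAMPLE)))
```

In this excerpt the 3KB and 4KB writes took longer than the 1980KB write, so the elapsed time does not appear to track the write size.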

             

             

            I also ran a few I/O and network commands.
            Primary Database
            dbaentn1% iostat
               tty        sd0           sd1           sd2           sd3            cpu
            tin tout kps tps serv  kps tps serv  kps tps serv  kps tps serv   us sy wt id
               0   61 375  23   13    0   0    0    0   0    0    0   0    0    6  3  0 90
            dbaentn1%
            Ping from Primary to Secondary
            dbaentn1% ping -I 5 -v dban2
            PING dban2: 56 data bytes
            64 bytes from dban2XXXXXX icmp_seq=0. time=3.26 ms
            64 bytes from dban2XXXXXX icmp_seq=1. time=3.36 ms

             

            Secondary Database
            dban2% iostat
               tty        sd3           ssd0          ssd1         ssd74           cpu
            tin tout kps tps serv  kps tps serv  kps tps serv  kps tps serv   us sy wt id
               0   60   0   0    0  210   9   17  211  10   15    0   0    0   12 12  0 76
            dban2%
            Ping from Secondary to Primary

            dban2%  ping -I 5 -v dbaentn1
            PING dbaentn1: 56 data bytes
            64 bytes from dbaentn1XXXXXX   icmp_seq=0. time=3.47 ms
            64 bytes from dbaentn1XXXXXXX : icmp_seq=1. time=3.34 ms
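To compare latency between the working and non-working servers over a longer run, the round-trip times can be pulled out of the ping output with a short script. This is a minimal sketch that assumes the "time=... ms" format shown above; the function name is hypothetical. Two samples per direction is far too few to draw conclusions from, so a real check would need many more, ideally taken while redo is being shipped.

```python
import re
from typing import List

# Matches the "time=3.26 ms" field of a ping reply line.
RTT_RE = re.compile(r"time=(\d+(?:\.\d+)?)\s*ms")

# Primary-to-secondary replies as posted above (host names were redacted).
SAMPLE = """\
64 bytes from dban2XXXXXX icmp_seq=0. time=3.26 ms
64 bytes from dban2XXXXXX icmp_seq=1. time=3.36 ms
"""


def round_trip_times_ms(ping_output: str) -> List[float]:
    """Extract every round-trip time, in milliseconds, from ping output."""
    return [float(value) for value in RTT_RE.findall(ping_output)]


if __name__ == "__main__":
    samples = round_trip_times_ms(SAMPLE)
    print(f"{len(samples)} samples, max {max(samples)} ms")
```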


            Thank you very much for your valuable time; I highly appreciate it. Please let me know if you need any more details on this issue.

             

            Thanks,
            Krishna

            • 3. Re: Redo Transport problem with SYNC option
              Anar Godjaev

              Hi,

               

              Redo shipping client performing standby login
              *** 2014-01-06 10:14:39.091 68920 kcrr.c
              Logged on to standby successfully
              Client logon and security negotiation successful!
              *** 2014-01-06 10:14:39.091 61296 kcrr.c
              LNSb: connect status = 0
              *** 2014-01-06 10:14:39.111
              ksedmp: internal or fatal error
              ORA-00600: internal error code, arguments: [kcrrnsfwa.15], [4294967104], [512],
              [], [], [], [], []
              ----- Call Stack Trace -----
              calling              call     entry                argument values in hex
              location             type     point                (? means dubious value)
              -------------------- -------- -------------------- ----------------------------
              ksedmp()+728         CALL     ksedst()             000000017 ?
                                                                 FFFFFFFF7FFFCEBC ?
                                                                 000000000 ?

               

               

              In my experience, when an ORA-00600 error occurs I open an Oracle SR directly. I recommend that you open an SR.


              Thank you

              • 4. Re: Redo Transport problem with SYNC option
                896971

                Anar Godjaev wrote:

                 

                In my experience, when an ORA-00600 error occurs I open an Oracle SR directly. I recommend that you open an SR.


                Thank you

                 

                At this point, I must agree with Anar.

                • 5. Re: Redo Transport problem with SYNC option
                  981760

                  Thanks, everyone. I will raise an SR with Oracle.

                  • 6. Re: Redo Transport problem with SYNC option
                    Anar Godjaev

                    The right choice.