7 Replies Latest reply on Apr 19, 2012 5:45 AM by Billy~Verreynne

    EROR OGG-01224 TCP/IP error 101 (network is unreachable)

    931360
      I set up GoldenGate and it ran normally, with no errors, for several days.
      Then the pump suddenly abended with this error:

      ERROR OGG-01224 TCP/IP error 101 (network is unreachable)
      2012-04-18 10:14:08 ERROR OGG-01668 SRCPUMP.PRM : PROCESS ABENDING

      I have already set AUTORESTART in the MGR parameter file,
      so about 3 minutes after srcpump abended, it restarted automatically.

      but there is another error:
      10:17:08 ERROR OGG-01033 Oracle GoldenGate Capture for Oracle, srcpump.prm:
      There is a problem in network communication, a remote file problem, encryption keys for target and source do not match (if using ENCRYPT) or an unknown error.
      (Remote file used is /oracle/product/test/dirdat/bb000118, reply received is Unable to lock file "/oracle/product/test/dirdat/bb000118" (error 13, Permission denied).
      Lock currently held by process id (PID) 2980).

      Then srcpump restarted automatically
      and hit the lock error again.

      This loops again and again.
      After I waited about 2 hours, this suddenly appeared:

      INFO OGG-01057 Oracle GoldenGate Capture for Oracle, srcpump.prm: Recovery completed for all targets.

      But sometimes, while it is looping on the lock error, the "network is unreachable" error suddenly appears again, so the whole loop starts over from the beginning.
      Because of that, I have no idea when the pump will run normally again.

      I really don't understand why this happens. Please help me.

      Thanks
        • 1. Re: EROR OGG-01224 TCP/IP error 101 (network is unreachable)
          Herald ten Dam
          Hi,

          welcome.

          Questions about Oracle GoldenGate are better asked in their own forum: GoldenGate

          Looking at the error, I would first ask your network guys if there were any problems in the network. That's what the error message says.

          On Oracle Support there is note "GoldenGate Pump Abended and Can Not be Restarted Due to Locked Remote Trail [ID 1364965.1]", maybe that one can help. It suggests setting a timeout parameter on RMTHOST.

          Herald ten Dam
          http://htendam.wordpress.com
          • 2. Re: EROR OGG-01224 TCP/IP error 101 (network is unreachable)
            Billy~Verreynne
            The "network is unreachable" error is usually exactly what it says it is - a network problem. That assumes, of course, that the IP and subnet used for the destination are actually valid and reachable/routable, and not a typo.

            And to troubleshoot a network problem, you need someone with network experience and an understanding of your network. It can be caused by a wide range of issues - from a faulty port on the switch, or a NIC running in half duplex, to an o/s network configuration issue. And the error message provides very little to go on. So for starters, access to that destination IP needs to be tested on the server via CLI (e.g. running an ICMP echo, a trace route, dumping the server's routing table to see how the routing to that destination IP is configured, etc).
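As a complement to the CLI checks above, here is a minimal Python sketch that attempts a TCP connection from the source server and names the kind of failure. On Linux, errno 101 is ENETUNREACH, which matches the "TCP/IP error 101" in the log. The function names `classify_connect_error` and `probe` are made up for this illustration, and the host and port you pass in are your own; nothing here comes from GoldenGate itself.

```python
import errno
import socket


def classify_connect_error(exc: OSError) -> str:
    """Turn a failed connect() into a rough, human-readable diagnosis."""
    if isinstance(exc, socket.gaierror):
        return "name resolution failed"
    if isinstance(exc, socket.timeout):
        return "timed out (packets dropped or filtered?)"
    if exc.errno == errno.ENETUNREACH:
        # errno 101 on Linux: no route to the destination network
        return "network unreachable (no route to the destination subnet)"
    if exc.errno == errno.EHOSTUNREACH:
        return "host unreachable"
    if exc.errno == errno.ECONNREFUSED:
        return "connection refused (host is up, nothing listening on the port)"
    return f"other error: {exc}"


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Try one TCP connection and report 'ok' or a diagnosis string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError as exc:
        return classify_connect_error(exc)
```

Calling `probe(host, port)` with the target server's address and the port its Manager listens on (whatever your RMTHOST line specifies) during an outage should show whether the source server itself sees "network unreachable" or some other failure.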
            • 3. Re: EROR OGG-01224 TCP/IP error 101 (network is unreachable)
              931360
              Hi Herald and Billy,

              Thanks for the responses.
              I tried adding the TIMEOUT parameter to RMTHOST,
              but nothing changed and I still get the same error as before.

              I asked my network guy. He says it is true that the connection was bad,
              but only for several minutes, after which it went back to normal.

              He says the lock problem is not caused by the connection, since everyone can access the network normally,
              and that maybe it comes from GoldenGate.

              But I still don't know why and how GoldenGate creates a lock for the connection.
              • 4. Re: EROR OGG-01224 TCP/IP error 101 (network is unreachable)
                Billy~Verreynne
                Cannot comment on GoldenGate - which is why you should consider raising the problem in that forum as the issue is not a general database issue.

                As for locking - I do not understand what you are trying to explain with that.

                There is no locking as such at TCP level. Multiple processes can open client sockets to the same destination IP and destination (server) port. Likewise, a single process can also open multiple such client sockets to the same server. So one client socket cannot lock/prevent another client socket from using the same server and port to connect to.

                For server ports - only a single process can open a specific port on a specific IP address. The error that will result when that server port is attempted to be opened as a listening end point again, is something like "port in use" - not "network unreachable".

                The "network unreachable" error means exactly that in my experience - the destination subnet cannot be reached.

                And as I've mentioned, there can be many issues that can cause this error to appear and disappear. None of them related to Oracle specifically in my experience.
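The point about client sockets versus listening ports can be checked with a small, self-contained Python sketch on the loopback interface. It is only an illustration of the TCP behaviour described above, not of anything GoldenGate does; the function name `demo` is made up, and the sketch assumes default socket options (no SO_REUSEPORT).

```python
import errno
import socket


def demo() -> dict:
    """Show that clients do not lock each other out, but listeners do."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(5)
    port = listener.getsockname()[1]

    # Two independent client sockets to the same server IP and port.
    c1 = socket.create_connection(("127.0.0.1", port), timeout=5)
    c2 = socket.create_connection(("127.0.0.1", port), timeout=5)

    # A second listening socket on the same IP and port is rejected.
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    bind_errno = None
    try:
        second.bind(("127.0.0.1", port))
    except OSError as exc:
        bind_errno = exc.errno

    result = {
        "port": port,
        "client1_peer_port": c1.getpeername()[1],
        "client2_peer_port": c2.getpeername()[1],
        "second_bind_errno": bind_errno,
    }
    for s in (second, c1, c2, listener):
        s.close()
    return result
```

Both clients connect to the same port, while the second bind fails with EADDRINUSE ("address already in use"), which is a different error from "network is unreachable".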
                • 5. Re: EROR OGG-01224 TCP/IP error 101 (network is unreachable)
                  931360
                  I am a newbie and am just trying to use GoldenGate to replicate the database.
                  While I was using it, errors like this suddenly appeared:

                  2012-04-19 09:09:19 ERROR OGG-01224 TCP/IP error 101 (network is unreachable)
                  2012-04-19 09:09:19 ERROR OGG-01668 Oracle GoldenGate Capture for Oracle, srcpump.prm: PROCESS ABENDING.

                  2012-04-19 09:12:10 INFO OGG-00975 Oracle GoldenGate Manager for Oracle, mgr.prm: EXTRACT SRCPUMP starting.
                  2012-04-19 09:12:10 INFO OGG-00965 Oracle GoldenGate Manager for Oracle, mgr.prm: EXTRACT SRCPUMP restarted automatically.
                  2012-04-19 09:12:10 INFO OGG-00992 Oracle GoldenGate Capture for Oracle, srcpump.prm: EXTRACT SRCPUMP starting.
                  2012-04-19 09:12:10 INFO OGG-00993 Oracle GoldenGate Capture for Oracle, srcpump.prm: EXTRACT SRCPUMP started.
                  2012-04-19 09:12:16 INFO OGG-01226 Oracle GoldenGate Capture for Oracle, srcpump.prm: Socket buffer size set to 200000000 (flush size 200000000).
                  2012-04-19 09:12:17 INFO OGG-01055 Oracle GoldenGate Capture for Oracle, srcpump.prm: Recovery initialization completed for target file /oracle/product/test/dirdat/bb000129, at RBA 982889.
                  2012-04-19 09:12:17 INFO OGG-01478 Oracle GoldenGate Capture for Oracle, srcpump.prm: Output file /oracle/product/test/dirdat/bb is using format RELEASE 10.4/11.1.
                  2012-04-19 09:12:17 ERROR OGG-01033 Oracle GoldenGate Capture for Oracle, srcpump.prm: There is a problem in network communication, a remote file problem,encryption keys for target and source do not match (if using ENCRYPT) or an unknown error. (Remote file used is /oracle/product/test/dirdat/bb000129, reply received is Unable to lock file "/oracle/product/test/dirdat/bb000129" (error 13, Permission denied). Lock currently held by process id (PID) 9000).
                  2012-04-19 09:12:17 ERROR OGG-01668 Oracle GoldenGate Capture for Oracle, srcpump.prm: PROCESS ABENDING.

                  I don't know why errors like this appear. I have already tried asking in the GoldenGate forum,
                  but nobody answered my question.

                  The network is unreachable for a while, then everyone can access it normally again.
                  But as a consequence, the GoldenGate extract cannot run again for about 2 hours; it keeps trying to restart automatically, but this error appears:

                  2012-04-19 09:12:17 ERROR OGG-01033 Oracle GoldenGate Capture for Oracle, srcpump.prm: There is a problem in network communication, a remote file problem,encryption keys for target and source do not match (if using ENCRYPT) or an unknown error. (Remote file used is /oracle/product/test/dirdat/bb000129, reply received is Unable to lock file "/oracle/product/test/dirdat/bb000129" (error 13, Permission denied). Lock currently held by process id (PID) 9000).
                  2012-04-19 09:12:17 ERROR OGG-01668 Oracle GoldenGate Capture for Oracle, srcpump.prm: PROCESS ABENDING.

                  The error continues like this for about 2 hours, and sometimes this appears again:
                  2012-04-19 09:09:19 ERROR OGG-01224 TCP/IP error 101 (network is unreachable)

                  So the errors do not stop until the connection has been in good condition, with no "unreachable" errors, for several hours.
                  But I am really not sure when exactly the connection is in good condition.
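One way to find out when the connection is actually good is to record a timestamped reachability result at a regular interval and then reduce that record to outage windows, which can be lined up against the OGG-01224 entries in the GoldenGate log. The sketch below covers only the reduction step; the function name `outage_windows` and the shape of the samples are assumptions made for this illustration.

```python
from datetime import datetime
from typing import Iterable, List, Optional, Tuple

Sample = Tuple[datetime, bool]  # (time of probe, was the target reachable?)
Window = Tuple[datetime, Optional[datetime]]  # (first failure, first success after it)


def outage_windows(samples: Iterable[Sample]) -> List[Window]:
    """Collapse time-ordered probe results into outage windows.

    The end of a window is the first successful probe after the outage,
    or None if the target was still unreachable at the last sample.
    """
    windows: List[Window] = []
    down_since: Optional[datetime] = None
    for ts, reachable in samples:
        if not reachable and down_since is None:
            down_since = ts  # outage begins
        elif reachable and down_since is not None:
            windows.append((down_since, ts))  # outage ends
            down_since = None
    if down_since is not None:
        windows.append((down_since, None))  # still down at end of record
    return windows
```

If the lock errors keep appearing long after the last outage window has closed, that would suggest the remaining problem is a leftover process on the target rather than the network.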
                  • 6. Re: EROR OGG-01224 TCP/IP error 101 (network is unreachable)
                    sb92075
                    Is anybody or anything else reporting any type of networking issues?

                    If the network has a problem, it would not pick on just GoldenGate.
                    • 7. Re: EROR OGG-01224 TCP/IP error 101 (network is unreachable)
                      Billy~Verreynne
                      928357 wrote:

                      The network is unreachable for a while then everyone can access normally again

                      And that is the core problem. And that is unacceptable from a network. That is what needs to be fixed. It is also something that I would consider a very serious problem.

                      It can be a script kiddie doing a DoS attack. It could be a buggy router IOS or config. It could be faulty networking h/w. None of these are issues that one can simply ignore while focusing attention instead on the GoldenGate process that crashed as a result.

                      but consequently for Golden Gate the extract cannot running again for about 2 hours and the extract always try to restarted automatically but appear error :
                      2012-04-19 09:12:17 ERROR OGG-01033 Oracle GoldenGate Capture for Oracle, srcpump.prm: There is a problem in network communication, a remote file problem,encryption keys for target and source do not match (if using ENCRYPT) or an unknown error. (Remote file used is /oracle/product/test/dirdat/bb000129, reply received is Unable to lock file "/oracle/product/test/dirdat/bb000129" (error 13, Permission denied). Lock currently held by process id (PID) 9000).

                      That sounds like a problem with the 1st run crashing and leaving behind orphaned processes and dirty resources. Who and what was process 9000 when the above error occurred when GoldenGate restarted? What handles (files and sockets) does it own? (the lsof command will be helpful). Does it own any network ports as listener end-points? (this can also prevent GoldenGate from restarting)
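To follow up on the questions about the lock holder, here is a minimal Python sketch that pulls the trail file and PID out of the OGG-01033 message and lists what that PID has open. Two assumptions are baked in: the message format is taken from the log lines quoted in this thread, and the file listing reads /proc/<pid>/fd, so it only works on Linux, roughly as a stand-in for lsof. Since the message is a reply about a remote file, the PID presumably belongs to a process on the target server, so the listing would have to be run there, as root or as the owner of that process. The names `parse_lock_error` and `open_files` are made up for this illustration.

```python
import os
import re
from typing import List, Optional, Tuple

# Pattern based on the OGG-01033 text quoted in this thread.
LOCK_RE = re.compile(
    r'Unable to lock file "(?P<path>[^"]+)".*?'
    r"held by process id \(PID\) (?P<pid>\d+)",
    re.DOTALL,
)


def parse_lock_error(message: str) -> Optional[Tuple[str, int]]:
    """Return (locked file, holder PID), or None if the text does not match."""
    m = LOCK_RE.search(message)
    if m is None:
        return None
    return m.group("path"), int(m.group("pid"))


def open_files(pid: int) -> List[str]:
    """List the targets of a process's file descriptors (Linux /proc only)."""
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            pass  # descriptor was closed while we were looking
    return targets
```

If the PID turns out to be a leftover process from the previous pump connection that still has the trail file open, that would fit the orphaned-process explanation above.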

                      I have never even seen GoldenGate, so cannot comment on it specifically - but I would expect a high-end replicator system to have a proper cleanup when it crashes. However, that is not always 100% guaranteed despite the best efforts of the s/w. Even an Oracle instance crash can leave orphaned processes and even the SGA behind.

                      In Oracle's case, a left-over SGA will prevent that instance from restarting - and one needs to manually and forcibly remove that shared memory area from server memory, prior to restarting the instance.

                      Your GoldenGate crash could be a similar (worst of the worst) situation - requiring manual intervention to clear resources and processes, prior to restarting the server s/w.

                      Of course - such crashes should be the exception. And in your case it seems to be the rule, due to whatever is happening on your network. Which is why GoldenGate issues are the symptoms. The problem is your network and that is what needs to be fixed.