I have recently been tasked with taking over the GG implementation at my company and am hoping there is a simple answer for my question.
I am concerned we are not following best practice and are writing directly to a remote trail file, but I am not sure how to prove this.
What is the "give away" or how can you tell if you are writing to a remote file directly or local trail file?
This is on an Oracle 11gR2 database.
Find your group name, aka Extract, Pump, or Replicat name. Run this: GGSCI> info all
Look at your parameter files for the process in question: GGSCI> view params groupname
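For reference, and purely as an illustrative sketch (the group names and lag values here are made up), INFO ALL output looks roughly like this:

```
GGSCI> info all

Program     Status      Group       Lag           Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     E001        00:00:00      00:00:05
REPLICAT    RUNNING     R001        00:00:00      00:00:08
```

A Pump shows up as just another EXTRACT row here, so INFO ALL alone won't tell you local versus remote; that's why the next step, checking the params, matters.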
For a local trail, you will see something like this: EXTTRAIL ./dirdat/ET
The EXTTRAIL parameter indicates that it is local and writing to the dirdat directory in the OGG installation folder.
For a remote trail file, it will look something like this: RMTTRAIL ./dirdat/et. That means that the process (most commonly just the Pump) is writing to a remote server based on the RMTHOST value and to that server's OGG dirdat installation folder.
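To put the two side by side, here's a hedged sketch (group names, host, and paths are illustrative, not from any real config):

```
-- Local trail: the Extract writes into its own installation's dirdat
Extract E001
ExtTrail ./dirdat/et

-- Remote trail: most commonly a Pump, shipping the trail over TCP/IP
Extract P001
RmtHost target.example.com, MgrPort 7809
RmtTrail ./dirdat/rt
```

The presence of RMTHOST together with RMTTRAIL is the give-away that the process is writing across the network.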
Let us know if that answers it for you.
Extract e001
UserID user1, Password fakepass
RmtHost testenviron.testnet.com, MgrPort 2000
RmtTrail ./dirdat/01
GetUpdatesbefore
TranLogOptions ExcludeUser user1
...

This is what my params come up with, which is what made me curious.
I feel like this is saying to the extract process to write directly to the remote trail on the remote host.
I assumed a local trail would follow the process of:
write to a local host
The Data Pump would hold all the correct information to actually transfer the data. In other words, a local Extract doesn't even know anything about a remote host.
Am I looking at this correctly?
You are correct. For a production configuration it's suggested to have the Extract write a set of trails to the local host first, with the EXTTRAIL parameter, then create another Extract process (called a Pump, not to be confused with Oracle's DB Data Pump) that will read from that local trail file and write another copy on the remote host via the RMTTRAIL and RMTHOST parameters.
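As a sketch of how that suggested layout gets created in GGSCI (all group names, trail names, and paths here are assumptions for illustration):

```
-- Primary Extract captures from the log and writes a local trail
GGSCI> add extract e001, tranlog, begin now
GGSCI> add exttrail ./dirdat/lt, extract e001

-- Pump reads that local trail and writes the remote copy
GGSCI> add extract p001, exttrailsource ./dirdat/lt
GGSCI> add rmttrail ./dirdat/rt, extract p001
```

The Pump's parameter file is then the one that carries RMTHOST and RMTTRAIL; the primary Extract only knows about its local EXTTRAIL.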
What you have now is fine for testing, but in a production environment, if the connection between the Extract and the target server were lost, the Extract would abend and have that much more work to catch up on once the network connection re-establishes. That's why it's best to put a Pump in the mix and let it negotiate the network breaks, should there be any.
If you are sending to a remote host, you will want to specify an absolute path rather than a relative one; you can never go wrong with that. What does ./some_directory mean compared to /u01/ogg/my_trails? You get the idea: the latter is much more informative than ./some_directory.
As mb_ogg said, you definitely want primary extract -- secondary extract (which is the data pump extract) -- replicat type of setup. The whole point of using GoldenGate is to avoid single points of failure, and going straight from extract to replicat introduces the possibility of a network error hosing your synchronization.
The reason we've chosen to do replication via remote trails is because the databases are located on the same box.
I understand that the box can't go down with just one of the databases staying up.
What risks do we run, though, if the box stays up but one of the remote databases we are replicating to goes down? Does the data just fill up in memory on the source side and then spill over onto disk?
Sorry, but I'm a little confused by your configuration. Can you list in detail your environment, databases, and OGG setup (1 installation or several for example). You said the databases are on the same box but then say 'one of the remote databases', so I'm just a little confused.
Ah yes sorry. I am also confused but will do my best to explain.
We have 1 box (box1) and 2 databases on it (DB1 and DB2), same Oracle version, 11gR2. I believe we have just 1 GG install.
In the GG param files, we are writing directly to the "remote" trail file. I was confused here because the "remote" side is actually the target database, which is physically located on the same box.
No use of datapump or across a network to another box.
So my question is: what risks, if any, do we run by writing directly to a "remote" trail file when the remote database is on the same physical box as the source?
Ok, thanks for the update.
First, since everything is local there's still no need for a RMTTRAIL; in fact, it could introduce issues, because the Extract still has to connect to the localhost port that Manager is running on, and that in turn spawns a Server Collector to handle the Extract and trail creation. So in your case I think it would be best to use an EXTTRAIL and eliminate an extra process and network call. You can then set up 1 to n Replicats to read from that local trail.
add extract exttest, tranlog, begin now
add exttrail ./dirdat/et, extract exttest
add replicat reptest, exttrail ./dirdat/et, checkpointtable ...
And the Extract would have this "EXTTRAIL ./dirdat/et" instead of this "RMTHOST... and RMTTRAIL ..."
Since the Replicat is running in the same OGG installation as the Extract, it will read from the same local trail that the Extract is writing. You could also use fully qualified paths for both the Extract and Replicat, such as /usr/app/goldengate/dirdat/et instead of just ./dirdat/et.
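Putting that together with the parameter file posted earlier in the thread, a local-trail version of that Extract might look something like this (a sketch only; the remaining options would carry over unchanged):

```
Extract e001
UserID user1, Password fakepass
ExtTrail ./dirdat/et
TranLogOptions ExcludeUser user1
...
```

Note there is no RMTHOST or MGRPORT line at all; the Extract no longer needs to know about any host.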
And for the question of the target database going down: only the Replicat will abend. You can include autorestart logic in the Manager parameter file, using AUTORESTART * for example (there are more options in the docs for that parameter too).
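For example, the Manager parameter file (mgr.prm) might carry something like this (the port and retry values are illustrative; check the AUTORESTART documentation for the full option list):

```
-- mgr.prm
Port 7809
AutoRestart Replicat *, Retries 3, WaitMinutes 5
```

That tells Manager to restart any abended Replicat up to 3 times, waiting 5 minutes between attempts.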
However, your current configuration, using the remote trail designation for the Extract, introduces extra potential causes of an outage beyond the source database going offline, each of which would make the Extract abend until the condition cleared.
Does that make sense?
I think it makes sense. I don't quite understand the extra network call, though. Isn't the difference there just:
Should Datapump make the network call or should the extract make it directly?
I agree that we should have the datapump in place, but I think it has already been set up this way and will continue this way. I think we will run into flexibility issues later.
DB1 talks to DB2 on box 1 with remote trail files. Works great.
Now we want DB1 to talk to DB4 on box 2. Suddenly we are in trouble, since DB4 on box 2 can't just have the datapump replicated easily; we have to create 2 extracts.
Thank you for taking the time to explain so clearly. I am just beginning but am taking the approach to try to learn as much as I can before diving in to test and prod!
The network call is due to the fact that you are using a remote trail versus a local trail for the Extract. The parameters RMTHOST and RMTTRAIL tell the Extract to connect to that host (even though it's local) over the port specified by MGRPORT. This in turn causes the Manager to take the call on that port and hand it off to a process called a Server Collector. If you just used the parameter EXTTRAIL in the Extract, and created the trail as an EXTTRAIL rather than a RMTTRAIL, there wouldn't be any network for the Extract to negotiate; it would simply write the trail out locally on the server (which is effectively what's happening with the remote trail anyway, since your RMTHOST is the local server).
A Pump (not the same as Oracle Datapump) would normally handle this if the target db server were different than the source.
And even if you do add DB4 to the mix on Box2, you can also add a Pump process that reads the same trail that the Replicat is reading and you wouldn't need another Extract.
I'll map it out as if you have this setup:
Box1 has DB1 and DB2
Box2 has DB4
Extract1 for DB1 on Box1 writes to Exttrail ./dirdat/et
Replicat2 for DB2 on Box1 reads from Exttrail ./dirdat/et and delivers transactions to DB2
Pump4 for Box2/DB4 on Box1 reads from Exttrail ./dirdat/et and writes to Box2 to Remote Trail ./dirdat/rt
Replicat4 for DB4 on Box2 reads from Exttrail ./dirdat/rt and delivers transactions to DB4
With a configuration like this, you can always add more Pumps on Box1 to read from Extract1's local Exttrail ./dirdat/et and send those trails to other target servers. In fact, Pump4 could also write to multiple target servers at once, although it's best to have 1 Pump write to 1 target server, again to reduce abends due to the network.
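Under that layout, Pump4's parameter file might look roughly like this (the host name and schema are assumptions, not from any real config):

```
-- Pump4 on Box1: read Extract1's local trail, ship it to Box2
Extract pump4
RmtHost box2.example.com, MgrPort 7809
RmtTrail ./dirdat/rt
PassThru
Table srcschema.*;
```

PASSTHRU is worth noting here: since a Pump is usually just moving data along unchanged, it can skip data definition lookups entirely.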