Tuxedo 11g (220.127.116.11.0) / Windows 2008 R2
We have a number of third-party Tuxedo servers. When the BBL kills and restarts them, the following is noted in the ULOG:
BBL.1168.8884.0: LIBTUX_CAT:541: WARN: Server ThirdParty_Grp/630 terminated
I understand that this warning is given when a process exits without calling any Tuxedo exit routines.
The memory used by these same Tuxedo servers grows quite large throughout the day, and I was wondering whether the poor practice of not calling Tuxedo exit routines is the root cause of the memory leaks.
If so, what evidence/proof/test is there that we can use to ask the third party to correct the code?
Edited by: cool.br33ze on 20-Mar-2013 04:43
Are you sure that the BBL actually "kills" them, or do they crash on their own and the BBL just discovers that they're gone (and restarts them)?
Are there any core files generated? Sometimes the Tuxedo server is unable to create a core file when it crashes due to directory permissions or file size quotas, but if you find a core file and "file core" tells you it was from one of the supplied servers I'd say you have "the smoking gun" right there.
Hope this helps,
To add to what Per said, if the BBL kills a server (which it normally only does when a server exceeds the service timeout value SVCTIMEOUT), then there should be an entry in the ULOG to that effect. I suspect, as Per says, that these servers have severe memory leaks and possibly memory corruption problems (although the latter is speculation on my part). Memory growth in a Tuxedo server is often related to not calling tpfree() on buffers the server owns. But without sources, the only option is likely TMTRACE, as was already mentioned.
Oracle Tuxedo Chief Architect
Thanks for the responses so far.
As these are third-party services, I don't have access to the code to check for myself.
Yes, the BBL is killing the server after a SVCTIMEOUT; the following is in the ULOG:
BBL.1168.8884.0: CMDTUX_CAT:1836: WARN: Server(7876) processing terminated with SIGKILL after SVCTIMEOUT
Straight after the LIBTUX_CAT:541 is logged.
There are multiple offending services, which tells me the problem isn't confined to one service where the process exits without calling any Tuxedo exit routines... (Are these just tpreturn() and tpexit()? It's been a while since I have written a Tuxedo service.)
...Or where the memory leaks are.
Will the TMTRACE show tpalloc() along with (if any) tpfree()?
You can definitely search for one specific ATMI call; see the TMTRACE documentation for details, where there's an example of logging all calls to tpacall().
As I'm not The Universal Master Of All RegExps I don't really know whether you can catch several different calls in the search expression.
On another note, I'd try looking into the processing time of your services. I'd say it's more probable that you have a performance problem in general rather than a major bug in the services (if they never call tpreturn() not much work will be done at all...).
What is the value of SVCTIMEOUT (the one that is exceeded every now and then)?
If you can add a "-r" in CLOPT you'll get statistics from all services executed in that particular server written to the stderr file (that might be specified with the -e option).
Using txrpt you can then get insight into the execution times for the services in question. If they are consistently near the SVCTIMEOUT, you may need to adjust the SVCTIMEOUT. If they vary widely, you may need to check for locks in the database or other reasons for "spiky" behaviour.
See servopts(5) for more info on what can be set in CLOPT in the UBBCONFIG (look for the -r and -e options), and txrpt(1) for more info on how to interpret the statistics that -r creates.
It is fairly common to write your own utilities for interpreting the stderr file. It's not really rocket science if you want to get, for instance, min and max values out of it, which, by the way, would make a nice enhancement to txrpt in the first place. Product Management: are you listening? :-)
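As a rough illustration of such a home-grown utility, here is a minimal Python sketch. It deliberately does not parse the -r stderr file itself, since the record layout should be taken from the servopts/txrpt documentation; it assumes you have already extracted (service name, elapsed seconds) pairs. The function name `summarize`, the input shape, and the 0.8 "near timeout" fraction are all hypothetical choices for this example, not anything Tuxedo provides.

```python
from collections import defaultdict


def summarize(samples, svctimeout, near_fraction=0.8):
    """Summarize per-service execution times.

    samples       -- iterable of (service_name, elapsed_seconds) pairs,
                     assumed to be extracted beforehand from the -r log
    svctimeout    -- the SVCTIMEOUT value in seconds from the UBBCONFIG
    near_fraction -- hypothetical threshold: a service is flagged when its
                     slowest call reached this fraction of SVCTIMEOUT
    """
    by_service = defaultdict(list)
    for service, elapsed in samples:
        by_service[service].append(elapsed)

    summary = {}
    for service, times in by_service.items():
        slowest = max(times)
        summary[service] = {
            "count": len(times),
            "min": min(times),
            "max": slowest,
            "mean": sum(times) / len(times),
            # True when the slowest call came close to (or past) the timeout
            "near_timeout": slowest >= near_fraction * svctimeout,
        }
    return summary
```

A service whose max is regularly flagged while its min and mean stay low points towards the "spiky" case (locks, contention) rather than towards a SVCTIMEOUT that is simply too tight.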
Hope this helps,
Well if all of the LIBTUX_CAT:541 messages are preceded by a CMDTUX_CAT:1836, then it would make sense that exit handlers aren't being called because the process is being killed with SIGKILL which as far as I know can't be caught, so all user exit handlers are bypassed. But what does that have to do with leaking memory? The memory leaks are most certainly not related to exit handlers not being called, as that would be the last thing a server does in any case before the process disappears.
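To check that "if" against a whole day's ULOG rather than by eye, something along these lines could be used. This is only a sketch in Python: it looks for the two catalog tags quoted in this thread and reports every LIBTUX_CAT:541 line that has no CMDTUX_CAT:1836 line within a few lines of it. The function name and the `window` of 5 lines are arbitrary assumptions, and it matches on line proximity only, not on server or process IDs.

```python
import re

# The two message tags quoted in this thread.
CAT_RE = re.compile(r"\b(LIBTUX_CAT:541|CMDTUX_CAT:1836)\b")


def unexplained_terminations(lines, window=5):
    """Return LIBTUX_CAT:541 lines with no CMDTUX_CAT:1836 line nearby.

    lines  -- ULOG lines in file order
    window -- assumed maximum distance (in lines, before or after) at which
              a SIGKILL message still counts as explaining the termination
    """
    tagged = []
    for idx, line in enumerate(lines):
        match = CAT_RE.search(line)
        if match:
            tagged.append((idx, match.group(1), line.rstrip("\n")))

    kill_positions = [idx for idx, tag, _ in tagged if tag == "CMDTUX_CAT:1836"]

    unexplained = []
    for idx, tag, text in tagged:
        if tag != "LIBTUX_CAT:541":
            continue
        # Keep the line only if no SIGKILL message sits close to it.
        if not any(abs(idx - k) <= window for k in kill_positions):
            unexplained.append(text)
    return unexplained
```

Feeding it the lines of the ULOG (for example via `open(path).readlines()`) and getting an empty list back would support the SVCTIMEOUT explanation; any lines it returns are terminations worth examining for a crash instead.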
You would want to trace service routines, tpreturn(), tpalloc(), tpfree(), and tpcall()/tpacall(). There is no tpexit() routine to trace.
Oracle Tuxedo Chief Architect
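Once such a trace has been captured, a small script can tally buffer allocations against frees. The Python sketch below makes an assumption about the trace text that should be verified against real TMTRACE output: that each traced call appears once with the function name immediately followed by an opening parenthesis. The function names here are hypothetical helpers, not part of Tuxedo.

```python
import re
from collections import Counter

# Buffer-management calls of interest; the name must be followed by "(".
CALL_RE = re.compile(r"\b(tpalloc|tprealloc|tpfree)\s*\(")


def count_buffer_calls(trace_lines):
    """Count tpalloc/tprealloc/tpfree call entries in trace text.

    Assumes each traced call shows up once as 'name(' in the trace;
    check this against actual trace output before trusting the numbers.
    """
    counts = Counter()
    for line in trace_lines:
        for name in CALL_RE.findall(line):
            counts[name] += 1
    return counts


def outstanding_buffers(counts):
    """Allocations minus explicit frees (a rough indicator, not proof)."""
    return counts["tpalloc"] - counts["tpfree"]
```

A positive difference is not proof of a leak on its own, because a buffer handed to tpreturn() is released by Tuxedo rather than by an explicit tpfree(). A difference that keeps growing across many requests in the same server would, however, be reasonable evidence to put in front of the third party.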