2 Replies Latest reply: Sep 6, 2013 1:20 PM by Bobfinan - Oracle-Oracle

    Server dies and then fails to start !!


      I am not an expert on Tuxedo but have been facing an issue in our environment.

      We have a cron job that runs ud32 every 5 minutes; the call is basically a file transfer to another system.


      ud32 < /fhs_integration/fhstux/env/am_scripts_prod/kog_send_param >>/var/fhs/log/send_kog.$TIMESTAMP 2>&1



      FFML_VERSION    1

      FFML_RH_SIZE    9

      FFML_RECORDS    0



      Service FHS_TRANS_KOG runs on Server FHS_KOG.


      Now the problem is that the server FHS_KOG does not stay up: it dies after a while.


      • The ud32 output then shows the following, because the server is no longer present:

      CMDTUX_CAT:991: ERROR: Can't send buffer TPENOENT - no entry found

      CMDTUX_CAT:991: ERROR: Can't send buffer TPENOENT - no entry found


      • In the ULOG I find only the following information:

      231923.aapfhst1!XML_PROXY.504032.1.0: INFO 0 Service FHS_PUBLISH returned ok, return buffer trace generated

      231923.aapfhst1!XML_PROXY.504032.1.0: INFO 0 Generated XML string (len:2172)

      235529.aapfhst1!BBL.1171696.1.0: LIBTUX_CAT:541: WARN: Server AM_GROUP/60 terminated

      235529.aapfhst1!BBL.1171696.1.0: LIBTUX_CAT:557: INFO: Server AM_GROUP/60 being restarted

      235529.aapfhst1!FHS_KOG.1212492.1.0: 09-05-2013: Tuxedo Version 9.1, 64-bit

      235529.aapfhst1!FHS_KOG.1212492.1.0: LIBTUX_CAT:262: INFO: Standard main starting

      235529.aapfhst1!FHS_KOG.1212492.1.0: LIBTUX_CAT:250: ERROR: tpsvrinit() failed

      235529.aapfhst1!restartsrv.856280.1.-2: 09-05-2013: Tuxedo Version 9.1, 64-bit

      235529.aapfhst1!restartsrv.856280.1.-2: server AM_GROUP/60: CMDTUX_CAT:579: ERROR: Cannot restart a server - unknown process creation error: 23

      235529.aapfhst1!restartsrv.856280.1.-2: server AM_GROUP/60: CMDTUX_CAT:587: INFO: Cannot restart server, scheduling for cleanup

      235529.aapfhst1!cleanupsrv.1880262.1.-2: 09-05-2013: Tuxedo Version 9.1, 64-bit

      235529.aapfhst1!cleanupsrv.1880262.1.-2: server AM_GROUP/60: CMDTUX_CAT:1073: WARN: Client process 1638494 - dropped message because server died, SERVICE=FHS_KOG

        • 1. Re: Server dies and then fails to start !!



          So is the problem that the server crashes, that Tuxedo is unable to restart it

          after a crash, or both? Are you able to start it manually?


          Check whether the server dumps a core file (look in the directory reported by "tmunloadcf | grep APPDIR"). If it does,

          the core may be helpful to the server author.
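
          The core-file check above could be scripted roughly as follows. This is only a sketch: it assumes tmunloadcf prints the setting as APPDIR="/path" (possibly a colon-separated list, of which the first entry is taken) and that core files are named core or core.*; the helper names appdir_from_config and list_cores are made up for illustration.

```shell
# Extract the first APPDIR directory from tmunloadcf-style output on stdin.
# Assumes a line of the form: APPDIR="/some/dir" or APPDIR="/dir1:/dir2"
appdir_from_config() {
    sed -n 's/^[[:space:]]*APPDIR="\{0,1\}\([^"]*\)"\{0,1\}.*$/\1/p' |
        head -n 1 | cut -d: -f1
}

# List core files found directly under the given directory.
list_cores() {
    for f in "$1"/core "$1"/core.*; do
        [ -f "$f" ] && ls -l "$f"
    done
    return 0
}

# Typical use (TUXCONFIG must be set for tmunloadcf to work):
#   list_cores "$(tmunloadcf | appdir_from_config)"
```

          If the OS is configured to write cores elsewhere or under another name, the pattern in list_cores needs adjusting.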


          Check whether the server produces its own log file. Perhaps it can create one

          if configured properly?


          If "tpsvrinit() failed", then you might want to analyze the server boot with system

          tools (truss on AIX, strace/ltrace on Linux).
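
          One possible way to capture such a trace is to run the boot under the tracer with fork-following enabled. The sketch below only prints the command it would run. It assumes tmboot launches the server as a child process, that truss is the tracer on AIX and strace elsewhere, and it uses an output path chosen purely for illustration; tracer_for and trace_boot_cmd are hypothetical helper names.

```shell
# Pick a system-call tracer for the given OS name (as printed by `uname -s`).
tracer_for() {
    case "$1" in
        AIX|SunOS) echo "truss" ;;
        *)         echo "strace" ;;
    esac
}

# Print the command line that would trace the boot of one server.
# -f follows child processes and -o writes the trace to a file
# (both truss and strace accept these options).
trace_boot_cmd() {
    server="$1"
    out="${2:-/tmp/${server}.trace}"
    echo "$(tracer_for "$(uname -s)") -f -o $out tmboot -s $server"
}

trace_boot_cmd FHS_KOG
```

          The last system calls before the process exits, especially failing open or connect calls, usually point at what tpsvrinit() was trying to do.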


          If the server is in a resource manager group, check whether the start problem

          is caused by a database/MQ/other connection problem.


          Contact the server author.

          • 2. Re: Server dies and then fails to start !!
            Bobfinan - Oracle-Oracle


            "error: 23" is an OS error. It is platform dependent, but in general I think the errno is defined as:

            #define     ENFILE 23     /* File table overflow */
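
            To confirm the meaning on the actual host, the errno headers can be searched for the value. A small sketch follows; the helper name errno_name is hypothetical, and the header location varies by platform (for example /usr/include/sys/errno.h on many Unix systems, /usr/include/asm-generic/errno-base.h on Linux).

```shell
# Print the names of macros that are #defined to the given number,
# searching the header files passed as the remaining arguments.
errno_name() {
    num="$1"; shift
    awk -v n="$num" '$1 == "#define" && $3 == n { print $2 }' "$@"
}

# Example: errno_name 23 /usr/include/sys/errno.h
```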


            You probably ran out of OS resources when tpsvrinit() tried to do a file open (e.g. maybe initializing a DB connection).

            Either the OS kernel is not configured properly for the application architecture requirements or the resources are not being reclaimed properly.

            Since this is occurring after a server crash I would think it was the second problem. Clean up resources (e.g. ipc, dead processes).
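
            As a rough illustration of that cleanup check, the following filter shows which System V IPC objects and processes still belong to the application's user. It assumes the standard ipcs and ps tools; the helper name owned_by and the user name tuxedo are placeholders, not taken from this system.

```shell
# Keep only the lines of ipcs/ps-style output (read from stdin) in which
# some whitespace-separated field is exactly the given user name.
owned_by() {
    awk -v u="$1" '{ for (i = 1; i <= NF; i++) if ($i == u) { print; next } }'
}

# Typical use, run as a user allowed to see the objects:
#   ipcs -a | owned_by tuxedo     # leftover queues, shared memory, semaphores
#   ps -ef  | owned_by tuxedo     # leftover or defunct processes
```

            Anything still listed after the application has been shut down is a candidate for removal, for example with ipcrm.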

            Figure out why the server crashed. Make sure you have the latest available Tuxedo 9.1 rolling patch installed.


            Bob Finan