Tuxedo stalled on semop — oracle-tech

user479539 Member Posts: 3
edited September 2018 in Tuxedo

We have a billing system that uses Oracle Tuxedo as middleware. We have set up a daily processing schedule in which event files are processed and rated, and charges are generated.

This is all probably not very relevant to my question.

Sometimes, when the system is under heavier load, a process blocks on a semop call: it tries to decrement a Tuxedo semaphore whose value is already 0, and then waits indefinitely.

Example strace:

strace -p 73368
strace: Process 73368 attached

semop(2001403960, [{74, -1, 0}], 1

So the process is waiting for semaphore number 74 in the semaphore set with ID 2001403960 to become greater than 0 so that it can decrement it.
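
The kernel's own view of that semaphore can be cross-checked with `ipcs -s -i`, which on Linux (util-linux) prints the value, the number of waiters (ncount) and the PID of the last process to operate on each semaphore in a set. A minimal sketch, assuming the util-linux output layout with a `semnum value ncount zcount pid` header; `sem_state` is a hypothetical helper name, not a Tuxedo or system tool:

```shell
# sem_state SEMNUM: read `ipcs -s -i <semid>` output on stdin and print
# "value ncount zcount pid" for one semaphore number.
# Assumption: util-linux layout, where the per-semaphore table starts
# after a header row beginning with "semnum".
sem_state() {
    awk -v n="$1" '
        $1 == "semnum" { in_table = 1; next }        # header row of the table
        in_table && $1 == n { print $2, $3, $4, $5 } # the requested semaphore
    '
}

# Usage on the affected host (IDs taken from the strace above):
#   LC_ALL=C ipcs -s -i 2001403960 | sem_state 74
```

A value of 0 together with an ncount of 1 would match the blocked semop seen in strace.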

However, the bbsread command executed in tmadmin shows this (scroll to semaphore 74):

echo bbsread | tmadmin

IPC resources for the bulletin board on machine site1:

SHARED MEMORY:          Key: 0xd995
SEGMENT 0:
                         ID: 1616609373
                       Size: 3684012
         Attached processes: 73
      Last attach/detach by: 76371

This semaphore is the system semaphore
SEMAPHORE:              Key: 0xd995
                         Id: 2001338423
       | semaphore  | current |   last    | # waiting |
       |   number   | status  |  accesser | processes |
       |----------------------------------------------|
       |      0     |   free  |     76371 |     0     |
       |      1     |   free  |         0 |     0     |
       |      2     |   free  |     57923 |     0     |
       |      3     |   free  |     57923 |     0     |
       |      4     | locked  |     57923 |     0     |
       |------------|---------|-----------|-----------|

This semaphore set is part of the user-level semaphore
SEMAPHORE:              Key: IPC_PRIVATE
                         Id: 2001403960
       | semaphore  | current |   last    | # waiting |
       |   number   | status  |  accesser | processes |
       |----------------------------------------------|
       |      0     | locked  |         0 |     0     |
       |      1     | locked  |         0 |     0     |
       |      2     | locked  |         0 |     0     |
       |      3     | locked  |         0 |     0     |
       |      4     | locked  |         0 |     0     |
       |      5     | locked  |         0 |     0     |
       |      6     | locked  |     58053 |     0     |
       |      7     | locked  |     58054 |     0     |
       |      8     | locked  |     58083 |     0     |
       |      9     | locked  |     58084 |     0     |
       |     10     | locked  |         0 |     0     |
       |     11     | locked  |     58082 |     0     |
       |     12     | locked  |         0 |     0     |
       |     13     | locked  |     58085 |     0     |
       |     14     | locked  |     58088 |     0     |
       |     15     | locked  |     58087 |     0     |
       |     16     | locked  |         0 |     0     |
       |     17     | locked  |         0 |     0     |
       |     18     | locked  |         0 |     0     |
       |     19     | locked  |         0 |     0     |
       |     20     | locked  |         0 |     0     |
       |     21     | locked  |         0 |     0     |
       |     22     | locked  |         0 |     0     |
       |     23     | locked  |         0 |     0     |
       |     24     | locked  |         0 |     0     |
       |     25     | locked  |     58198 |     0     |
       |     26     | locked  |     58201 |     0     |
       |     27     | locked  |         0 |     0     |
       |     28     | locked  |     58203 |     0     |
       |     29     | locked  |     58205 |     0     |
       |     30     | locked  |         0 |     0     |
       |     31     | locked  |     58294 |     0     |
       |     32     | locked  |     58302 |     0     |
       |     33     | locked  |         0 |     0     |
       |     34     | locked  |         0 |     0     |
       |     35     | locked  |         0 |     0     |
       |     36     | locked  |         0 |     0     |
       |     37     | locked  |     58311 |     0     |
       |     38     | locked  |         0 |     0     |
       |     39     | locked  |         0 |     0     |
       |     40     | locked  |         0 |     0     |
       |     41     | locked  |         0 |     0     |
       |     42     | locked  |         0 |     0     |
       |     43     | locked  |         0 |     0     |
       |     44     | locked  |         0 |     0     |
       |     45     | locked  |         0 |     0     |
       |     46     | locked  |         0 |     0     |
       |     47     | locked  |         0 |     0     |
       |     48     | locked  |         0 |     0     |
       |     49     | locked  |         0 |     0     |
       |     50     | locked  |         0 |     0     |
       |     51     | locked  |         0 |     0     |
       |     52     | locked  |         0 |     0     |
       |     53     | locked  |         0 |     0     |
       |     54     | locked  |         0 |     0     |
       |     55     | locked  |         0 |     0     |
       |     56     | locked  |         0 |     0     |
       |     57     | locked  |         0 |     0     |
       |     58     | locked  |         0 |     0     |
       |     59     | locked  |         0 |     0     |
       |     60     | locked  |         0 |     0     |
       |     61     | locked  |         0 |     0     |
       |     62     | locked  |         0 |     0     |
       |     63     | locked  |         0 |     0     |
       |     64     | locked  |         0 |     0     |
       |     65     | locked  |         0 |     0     |
       |     66     | locked  |         0 |     0     |
       |     67     | locked  |         0 |     0     |
       |     68     | locked  |         0 |     0     |
       |     69     | locked  |         0 |     0     |
       |     70     | locked  |         0 |     0     |
       |     71     | locked  |         0 |     0     |
       |     72     | locked  |         0 |     0     |
       |     73     | locked  |     60179 |     0     |
       |     74     | locked  |         0 |     1     |
       |    Process dead! Semaphore possibly stuck!   |
       |     75     | locked  |         0 |     0     |
       |     76     | locked  |         0 |     0     |
...

tmadmin correctly reports that the semaphore is stuck because some process is dead.
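
With a semaphore table this long, the rows that have waiting processes are easy to miss. The following is only a sketch, assuming the bbsread layout shown above (an `Id:` line per semaphore set, then `|`-separated rows); `stuck_sems` is a hypothetical helper name, not part of Tuxedo:

```shell
# stuck_sems: read `bbsread` output on stdin and print
# "set-id semaphore-number last-accesser waiting" for every semaphore
# that has at least one waiting process.
# Assumption: the column order is number | status | last accesser | # waiting.
stuck_sems() {
    awk -F'|' '
        /Id:/ {                                   # remember the current set ID
            id = $0
            sub(/.*Id:[ \t]*/, "", id)
            sub(/[ \t]*$/, "", id)
        }
        NF >= 5 && $2 ~ /^[ \t]*[0-9]+[ \t]*$/ {  # data rows only, not headers
            if ($5 + 0 > 0) print id, $2 + 0, $4 + 0, $5 + 0
        }
    '
}

# Usage on the affected host:
#   echo bbsread | tmadmin | stuck_sems
```

Running this from cron during the heavy-load window could record when a semaphore first gains a waiter, which narrows down where to look in the ULOG.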

My problem is working out how it came to this and which process is dead. Tuxedo doesn't offer utilities that show this information.

There are no errors in the ULOG, and according to our logs no process has died.
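
One cheap cross-check on "no process has died" is to test whether the PIDs listed in the "last accesser" column still exist. This is a sketch only: `dead_pids` is a hypothetical helper name, and it cannot name the holder of a semaphore whose last accesser is reported as 0, as is the case for semaphore 74 here.

```shell
# dead_pids: read one PID per line on stdin and print those that no
# longer exist. `kill -0` sends no signal; it only tests for existence.
# Caveat: it also fails for processes you are not allowed to signal,
# so run this as the Tuxedo owner or as root.
dead_pids() {
    while read -r pid; do
        # skip blank lines, 0 and anything that is not a positive number
        [ -n "$pid" ] && [ "$pid" -gt 0 ] 2>/dev/null || continue
        kill -0 "$pid" 2>/dev/null || echo "$pid"
    done
}

# Usage on the affected host (column 4 of bbsread is the last accesser):
#   echo bbsread | tmadmin \
#     | awk -F'|' 'NF >= 5 && $4 + 0 > 0 { print $4 + 0 }' \
#     | sort -u | dead_pids
```

Because PIDs are recycled, a PID that still exists is not proof that it is the same process; comparing against `ps -o pid,lstart,args -p <pid>` start times helps there.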


I am looking for suggestions on how to get more info.

tmadmin -v

INFO: Oracle Tuxedo, Version 12.1.1.0, 64-bit, Patch Level 080

uname -a

Linux 3.10.0-862.3.2.el7.x86_64 #1 SMP Tue May 15 18:22:15 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux

Answers

  • Bartek Gasparski Member Posts: 9
    edited September 2018

    When this situation happens, try:

    ipcs -s -p

    Then you can match the semaphore with a PID.

    You can also look in the ULOG for warnings, such as 1511 from libtux and others (these are warnings, not strictly errors).

    You can turn on tracing

    in ubb:

    TMTRACE="*:ulog:dye"

    or by tmadmin

This discussion has been closed.