1 2 Previous Next 15 Replies Latest reply: Mar 3, 2014 11:03 AM by Todd Little-Oracle RSS

    To improve userlog TRACE use

    Faith Rolfsen

      I have been trying to identify flags which we can use to better capture details when we have TPESYSTEM errors. We typically use atmi and/or mssg, however, we are trying to capture more detail, describing how far along our outgoing message has proceded before it fails. I have see some discussion of similar logs but have not found how to generate them.  Please advise.

       

      Thank you much.

      Faith R.

        • 1. Re: To improve userlog TRACE use
          Bobfinan - Oracle-Oracle

          Hello Faith,

          The simplest way is to use the TMTRACE environment variable(e.g. export TMTRACE="*:ulog:dye:"). Shutdown Tuxedo, set TMTRACE, boot Tuxedo.  The servers will pick up the environment variable on boot up and add trace messages to the ulog.

          http://docs.oracle.com/cd/E35855_01/tuxedo/docs12c/rf5/rf5.html#wp1529614

           

          You can play around with the booting sequence to make it more selective if you wish. Otherwise you can try the tmadmin

          changetrace command to turn tracing on/off selectively.

          http://docs.oracle.com/cd/E35855_01/tuxedo/docs12c/rfcm/rfcmd.html#wp1971834

           

          Regards,

          Bob Finan

          • 2. Re: To improve userlog TRACE use
            Todd Little-Oracle

            Hi Faith!

             

            If you come across situations where you get TPESYSTEM errors and there is no additional information in the ULOG by default, please let us know.  We're trying to clean those up as tracking down TPESYSTEM errors can be difficult.  Our goal is that anywhere TPESYSTEM is generated causes a ULOG entry explaining why TPESYSTEM was being returned.

             

            So are you asking which trace categories to enable for TMTRACE or something else?  "atmi" for the trace category, "ulog" for the receiver, and ask for dyeing in order to correlate requests. Or were you asking for something else?

             

            How often are you getting TPESYSTEM errors and from what call are they being reported?

             

            Regards,

            Todd Little

            Oracle Tuxedo Chief Architect

             

            PS  Congratulations on surviving the polar plunge!

            • 3. Re: To improve userlog TRACE use
              Bobfinan - Oracle-Oracle

              Hi Faith,

              Another option is an environment variable, TM_LOG_ESYS, that was added in Tuxedo 9.1.

               

              http://docs.oracle.com/cd/E35855_01/tuxedo/docs12c/rf5/rf5.html#wp1027414

              This environment variable is used to add detailed information to ULOG when TPSYSTEM error occurs. TM_LOG_ESYS is limited to ATMI calls and can be set as follow:

              export TM_LOG_ESYS=all

              (other settings are mentioned on the link if you are interested)

               

              It covers most of Tuxedo (except for JRAD and JRLY I think)

               

              Regards,

              Bob

              • 4. Re: To improve userlog TRACE use
                Faith Rolfsen

                Bob and Todd

                Thanks for your follow up. Your details have been helpful.

                (Bob) I will look further into this last reference which you have given to me although I am confident this is not in the atmi logging. Concerning TRACE setting, I have seen many references to GWWS_TRACE_LEVEL but no explanation. If there is a link that tell me how to use this, that would be great.

                (Todd) These errors are from tpcall() and we have not seen anything in our logs from these errors but as you will see in my notes below, we have often had to filter out TRACE messages so we may have trashed ourselves. We are hoping to give you more complete info after what we complete our plans described below.

                 

                While trying to start a discussion on this topic, I was also trying to work with our OPS to find alternatives in testing to get better detail. We came up with a partial solution, so that we can better debug.

                 

                Firstly, our greatest issue has been that we have been trying reduce TRACE messages due to the large number of messages which were are already capturing. The overhead is great when we cannot replicate the issue at will and we are forced to sit and wait. Therefore, we often had indicators but not a complete ULOG excerpt.

                 

                To solve this, we have decided to work with another bulletin board (GWWS server only) which is dedicated to this issue, spilling out the TRACE messages separately until we catch this error. Because we are changing variables, we may no longer see these errors but we are hopeful. "Information drives good business" so before I trouble you further, let me get you more specific info.

                 

                With great regards.

                 

                Faith Rolfsen

                "The water is fine. Come on in!"

                • 5. Re: To improve userlog TRACE use
                  Todd Little-Oracle

                  Hi Faith,

                   

                  OK, so let me understand what you are doing and where you are getting these errors.  If I understood you correctly, you are making outbound web service calls via the GWWS and in some cases you are getting TPESYSTEM error returns.  Is that correct?  If so, you might want to eliminate any TMTRACE tracing that you are currently doing and enable the environment variable Bob mentioned to get more details about why TPESYSTEM is being returned.

                   

                  As for GWWS_TRACE_LEVEL, as far as I can tell with a quick look at the code values of 1,10,100 seem to make sense.  The higher the value, the more tracing detail you will get from the SALT gateway.

                   

                  Regards,

                  Todd Little

                  Oracle Tuxedo Chief Architect

                   

                  PS  The water looked pretty cold!

                  • 6. Re: To improve userlog TRACE use
                    Faith Rolfsen

                    Dear Friends

                     

                    I finally have some results from ULOGs but they still baffle me. The cause of the TPESYSTEM seems to be an OBFail error. There is not much immediate detail which I can see for this particular call but I see that the response to the preceding outgoing webservice was odd in each case. While the preceding call completed and returned successfully, the logs do not show a complete response. It appears as if memory may have been corrupted. This happens in each of the three case we have found and it appears to be related to the TPESYSTEM error.

                     

                    As before, this may be isolated to our use and I am searching for leaks, however, it seems odd that we did not get an error back when the soap response "appears" bad.

                     

                    A modified snippet of what the logs shows for the previous response is below

                    <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" >

                      <SOAP-ENV:Body>

                        <AInquiryOperationResponse>

                          <AInquiry>

                            <AInquiryRequest>

                              .....

                            </AInquiryRequest>

                            <AInquiryResponse>

                              .....

                            </AInquiryResponse>

                          </AInquiry>

                        </OperationResponse>

                      </SOAP-ENV:Body>

                    </SOAP-ENV:EnvelopeA/CardI

                    Thank you.

                    Faith Rolfsen

                     

                    PS - next time, maybe we will have to see what :Lake Michigan has to offer.

                    • 7. Re: To improve userlog TRACE use
                      Faith Rolfsen

                      Bob and Todd:

                       

                      I have now identified the problem. This is memory related and it is also clear to me that there was no issue with the handling of the salt preceding message. You have both helped me greatly.

                       

                      Thanks to our Tuxedo and SALT heroes.

                       

                      With Regards,

                      Faith Rolfsen

                      • 8. Re: To improve userlog TRACE use
                        Todd Little-Oracle

                        Hi Faith,

                         

                        Just curious, could you expand on what sort of memory related problems you were encountering?  It might help someone else in the future in a similar situation.

                         

                        Regards,

                        Todd little

                        Oracle Tuxedo Chief Architect

                         

                        PS  At the moment it has lots of ice to offer!

                        • 9. Re: To improve userlog TRACE use
                          Faith Rolfsen

                          Sure Todd

                           

                          I have been reading a lot of logs and xml to find patterns and I have found a few currious things. First of all, I misspoke earlier when I spoke of not having received any response back. I misinterpretted the flow of the logs.  However, I had referred to bad xml in a response. This is still the case.

                           

                          I have determined that our processing has followed this basic pattern:

                          1) basic salt request and response of webservice request_1 - completion is successful

                          2) subsequent requests and responses - also, all successful

                          3) one more request is made for of webservice request_x

                          4) the response to webservice request_x looks mostly good, however, the </Envelope> tag is corrupted so that it seems that ParseBodyElement is failing.

                          5) upon folow-up, I have found that a piece of the garbage which corrupted this last response came from the response to request_1.

                          - Please see the brief example (above) of xml like we are seeing.

                           

                          Whether this is coming from our webservice or internally, we are digging into this now. My reference to memory was based upon the fact that I had identified exact locations in preceeding xml where this garbage is coming from so it seems as if something is not being properly initialized "somewhere".

                           

                          Thanks for keeping me honest and I hope this clears things up about our status.

                           

                          Faith Rolfsen

                          PS. If you know of a polar event in your area, I may have to see about driving up. I have a bad ankle which could use a little ice

                          • 10. Re: To improve userlog TRACE use
                            Faith Rolfsen

                            Todd

                             

                            To further help us in our analysis, I have a new flavor of this same question.

                            When a webservice response gives us a ulog trace message of

                            "HTTP Parsing: Buffer length 367 --"
                            is the 367 length a calulated length or is it a length received with the xml, in a wrapper from somewhere. I ask that because of the following.

                             

                            Previously, I have described what appears to be corrupted webservice responses which we have received. These corrupted reponses fall in three different categories (based upon the xml string logged in ULOG TRACE).

                            1 - where there is something which appears to have overwritten the end of the response xml;

                            2 - where there has been a truncation of the response xml; or

                            3 - where there has been something appended after the response xml

                            However, when I look at the "Buffer length", this length does not match the length of the string which was logged. In each case, it appears that the Buffer length is actually the correct length for what would have been the correct xml. Therefore, if this is being calculated, I am trying to find out where this was calculated. That would be the last place where the correct xml was handled.

                             

                            (side note - categories 1 & 2 fail with OBFail due to bad parsing. 3 seems to work, ignoring the garbage because it is beyond the Buffer length of the actual xml)

                             

                            We are still trying to trace this data but this detail can help greatly.

                             

                            With Regards

                            Faith Rolfsen

                             

                            - additional curiousity: I am seeing extra references to

                                FSM State Transition: --WAKEUP-->EnTuxedoQ

                                FSM State Transition: --WAKEUP-->ReadyDeletion

                                FSM State Transition: --OK-->Delete

                            These appear to be outside of the usual activity. Is this some form of garbage collection similar to javas?

                            • 11. Re: To improve userlog TRACE use
                              Faith Rolfsen

                              Todd

                              We have verified that the incoming stream is complete, and does not match what the ULOG is showing us. Do you have any advise for a next step?

                               

                              With Regards

                              Faith Rolfsen

                              • 12. Re: To improve userlog TRACE use
                                Faith Rolfsen

                                Todd-

                                A question from one of our leads (Babu Khan): Is there an API which we can use which can give us the soap message which is being returned?


                                Thank you for your help.

                                Faith R. And Babu K.

                                • 13. Re: To improve userlog TRACE use
                                  Todd Little-Oracle

                                  Hi Faith,

                                   

                                  I don't think there is any API to do as you ask.  What I can suggest and would ask that you do is to capture a wire trace using a tool such as Wireshark or something similar.  If you could capture the SOAP request and SOAP response and send them to us that would help immensely.  Or if you have any kind of simple reproducer, that would also help.

                                   

                                  Regards,

                                  Todd Little

                                  Oracle Tuxedo Chief Architect

                                  • 14. Re: To improve userlog TRACE use
                                    Faith Rolfsen

                                    Todd-

                                    We could upload the response by openning a service request in the oracle support site. Do you have access to that if we provide a service request number?

                                     

                                    Faith R.

                                    1 2 Previous Next