2 Replies Latest reply: May 23, 2008 5:07 PM by 633637

    understanding eload errors for troubleshooting

    633637
      I created a session that runs 5 short scripts, with 96 VUs in total across all scripts. The session is set to ramp up by 20% every 2 minutes, with an iteration delay of 20 seconds. Generally the test runs fine, even when fully ramped up. After about 5 minutes I start to see errors, which clear up after a couple of minutes and may reappear at random several minutes later.
      Below are the errors we see and the general pattern they follow.

      All 96 VUs run fine for several minutes, then roughly 5-20 VUs get an "Internet error 12002 the request has timed out". That is often followed by a "504 Server Error code: Gateway Timeout" error. After a couple of minutes the errors clear up and the VUs carry on with further iterations.

      The web application server monitoring looks fine. We are trying to narrow down where the issue is coming from: the app server, the firewall, the database, or the eload test machine? (We are currently bypassing our proxy server to rule it out as a bottleneck.) Does anyone have suggestions? Could it be the eload machine itself? Would putting a network sniffer/monitor between the eload machine and the network be helpful, and if so, what do you recommend we watch?

      Thanks.
        • 1. Re: understanding eload errors for troubleshooting
          633637
          I think this means you have found a problem with the web application. This web site explains why a 504 might occur:
          http://pcsupport.about.com/od/findbyerrormessage/a/504error.htm

          It might be that your web application is making requests to other app servers that are running slowly. Try adding ServerStats for all the machines involved in your web application. You might be able to correlate some of the errors with performance problems on those servers.
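If you can export the error times and the per-server samples, the correlation step could look roughly like the sketch below. This is only an illustration: the function names, the timestamp format, and the idea of a single numeric metric with a threshold are all assumptions of mine, not anything eload or ServerStats produces in this form.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp format for both the error log and the server samples.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def minute_bucket(ts):
    """Truncate a timestamp string to the minute it falls in."""
    return datetime.strptime(ts, TS_FORMAT).strftime("%Y-%m-%d %H:%M")


def errors_per_minute(error_times):
    """Count how many errors were logged in each minute."""
    return Counter(minute_bucket(t) for t in error_times)


def suspicious_minutes(error_times, server_samples, threshold):
    """Find minutes that had errors AND a server sample at or above threshold.

    server_samples maps a (hypothetical) server name to a list of
    (timestamp, metric_value) pairs, e.g. CPU percent.
    Returns {minute: [(server, metric_value, error_count), ...]}.
    """
    errs = errors_per_minute(error_times)
    hits = {}
    for server, samples in server_samples.items():
        for ts, value in samples:
            minute = minute_bucket(ts)
            if minute in errs and value >= threshold:
                hits.setdefault(minute, []).append((server, value, errs[minute]))
    return hits
```

A server that keeps showing up in the result during the error windows would be the first one to look at more closely. An empty result does not clear a server, since the metric you picked may simply not be the one that is saturating.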

          -GateCity_QA
          • 2. Re: understanding eload errors for troubleshooting
            633637
            Thanks for the link. Like you, I suspect the problem is either on the web application server or in some network piece in between. So it's good we found it, but now we need to know where it is occurring. I have a couple more questions I hope you or someone can answer.

            I want to rule out my workstation (the one running eload). Has anyone received a 504 and had the problem be with the eload server itself, and not with the application being tested or the app server it is running on?
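One way to test the workstation theory would be to run a small independent probe from a second machine while the eload session is running. The sketch below is only an assumption about how that could be done in Python. The URL you pass in, the request count, and the delay are placeholders, and `probe` and `fetch_status` are hypothetical names.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=30.0):
    """Return the HTTP status code, or a string describing a client-side failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server (or a gateway in front of it) answered with an error, e.g. 504.
        return exc.code
    except (urllib.error.URLError, OSError) as exc:
        # No HTTP answer at all: connection refused, DNS failure, socket timeout.
        return "client-side failure: %s" % exc


def probe(url, count, delay_s, fetch=fetch_status, clock=time.time, sleep=time.sleep):
    """Request url `count` times and record (start_time, status, elapsed_seconds)."""
    results = []
    for _ in range(count):
        start = clock()
        status = fetch(url)
        elapsed = clock() - start
        results.append((start, status, elapsed))
        sleep(delay_s)
    return results
```

If the probe on the second machine sees 504s or long response times in the same minutes as the eload errors, the workstation is unlikely to be the cause. If the probe stays clean while eload reports errors, the eload machine or its network path deserves a closer look.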

            Does anyone have tips on how they narrowed the 504 down from here? Even though the problem is likely on one of the servers or somewhere in the network, and isn't specific to Empirix, I suspect others have also detected problems in their infrastructure when running eload. Any experiences or suggestions to help pinpoint where? We had performance monitoring on the web app servers. I found an article on this site recommending what to monitor, and I've passed it on to the web app server admins to reference before our next attempt.
            Thanks again!