5 Replies Latest reply on Mar 2, 2012 2:22 AM by Dude!

    Need to understand how system time is computed by mpstat

    Ryszard Styczynski-Oracle
      Hi!

      I'm having difficulties analyzing system utilization. Linux utilities (vmstat, iostat, mpstat) report system utilization at a level of 3-6%, but when it reaches 6% the system slows down visibly. Generally speaking, spikes up to 6% are visible in application response times. It's Linux 2.6.18 on Xeon X5677 (8 cores + HT). Linux reports 16 logical processors. Interestingly, the sum of %sys over all CPUs is always <100%.

      How is %sys computed by mpstat? Does it mean that 6% is this system's boundary?

      Thanks,
      Ryszard
        • 1. Re: Need to understand how system time is computed by mpstat
          Dude!
          Performance issues can be complex. Not everything is CPU related. The tools show statistics, which as far as I know will not tell you whether your server is properly configured or whether your system or application is working efficiently. A single process or thread, for instance, will only run on one CPU at a time, so if your application is not designed to split tasks and allow the system to use multiple processors, the number of CPUs in the system is meaningless.

          According to the man page: %sys shows the percentage of CPU utilization that occurred while executing at the system level (kernel). It does not include time spent servicing interrupts or softirqs - if that is what you mean by "system boundary".
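
          To make the man page definition concrete, here is a minimal sketch of how a tool like mpstat can derive %sys. It assumes the usual approach of taking two snapshots of the per-CPU counters in /proc/stat and dividing the growth of the "system" counter by the growth of all counters. The function names and the sample snapshots are made up for illustration; this is not the actual sysstat source.

```python
# Field order of a "cpu" line in /proc/stat (see proc(5)). The trailing guest
# fields are ignored here because guest time is already counted in "user".
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_proc_stat(text):
    """Return {cpu_name: {field: ticks}} for every 'cpu' line in /proc/stat text."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or not parts[0].startswith("cpu"):
            continue
        values = [int(v) for v in parts[1:1 + len(FIELDS)]]
        # Older kernels expose fewer columns; pad the missing ones with zero.
        values += [0] * (len(FIELDS) - len(values))
        stats[parts[0]] = dict(zip(FIELDS, values))
    return stats


def percent_sys(before, after):
    """Share of the interval spent in kernel mode, for each CPU line."""
    result = {}
    for cpu, new in after.items():
        old = before[cpu]
        delta = {f: new[f] - old[f] for f in FIELDS}
        total = sum(delta.values())
        result[cpu] = 100.0 * delta["system"] / total if total else 0.0
    return result


# Hypothetical snapshots of a 2-CPU machine, taken one interval apart.
SAMPLE_BEFORE = """cpu  1000 0 100 8000 0 0 0 0
cpu0 500 0 20 4000 0 0 0 0
cpu1 500 0 80 4000 0 0 0 0
"""
SAMPLE_AFTER = """cpu  1100 0 120 8080 0 0 0 0
cpu0 550 0 24 4046 0 0 0 0
cpu1 550 0 96 4034 0 0 0 0
"""

pct = percent_sys(parse_proc_stat(SAMPLE_BEFORE), parse_proc_stat(SAMPLE_AFTER))
for cpu in sorted(pct):
    print(f"{cpu:5s} %sys = {pct[cpu]:5.2f}")
```

          In the sample data cpu0 spends 4% and cpu1 spends 16% of the interval in the kernel, while the aggregate "cpu" line comes out at 10%, the average of the two. On a real system the two texts would be read from /proc/stat a second or so apart.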
          • 2. Re: Need to understand how system time is computed by mpstat
            Ryszard Styczynski-Oracle
            Hi Dude!

            I agree. You are right on every point.

            But my exact question is about how mpstat calculates %sys. I was probably not very precise with the term "system boundary". My second question was about the maximum CPU time that the kernel may use. I've discovered a very interesting thing related to the subject: the sum of the %sys column over all CPUs (as reported by mpstat) is always below 100%. It was measured during a quite heavy load test of the system.

            Take a look at the mpstat output below:

            CPU   %user  %nice   %sys  %iowait  %irq  %soft  %steal  %idle   intr/s
            all   48.88      0   4.66     0.04  0.05   0.44       0  45.93     8904
            0        24      0      1        0     0      0       0     75   1000.2
            1      28.8      0    7.8      0.2     0      0       0   63.2      0.4
            2     25.95      0  13.77      0.2     0    0.2       0  59.88      0.2
            3      34.4      0      1        0     0      0       0   64.6        0
            4      56.8      0    3.2        0     0      1       0     39    800.2
            5      53.2      0    2.2        0   0.2    0.6       0   43.8    571.4
            6      51.6      0    1.8        0   0.2    0.4       0     46    211.2
            7      35.2      0    3.6        0     0    0.2       0     61        0
            8     82.24      0  16.77        0     0    0.2       0    0.8        0
            9      30.2      0   11.8        0     0      0       0     58      0.4
            10     65.2      0    0.4        0     0      0       0   34.4      0.2
            11    35.53      0    0.8        0     0    0.2       0  63.47      0.4
            12    65.73      0   3.41        0     0    0.6       0  30.26    539.6
            13     60.8      0    2.6        0     0    1.2       0   35.4   1228.4
            14     58.2      0    2.2        0     0    1.4       0   38.2   1581.2
            15     74.4      0    2.2        0   0.2      1       0   22.2     2970

            Let's take the transposed %sys column:

            %sys     4.66     1     7.8     13.77     1     3.2     2.2     1.8     3.6     16.77     11.8     0.4     0.8     3.41     2.6     2.2     2.2

            all = 4.66%
            sum(%sys, 0..15) = 1 + 7.8 + 13.77 + ... + 2.2 + 2.2 = 74.55%

            My bet is that 74.55% is more realistic than 4.66%. How is this 4.66% calculated? What does it mean? And why is sum(%sys, 0..15) always below 100%?

            Thanks,
            Ryszard
            • 3. Re: Need to understand how system time is computed by mpstat
              Dude!
              I can only guess, but if you take 74.55 and divide it by 16 you get 4.66. So that output suggests that "all" means the average per CPU. Why is the sum always below 100? The average per CPU cannot be higher than 100. Your performance bottleneck could be elsewhere, e.g. I/O, memory/paging or hardware. %sys only shows time spent in the kernel. You will also have to take %user (application level) into account. If you sum %user, you will surely get over 100% in total.
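
              The division above can be checked in a few lines. This is only the arithmetic on the %sys values quoted earlier in the thread, not a statement about how mpstat computes the "all" row internally.

```python
# %sys values for CPU 0..15, copied from the mpstat output quoted in this thread.
sys_per_cpu = [1, 7.8, 13.77, 1, 3.2, 2.2, 1.8, 3.6,
               16.77, 11.8, 0.4, 0.8, 3.41, 2.6, 2.2, 2.2]

total = sum(sys_per_cpu)
average = total / len(sys_per_cpu)

print(f"sum = {total:.2f}%, average = {average:.2f}%")
# prints: sum = 74.55%, average = 4.66%
```

              The average matches the "all" row, which supports reading "all" as the mean over the 16 logical CPUs.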

              Edited by: Dude on Mar 1, 2012 3:54 PM
              • 4. Re: Need to understand how system time is computed by mpstat
                Ryszard Styczynski-Oracle
                Dear Dude!

                Thanks for the good observations and the very good discussion!

                It's always about simple math. 4.66*16=74.56. Well spotted! But I cannot agree with "The average per CPU cannot be higher than 100". You are wrong here. In my case it is the sum of %sys over all CPUs that is <100%, not the average. The average is in the "all" row and is below 6%, as 100/16=6.25. Good finding again! The conclusion from this analysis is the following:
                -----
                *Average CPU time spent by kernel cannot be bigger than 100/(number of CPU)*
                -----

                You are right saying "If the total sum of each CPU is less than 100..." but that is not my case. In my case the CPUs are quite busy with application load. Let's take the %user column from the previous measurement:

                %user     48.88     24     28.8     25.95     34.4     56.8     53.2     51.6     35.2     82.24     30.2     65.2     35.53     65.73     60.8     58.2     74.4

                all(%user) = 48.88%
                sum(%user, 0..15) = 24 + 28.8 + ... + 58.2 + 74.4 = 782%

                We have 782% user load! It means that almost 8 CPUs are fully used. all(%user)=48.88%, as 782/16=48.88. This is quite OK, as the application is multithreaded. The measurement was taken during a throughput test of a stateless service deployed on WebLogic. The system processed 45,000 requests during the 30 minutes of the test.
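
                The "almost 8 CPUs" reading can be reproduced with a small helper that turns a column of per-CPU percentages into fully-busy CPU equivalents. The helper name is made up for this sketch, and the numbers are the %user values from the mpstat output in this thread.

```python
def summarize(percentages):
    """Return (sum, average, fully-busy CPU equivalents) for a per-CPU column."""
    total = sum(percentages)
    return total, total / len(percentages), total / 100.0


# %user values for CPU 0..15 from the mpstat output quoted in this thread.
user_per_cpu = [24, 28.8, 25.95, 34.4, 56.8, 53.2, 51.6, 35.2,
                82.24, 30.2, 65.2, 35.53, 65.73, 60.8, 58.2, 74.4]

total, average, busy = summarize(user_per_cpu)
print(f"sum = {total:.2f}%, average = {average:.2f}%, busy CPUs = {busy:.2f}")
# prints: sum = 782.25%, average = 48.89%, busy CPUs = 7.82
```

                The small gap between 48.89 and the reported 48.88 is presumably rounding in the per-CPU values.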

                I still do not understand why the kernel cannot take more than 6% of average CPU time, but that is another story. It looks as if there is a limit in the scheduler: the kernel tries to be so nice to user processes that it does not overuse the CPU. I also still do not understand why, at 6% system load (100% as the sum of %sys over all CPUs), the application's response times are higher. Probably the kernel is so busy (already at 100% of its virtual maximum utilization) that it cannot take care of all its duties, and some of them (e.g. passing data from network buffers to the application) are postponed a little, which shows up as increased response time. Great!

                Good. Well done! The most important thing now is that I understand this "mystical 6%" as a real-life limit of the kernel's utilization.

                And I have one more question to ask in the future.

                Why does my load fill only 8 CPUs out of 16? Is Intel's Hyper-Threading not a good technology for WebLogic executing 50,000 requests during a 30-minute load test? There are no limits on executor threads, and WLS started more than 16 of them, filling the capacity of the CPU subsystem... interesting. It looks like 48.88% means 100% in this case! Interesting findings: 6% kernel load means 100%, and 50% user load means 100%. This time it's not simple math, it's pure magic :) Let's keep it for another discussion.

                Many thanks!
                Ryszard

                Edited by: Ryszard Styczynski on Mar 2, 2012 1:41 AM
                • 5. Re: Need to understand how system time is computed by mpstat
                  Dude!
                  I'm not sure if I can follow your numbers and conclusion. I would have to spend more time with it.

                  In general I think there is not much practical use in analyzing the statistics to troubleshoot performance issues, unless you are looking for something specific and have some data to compare it with. A system is very complex, and it is not possible to estimate what a system is doing based on the CPU load alone. What about DMA, etc.?

                  I think the following is quite interesting for understanding how the Linux kernel works: http://oreilly.com/catalog/linuxkernel/chapter/ch10.html

                  Perhaps you can improve your system's I/O performance by using a different I/O scheduler. The Linux kernel normally buffers and sorts all I/O to group requests for nearby data, e.g. disk writes. Oracle UEK uses the "deadline" scheduler. "noop" can increase your I/O performance when using external or virtual storage and free up the kernel. See the thread "Re: I/O scheduler in Oracle Linux 5.7".
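
                  For checking which I/O scheduler is active, a minimal sketch follows. It assumes the usual sysfs layout, where /sys/block/<device>/queue/scheduler lists the available schedulers and marks the active one with square brackets. The sample string and the function name are illustrative only.

```python
def parse_scheduler(text):
    """Split the contents of a queue/scheduler file into (active, available)."""
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]  # the bracketed entry is the active scheduler
            active = name
        available.append(name)
    return active, available


# Illustrative file contents; on a real system read them from
# /sys/block/<device>/queue/scheduler (the device name depends on the machine).
sample = "noop anticipatory deadline [cfq]"
print(parse_scheduler(sample))
# prints: ('cfq', ['noop', 'anticipatory', 'deadline', 'cfq'])
```

                  Switching is normally done by writing one of the listed names to the same file as root. Whether that helps depends on the storage behind the device.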

                  Regarding Hyper-Threading: from what I understand, a CPU core can only execute one process at a time. Two virtual cores per physical core allow two processes to run at once, which helps to use the available processing power more efficiently.
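
                  To see which logical CPUs share a physical core, one option is to group the entries of /proc/cpuinfo by their "physical id" and "core id" fields. The sketch below assumes the x86 layout of that file (other architectures may not expose these fields), and the sample text is made up.

```python
from collections import defaultdict


def group_by_core(cpuinfo_text):
    """Map (physical id, core id) to the list of logical processor numbers."""
    cores = defaultdict(list)
    current = {}
    # The extra empty string flushes the last record if the text has no trailing blank line.
    for line in cpuinfo_text.splitlines() + [""]:
        if not line.strip():
            if "processor" in current:
                key = (current.get("physical id", "0"),
                       current.get("core id", current["processor"]))
                cores[key].append(int(current["processor"]))
            current = {}
            continue
        name, _, value = line.partition(":")
        current[name.strip()] = value.strip()
    return dict(cores)


# Made-up excerpt: one socket, two cores, two hardware threads per core.
SAMPLE = """processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 1

processor : 2
physical id : 0
core id : 0

processor : 3
physical id : 0
core id : 1
"""

for core, cpus in sorted(group_by_core(SAMPLE).items()):
    print(f"socket {core[0]} core {core[1]}: logical CPUs {cpus}")
```

                  In the sample, logical CPUs 0 and 2 are siblings on one core and 1 and 3 on the other, so four busy "CPUs" in mpstat would correspond to only two physical cores.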