3 Replies Latest reply: Jul 15, 2012 1:36 AM by 946149

    High idle kernel CPU usage, continues to ramp up until system crash.

    931708
      Hardware list:
      ASUS P8B-M LGA 1155 Intel C204 Micro ATX Intel Xeon E3 Server Motherboard
      Intel G620 processor @ 2.6 GHz
      2x 4GB Kingston memory.
      Solaris 11

      My server crashes about once a week. Right after a restart, kernel CPU usage is around 2%, but it keeps climbing as the days go by: after 12 hours it is about 25%, and at 24 hours I'm at 45%. This is while idle, with no file transfers going on. I use the box as a file server for multiple Windows machines and run Ubuntu in VirtualBox with a TeamSpeak server installed.
      It seems like the <xcalls> unix`dtrace_xcall_func is eating up a lot of CPU time. I think it's a problem with Intel SpeedStep.
      I have found this fix for OpenSolaris:
      Workaround should be to enable cpu power management in poll mode, with "cpupm enable poll-mode" in /etc/power.conf, followed by running pmconfig:

      But apparently changing settings in /etc/power.conf doesn't do anything in Solaris 11.

      Does anyone know of a solution to this problem? I have searched for a fix, but cannot seem to find one.

      This shows the server at 24hrs uptime:
      top:
      load averages: 0.32, 0.38, 0.45; up 0+23:04:55 00:44:07
      117 processes: 115 sleeping, 1 running, 1 on cpu
      CPU states: 52.5% idle, 1.5% user, 46.0% kernel, 0.0% iowait, 0.0% swap
      Kernel: 1347 ctxsw, 229 trap, 771451 intr, 1651 syscall, 201 flt
      Memory: 8182M phys mem, 3245M free mem, 2048M total swap, 2048M free swap

      PID USERNAME NLWP PRI NICE SIZE RES STATE TIME CPU COMMAND
      5583 leftside 87 54 0 288M 262M run 162:59 9.28% VBoxHeadless
      1492 leftside 23 59 0 171M 118M sleep 6:04 0.31% java
      5595 root 62 2 19 429M 394M sleep 11:23 0.25% java
      920 root 1 29 10 20M 18M sleep 14:11 0.23% perl
      656 root 27 59 0 34M 17M sleep 0:03 0.10% fmd
      9566 leftside 1 59 0 4332K 2656K cpu/1 0:00 0.02% top
      9841 root 1 29 10 8868K 1200K sleep 0:00 0.02% sleep
      918 root 1 29 10 11M 8568K sleep 0:08 0.01% ksh93
      13 root 22 59 0 20M 19M sleep 0:32 0.01% svc.configd
      5581 leftside 9 59 0 28M 16M sleep 0:27 0.00% VBoxSVC
      662 root 17 59 0 13M 9076K sleep 0:11 0.00% smbd
      1505 leftside 2 59 0 30M 15M sleep 0:14 0.00% nwam-manager
      914 leftside 3 59 0 57M 42M sleep 1:11 0.00% Xorg
      1503 leftside 1 59 0 7420K 5276K sleep 0:02 0.00% xscreensaver
      1498 leftside 1 12 19 59M 36M sleep 0:16 0.00% updatemanagerno
      1497 leftside 1 59 0 136M 22M sleep 0:13 0.00% isapython2.6
      1496 leftside 1 59 0 131M 19M sleep 0:05 0.00% gnome-power-man
      9408 leftside 1 59 0 20M 5484K sleep 0:00 0.00% sshd
      412 root 34 59 0 14M 4468K sleep 0:06 0.00% nscd
      919 root 1 29 10 12M 9920K sleep 0:04 0.00% perl
      1395 root 1 59 0 6156K 2088K sleep 0:03 0.00% sendmail
      46 netcfg 4 59 0 3848K 2840K sleep 0:01 0.00% netcfgd
      563 daemon 5 59 0 11M 2996K sleep 0:01 0.00% idmapd
      253 root 6 59 0 11M 5528K sleep 0:03 0.00% devfsadm
      628 root 1 59 0 9628K 1772K sleep 0:02 0.00% in.routed
      1487 leftside 2 49 0 146M 51M sleep 0:10 0.00% nautilus
      1513 leftside 1 59 0 126M 13M sleep 0:08 0.00% vino-server
      11 root 14 59 0 22M 13M sleep 0:03 0.00% svc.startd
      1484 leftside 2 59 0 136M 36M sleep 0:02 0.00% gnome-panel
      1516 leftside 1 59 0 32M 19M sleep 0:02 0.00% clock-applet
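To track the ramp-up over time, the kernel figure can be pulled out of the "CPU states" line of a top snapshot. This is a minimal sketch, not something from the original posts: `kernel_pct` is a hypothetical helper name, and it assumes the line format shown in the top output above.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read top output on stdin and
# print the kernel percentage from the "CPU states" line.
# Assumes the format: CPU states: 52.5% idle, 1.5% user, 46.0% kernel, ...
kernel_pct() {
    sed -n 's/.*CPU states:.* \([0-9.][0-9.]*\)% kernel.*/\1/p'
}

# Example using the line posted above.
printf '%s\n' 'CPU states: 52.5% idle, 1.5% user, 46.0% kernel, 0.0% iowait, 0.0% swap' | kernel_pct
# prints 46.0
```

Feeding it one saved top snapshot per hour and logging the result with a timestamp would give a simple record of how fast kernel time grows after a reboot.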


      POWERTOP:
      Solaris PowerTOP version 1.2

      C-states (idle power) Avg Residency P-states (frequencies)
      C0 (cpu running) (86.1%) 1600 Mhz 73.6%
      C1 0.8ms (2.8%) 1700 Mhz 0.0%
      C2 1.0ms (2.2%) 1800 Mhz 0.0%
      C3 1.1ms (8.9%) 1900 Mhz 0.0%
      2000 Mhz 0.0%
      2100 Mhz 0.0%
      2200 Mhz 0.0%
      2300 Mhz 0.0%
      2400 Mhz 0.0%
      2500 Mhz 0.0%
      2600 Mhz 26.4%

      Wakeups-from-idle per second: 1080.1 interval: 5.0s
      no ACPI power usage estimate available

      Top causes for wakeups:
      33.3% (359.2) sched : <xcalls> unix`dtrace_xcall_func
      18.7% (202.4) <kernel> : vboxdrv`rtR0SemSolWaitTimeout
      10.7% (115.3) <kernel> : genunix`cv_wakeup
      9.3% (100.6) <kernel> : genunix`realitexpire
      9.3% (100.4) <kernel> : genunix`clock
      4.8% ( 52.0) <kernel> : genunix`lwp_timer_timeout
      4.3% ( 46.8) <kernel> : SDC`sysdc_update
      3.3% ( 35.5) sched : <xcalls> unix`speedstep_pstate_transitio
      2.2% ( 23.4) <interrupt> : e1000g#0
      1.0% ( 11.1) VBoxHeadless : <xcalls> unix`speedstep_pstate_transitio
      0.7% ( 7.5) <kernel> : ehci`ehci_handle_root_hub_status_change
      0.5% ( 5.0) <kernel> : c2audit`au_queue_kick
      0.4% ( 4.0) <kernel> : genunix`schedpaging
      0.3% ( 3.0) perl : <xcalls> unix`speedstep_pstate_transitio
      0.2% ( 2.0) <kernel> : cpudrv`cpudrv_monitor_disp
      0.2% ( 2.0) <kernel> : rpcmod`stp_flow_control
      0.1% ( 1.6) java : <xcalls> unix`speedstep_pstate_transitio
      0.1% ( 1.4) <kernel> : ip`tcp_timer_callback
      0.1% ( 1.2) <kernel> : sd`sd_pm_idletimeout_handler
      0.1% ( 1.0) <kernel> : e1000g`e1000g_local_timer
      0.1% ( 1.0) <interrupt> : ehci#0
      0.1% ( 1.0) <kernel> : TS`ts_update
      0.1% ( 1.0) <interrupt> : ehci#1
      0.1% ( 0.8) <kernel> : ip`squeue_fire
      0.0% ( 0.2) <kernel> : ip`nce_timer
      0.0% ( 0.2) <kernel> : ip`mld_slowtimo
      0.0% ( 0.2) <kernel> : ip`igmp_slowtimo
      0.0% ( 0.2) <kernel> : ahci`ahci_watchdog_handler
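To put a number on how much of the wakeup activity comes from cross calls, the percentages on the `<xcalls>` lines can be added up. The following is a rough sketch under the assumption that the PowerTOP "Top causes for wakeups" list has been saved as text in the layout shown above; `xcall_share` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read PowerTOP "Top causes for
# wakeups" lines on stdin and print the summed percentage of <xcalls> entries.
xcall_share() {
    awk '/<xcalls>/ { sub(/%/, "", $1); total += $1 }
         END { printf "%.1f\n", total }'
}

# Example with three of the lines posted above (two of them are xcalls).
printf '%s\n' \
    '33.3% (359.2) sched : <xcalls> unix`dtrace_xcall_func' \
    '18.7% (202.4) <kernel> : vboxdrv`rtR0SemSolWaitTimeout' \
    ' 3.3% ( 35.5) sched : <xcalls> unix`speedstep_pstate_transitio' | xcall_share
# prints 36.6
```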

      "sudo lockstat -gkIW sleep 60 | less"

      Profiling interrupt: 11672 events in 60.154 seconds (194 events/sec)

      Count genr cuml rcnt nsec Hottest CPU+PIL Caller
      -------------------------------------------------------------------------------
      8152 70% ---- 0.00 654 cpu[0]+10 switch_sp_and_call
      8149 70% ---- 0.00 654 cpu[0]+10 dispatch_softint
      3073 26% ---- 0.00 679 cpu[1] thread_start
      3028 26% ---- 0.00 682 cpu[1] idle
      3010 26% ---- 0.00 682 cpu[1] cpu_idle_adaptive
      3009 26% ---- 0.00 682 cpu[1] cpu_acpi_idle
      1757 15% ---- 0.00 700 cpu[1] acpi_cpu_cstate
      1252 11% ---- 0.00 657 cpu[1] cpu_idle_mwait
      1249 11% ---- 0.00 658 cpu[1] i86_mwait
      347 3% ---- 0.00 564 cpu[1]+2 cyclic_softint
      347 3% ---- 0.00 564 cpu[1]+2 av_dispatch_softvect
      345 3% ---- 0.00 565 cpu[1]+2 callout_realtime
      345 3% ---- 0.00 565 cpu[1]+2 cbe_low_level
      332 3% ---- 0.00 569 cpu[1]+2 do_splx
      325 3% ---- 0.00 567 cpu[1]+2 cyclic_reprogram
      325 3% ---- 0.00 567 cpu[1]+2 cyclic_reprogram_cyclic
      325 3% ---- 0.00 567 cpu[1]+2 cbe_restore_level
      310 3% ---- 0.00 567 cpu[1]+2 callout_heap_delete
      194 2% ---- 0.00 538 cpu[1] sys_syscall
      154 1% ---- 0.00 545 cpu[1] ioctl
      153 1% ---- 0.00 546 cpu[1] fop_ioctl
      153 1% ---- 0.00 546 cpu[0] (usermode)
      152 1% ---- 0.00 547 cpu[1] spec_ioctl
      151 1% ---- 0.00 548 cpu[1] cdev_ioctl
      148 1% ---- 0.00 548 cpu[1] VBoxDrvSolarisIOCtl
      102 1% ---- 0.00 539 cpu[1] supdrvIOCtlFast
      73 1% ---- 0.00 588 cpu[1] 0xfffffffff7e8ba37
      73 1% ---- 0.00 588 cpu[1] 0xfffffffff7e579c4
      61 1% ---- 0.00 605 cpu[1] 0xfffffffff7e52fad
      58 0% ---- 0.00 526 cpu[0] mutex_vector_enter
      55 0% ---- 0.00 602 cpu[0]+11 swtch
      52 0% ---- 0.00 504 cpu[0] syssysenter_post_swapgs
      47 0% ---- 0.00 513 cpu[0]+11 cv_waituntil_sig
      45 0% ---- 0.00 575 cpu[0]+11 supdrvIOCtl
      43 0% ---- 0.00 565 cpu[0]+11 0xfffffffff7e5875b
      40 0% ---- 0.00 529 cpu[0] untimeout_generic
      35 0% ---- 0.00 581 cpu[0]+11 0xfffffffff7e57dfe
      34 0% ---- 0.00 556 cpu[1]+2 callout_expire
      33 0% ---- 0.00 557 cpu[1]+2 callout_list_expire
      32 0% ---- 0.00 507 cpu[0]+11 sigtimedwait
      32 0% ---- 0.00 515 cpu[0] mutex_delay_default
      31 0% ---- 0.00 518 cpu[0] untimeout_default
      30 0% ---- 0.00 503 cpu[0]+11 cv_wait_sig_swap
      30 0% ---- 0.00 600 cpu[0]+11 0xfffffffff7e4ab87
      29 0% ---- 0.00 512 cpu[0]+11 cv_wait_sig_swap_core

      Edited by: 928705 on Apr 18, 2012 10:58 PM
        • 1. Re: High idle kernel CPU usage, continues to ramp up until system crash.
          931708
          I just disabled all of the C-states in the BIOS, and disabled SpeedStep, to see if this alleviates the problem. In theory it should at least take longer for the idle kernel time to ramp up, because the CPU now runs at a constant 2600 MHz instead of 1600 MHz.

          I've found an incredibly long thread from someone using OpenIndiana who seems to have the same problem I am having.

          https://www.illumos.org/issues/1333
          • 2. Re: High idle kernel CPU usage, continues to ramp up until system crash.
            946149
            I'm seeing the same problem (I have the same motherboard, but with 16 GB of RAM and a Xeon 1260L).

            I have changed the APIC timer mode by adding the line "set apix:apic_timer_preferred_mode = 0x0" to /etc/system (as described in the thread you linked to). This doesn't really seem to help: the idle kernel time is still going up, and I also notice that available memory slowly drops.

            Which exact settings did you change in the BIOS? Do you still see the problem?
            • 3. Re: High idle kernel CPU usage, continues to ramp up until system crash.
              946149
              A little update. Everything works fine after doing the following.

              Check the current APIC timer mode:
              ~# echo apic_timer::print apic_timer_t | sudo mdb -k
              Mode 0x2 gives the problem; it should be mode 0x0.

              Change the mode of the APIC timer by adding this line to /etc/system:
              set apix:apic_timer_preferred_mode = 0x0

              Then reboot the system and verify with:
              ~# echo apic_timer::print apic_timer_t | sudo mdb -k
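For anyone applying the /etc/system change on several machines, the edit can be scripted so that the line is only added once and a backup is kept. This is a minimal sketch, not part of the original fix: `add_tunable` is a hypothetical helper name, the `.bak` suffix is an arbitrary choice, and it assumes a POSIX grep that supports `-q`, `-x` and `-F` (on older Solaris releases that may mean /usr/xpg4/bin/grep).

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): append a line to a file only if
# that exact line is not already there, saving a backup copy first.
add_tunable() {
    file=$1
    line=$2
    # -x: whole-line match, -F: fixed string, -q: no output
    if grep -qxF "$line" "$file"; then
        echo "already present in $file"
        return 0
    fi
    cp "$file" "$file.bak" || return 1
    printf '%s\n' "$line" >> "$file"
}

# Usage (run as root, then reboot as described above):
# add_tunable /etc/system 'set apix:apic_timer_preferred_mode = 0x0'
```

Running it a second time leaves the file unchanged, which avoids duplicate entries piling up in /etc/system.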


              Case closed.