2 Replies Latest reply on Sep 20, 2010 1:32 PM by 843798

    high cpu usage but not sure why


      In the enterprise where i work we've been experiencing some problems. The cpu usage grows up to 100% and keeps so until we restart it. We are running a zk framework based web app on a tomcat server running on linux with jdk 1.6.0_12 and postgres as dbms.
      We made many java dumps and used Thread Dump Analyzer and confirmed it's not just an infinite loop in our code (that was the first thing we thought).

      We saw that each time that happened, a connection with the postgres was kept open and appeared idle in transaction:
      postgres  6833 19172  0 08:00 ?        00:00:17 postgres: hco hco idle in transaction
      I guess that should be related somehow, but i really don't know if it's related with the reason of the problem or is a consequence of it.

      This problem usually starts at 9 o clock in the morning, that's when clients start connecting.

      Something that called our attention too was that there were 75 sockets (java.io, not java.nio) open and they are always RUNNABLE. I guess socket reading threads should be blocked waiting, not runnable.
      "TP-Processor219" daemon prio=10 tid=0x0a854400 nid=0x30a7 runnable [0x271fe000..0x271fefb0]
         java.lang.Thread.State: RUNNABLE
              at java.net.SocketInputStream.socketRead0(Native Method)
              at java.net.SocketInputStream.read(SocketInputStream.java:129)
              at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
              at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
              at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
              - locked <0xaf3633d0> (a java.io.BufferedInputStream)
              at org.apache.jk.common.ChannelSocket.read(ChannelSocket.java:621)
              at org.apache.jk.common.ChannelSocket.receive(ChannelSocket.java:559)
              at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:686)
              at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:891)
              at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:690)
              at java.lang.Thread.run(Thread.java:619)
      Is that normal?. I read http://stackoverflow.com/questions/2978644/why-does-javas-socket-connect-consume-100-cpu-resources about a similar situation, but I don't know how to know what's the thread that is consuming most of the processing. How could i know it?.

      I'm sure we are not seeing something. I have some thread dumps taken during these failures if needed (i didn't include them here because they are too big).
        • 1. Re: high cpu usage but not sure why
          Attach VisualVM and profile the application. That will give you the hotspots. Sounds like you somewhere has busy waiting. You should also set VM parameters so that you print GC times. It's possible that you are running low on memory, and that you get lots of full GC.
          • 2. Re: high cpu usage but not sure why
            Thanks for answering fast, and sorry i haven't answered before.
            This error is random and (luckily) hasn't repeated since then.

            These are jvm non-defaults parameters:
            -server -Xmx2048m -XX:MaxPermSize=128m 
            Just in case it's useful, these are the last lines of some thread dumps we got from different moments when it failed:
             PSYoungGen      total 158080K, used 126335K [0xa6a70000, 0xb18e0000, 0xb4df0000)
              eden space 137344K, 81% used [0xa6a70000,0xad74e460,0xaf090000)
              from space 20736K, 71% used [0xaf090000,0xaff11b90,0xb04d0000)
              to   space 20416K, 0% used [0xb04f0000,0xb04f0000,0xb18e0000)
             PSOldGen        total 1864192K, used 1810276K [0x34df0000, 0xa6a70000, 0xa6a70000)
              object space 1864192K, 97% used [0x34df0000,0xa35c92a8,0xa6a70000)
             PSPermGen       total 60544K, used 58418K [0x2cdf0000, 0x30910000, 0x34df0000)
              object space 60544K, 96% used [0x2cdf0000,0x306fc990,0x30910000)
             PSYoungGen      total 163904K, used 129463K [0xa6a70000, 0xb33f0000, 0xb4df0000)
              eden space 124544K, 92% used [0xa6a70000,0xadad34e8,0xae410000)
              from space 39360K, 36% used [0xb0d80000,0xb1b8a810,0xb33f0000)
              to   space 40896K, 0% used [0xae410000,0xae410000,0xb0c00000)
             PSOldGen        total 1864192K, used 1787748K [0x34df0000, 0xa6a70000, 0xa6a70000)
              object space 1864192K, 95% used [0x34df0000,0xa1fc92e8,0xa6a70000)
             PSPermGen       total 60544K, used 57626K [0x2cdf0000, 0x30910000, 0x34df0000)
              object space 60544K, 95% used [0x2cdf0000,0x30636838,0x30910000) 
             PSYoungGen      total 209088K, used 197549K [0xa6a30000, 0xb4db0000, 0xb4db0000)
              eden space 187584K, 97% used [0xa6a30000,0xb1c396a8,0xb2160000)
              from space 21504K, 70% used [0xb2160000,0xb3041e30,0xb3660000)
              to   space 21120K, 0% used [0xb3910000,0xb3910000,0xb4db0000)
             PSOldGen        total 1864192K, used 1401715K [0x34db0000, 0xa6a30000, 0xa6a30000)
              object space 1864192K, 75% used [0x34db0000,0x8a68cc50,0xa6a30000)
             PSPermGen       total 55296K, used 52315K [0x2cdb0000, 0x303b0000, 0x34db0000)
              object space 55296K, 94% used [0x2cdb0000,0x300c6e98,0x303b0000) 
            I think there's still available heap space (anyway, I don't know very much about GC).

            Could garbage collection keep you processor at 100% forever? It's been like that for hours (from 9 in the morning until 2 am the next day when we finally were allowed to restart it), and it had lot of time available without any use.