1 2 Previous Next 15 Replies Latest reply: Jun 9, 2011 4:59 PM by 521469 RSS

    Full GC takes long time

    3004
      Hello,

      Since the last two weeks, we're experiencing Full GCs that take long time and block our application.

      Our platform is:
      Windows 2003 Server 64 bit
      Tomcat 6.0.18
      Java 1.6.0_u13.

      Total memory: 8Gb.
      -Xms4096m
      -Xmx6144m
      -Xss256k

      Old generation: min 2,66Gb max 4,0G
      Eden generation: min 1,5Gb max 2,0G
      Perm generation: min 57Mb max 84Mb


      Here a piece of our gc.log:

      21601.266: [GC [PSYoungGen: 2009225K->6206K(1942528K)] 3913618K->1913365K(4738752K), 0.0992347 secs] [Times: user=0.16 sys=0.03, real=0.11 secs]
      21611.268: [GC [PSYoungGen: 1942485K->5485K(1879040K)] 3849645K->1914490K(4675264K), 0.0986129 secs] [Times: user=0.11 sys=0.00, real=0.11 secs]
      21618.350: [GC [PSYoungGen: 1879021K->12392K(2060928K)] 3788026K->1930794K(4857152K), 0.2036000 secs] [Times: user=0.14 sys=0.02, real=0.20 secs]
      21624.869: [GC [PSYoungGen: 1935116K->8545K(2065600K)] 3853519K->1930702K(4861824K), 0.0924182 secs] [Times: user=0.05 sys=0.00, real=0.09 secs]
      21624.961: [Full GC (System) [PSYoungGen: 8545K->0K(2065600K)] [PSOldGen: 1922156K->635750K(2796224K)] 1930702K->635750K(4861824K) [PSPermGen: 38643K->38643K(60992K)], 1.3719842 secs] [Times: user=1.09 sys=0.00, real=1.36 secs]
      21634.887: [GC [PSYoungGen: 2048500K->10395K(2078912K)] 2684250K->646145K(4875136K), 0.0180670 secs] [Times: user=0.02 sys=0.05, real=0.01 secs]
      21645.165: [GC [PSYoungGen: 2074007K->8120K(2003392K)] 2709758K->648930K(4799616K), 0.0176943 secs] [Times: user=0.08 sys=0.00, real=0.03 secs]

      (...)

      25209.938: [GC [PSYoungGen: 1760190K->7344K(1707072K)] 4127764K->2376892K(4503296K), 0.0760053 secs] [Times: user=0.11 sys=0.03, real=0.06 secs]
      25218.120: [GC [PSYoungGen: 1707056K->10511K(1657728K)] 4076604K->2382964K(4453952K), 0.0704081 secs] [Times: user=0.13 sys=0.00, real=0.08 secs]
      25221.680: [GC [PSYoungGen: 1657679K->5731K(1982400K)] 4030132K->2385544K(4778624K), 0.0985384 secs] [Times: user=0.20 sys=0.00, real=0.09 secs]
      25226.968: [GC [PSYoungGen: 993426K->6083K(1982784K)] 3373239K->2389543K(4779008K), 0.5147901 secs] [Times: user=0.08 sys=0.00, real=0.52 secs]
      25227.483: [Full GC (System) [PSYoungGen: 6083K->0K(1982784K)] [PSOldGen: 2383460K->536034K(2796224K)] 2389543K->536034K(4779008K) [PSPermGen: 39765K->38756K(58816K)], 155.4030025 secs] [Times: user=1.34 sys=0.64, real=155.39 secs]
      25389.833: [GC [PSYoungGen: 1976624K->12915K(2023616K)] 2512658K->566671K(4819840K), 13.1263397 secs] [Times: user=0.11 sys=0.03, real=13.13 secs]
      25405.871: [GC [PSYoungGen: 2023579K->13857K(2031552K)] 2577335K->583736K(4827776K), 3.5796784 secs] [Times: user=0.03 sys=0.00, real=3.58 secs]
      25416.490: [GC [PSYoungGen: 2024544K->9059K(2074944K)] 2594423K->587581K(4871168K), 0.1184355 secs] [Times: user=0.06 sys=0.00, real=0.13 secs]
      25425.393: [GC [PSYoungGen: 2064144K->7415K(2075456K)] 2642665K->590222K(4871680K), 0.0441002 secs] [Times: user=0.02 sys=0.00, real=0.05 secs]
      25438.547: [GC [PSYoungGen: 2062519K->8857K(2075264K)] 2645326K->594735K(4871488K), 0.0301620 secs] [Times: user=0.01 sys=0.00, real=0.03 secs]


      The minor collections go perfect.
      The first full GC in this piece of log takes 1,36s, and the old generation passes from 1922156K->635750K
      But the second full GC (1 hour later) blocks the server for 155.39s, and the old generation pases from 2383460K->536034K

      The server load between execution of full GC was similar. Any differences I saw in this two moments:

      first full gc: number of Tomcat http threads : 230. The full gc provokes that Tomcat start new threads until get 500 (maxThreads)
      second full gc: number of Tomcat http threads : 500.

      I'm not an expert in tunning jvm. If somebody could help me, these pauses are killing the app and it's unacceptable for my client (for 2 min. the app does not work).
      If you need more information (threads stack, histograms, ... I have ...)

      Maybe I need to reduce the size of the old generation? Or change the garbage collector for old generation? The server cpu is going well, I prefer to consume more cpu in gc and avoid this pauses.

      Any help will be very very appreciated.

      Thanks in advance,

      Joan.
        • 1. Re: Full GC takes long time
          3004
          Hello,

          One more thing. This log I've pasted is from this afternoon. Well, now the jvm has executed another full GC (always one hour between GCs, more or less). This full GC goes after the last Full GC. This is the trace:

          28961.987: [GC [PSYoungGen: 2054017K->6914K(1983424K)] 4257293K->2214300K(4779648K), 0.0728403 secs] [Times: user=0.11 sys=0.00, real=0.08 secs]
          28970.603: [GC [PSYoungGen: 1983348K->6986K(1918976K)] 4190734K->2216438K(4715200K), 0.0667551 secs] [Times: user=0.16 sys=0.00, real=0.08 secs]
          28979.165: [GC [PSYoungGen: 1918919K->8004K(1858304K)] 4128371K->2220140K(4654528K), 0.0686901 secs] [Times: user=0.16 sys=0.00, real=0.08 secs]
          28983.402: [GC [PSYoungGen: 1027237K->5370K(1855616K)] 3239373K->2221123K(4651840K), 0.0664194 secs] [Times: user=0.14 sys=0.00, real=0.08 secs]
          28983.469: [Full GC (System) [PSYoungGen: 5370K->0K(1855616K)] [PSOldGen: 2215753K->689144K(2796224K)] 2221123K->689144K(4651840K) [PSPermGen: 38957K->38957K(57088K)], 1.5134663 secs] [Times: user=1.19 sys=0.11, real=1.50 secs]
          28993.228: [GC [PSYoungGen: 1850211K->10332K(1801600K)] 2539355K->699477K(4597824K), 0.0239241 secs] [Times: user=0.06 sys=0.00, real=0.03 secs]
          29000.337: [GC [PSYoungGen: 1801561K->6549K(1741376K)] 2490706K->701511K(4537600K), 0.0249274 secs] [Times: user=0.05 sys=0.00, real=0.03 secs]


          Just 1,50s, and old generation passes from 2215753K->689144K.

          These 3 Full GC are almost equals in terms of old generation space, but the second took 153s, and the first and third just a bit more than one second.

          Is this helpful?

          Thanks,

          Joan.
          • 2. Re: Full GC takes long time
            3004
            Hello,

            One more thing.
            Just with the second full gc, I was executing the VisualGC of the jVisualVM utility. Can this utility be interfiering with the executiong of the full GC?

            Joan.
            • 3. Re: Full GC takes long time
              3004
              Hello,

              And here the fourth full gc, exacly one hour later:

              32578.279: [GC [PSYoungGen: 1734788K->9151K(1680832K)] 4367089K->2643766K(4477056K), 0.0687853 secs] [Times: user=0.09 sys=0.00, real=0.08 secs]
              32585.482: [GC [PSYoungGen: 1669936K->11686K(1683392K)] 4304550K->2648422K(4479616K), 0.0373979 secs] [Times: user=0.09 sys=0.00, real=0.05 secs]
              32585.520: [Full GC (System) [PSYoungGen: 11686K->0K(1683392K)] [PSOldGen: 2636735K->708123K(2796224K)] 2648422K->708123K(4479616K) [PSPermGen: 39055K->39055K(54976K)], 4.3858719 secs] [Times: user=1.17 sys=0.00, real=4.38 secs]
              32593.221: [GC [PSYoungGen: 1671656K->11317K(1631680K)] 2379779K->719440K(4427904K), 0.0342173 secs] [Times: user=0.08 sys=0.00, real=0.05 secs]
              32599.246: [GC [PSYoungGen: 1631647K->10590K(1581824K)] 2339770K->722683K(4378048K), 0.0256662 secs] [Times: user=0.03 sys=0.00, real=0.03 secs]


              It took 4 seconds, and old passes from 2636735K->708123K.

              My application is a cache of webservices, and many objects are cached for hours. Are these objects stored in the old generation? Can they cause a low execution in gc? Because the minor collections just take 0,0x seconds, and, in fact, they recover the same space than the full GCs, more or less 1,5Gb. every time.

              Thanks,

              Joan.
              • 4. Re: Full GC takes long time
                3004
                Hello,

                I forgot to say that this is a 4-core server.

                Reading documentation about tunning jvm, it seems that I have several ways to try solving this problem:

                1. This long gc time just can occur in old or perm generation. I can see that many times perm generation is almost full, and this triggers full gc. I can try to increase permsize, and, at the same time, activate parallelOldGC to perform gc in many threads instead of just one.

                2. If this does not work, try to use CMS in order to avoid (at least) the long pauses. My cpu consuming is not very high, then a reasonable increment of cpu when performing gc is acceptable.

                Coud anybody give me an opinion about my problem??

                Thanks,

                Joan.
                • 5. Re: Full GC takes long time
                  3004
                  Hello,

                  Sorry for my continuous replies. Another full gc has occurred. This is the trace:

                  262631.531: [GC [PSYoungGen: 2071685K->10889K(2078464K)] 4711353K->2652961K(4874688K), 0.3392037 secs] [Times: user=0.08 sys=0.00, real=0.34 secs]
                  262640.608: [GC [PSYoungGen: 2071978K->11055K(2078784K)] 4714051K->2656136K(4875008K), 1.3267436 secs] [Times: user=0.06 sys=0.00, real=1.33 secs]
                  262649.880: [GC [PSYoungGen: 2072165K->17337K(2071872K)] 4717246K->2669454K(4868096K), 3.0493937 secs] [Times: user=0.16 sys=0.00, real=3.05 secs]
                  262658.069: [GC [PSYoungGen: 2071865K->14678K(2075840K)] 4723982K->2676078K(4872064K), 5.4598301 secs] [Times: user=0.08 sys=0.03, real=5.45 secs]
                  262666.592: [GC [PSYoungGen: 2069183K->15835K(2075776K)] 4730583K->2682691K(4872000K), 1.6894894 secs] [Times: user=0.11 sys=0.05, real=1.70 secs]
                  262673.352: [GC [PSYoungGen: 2070620K->14122K(2068992K)] 4737476K->2687230K(4865216K), 2.9711981 secs] [Times: user=0.06 sys=0.03, real=2.97 secs]
                  262682.891: [GC [PSYoungGen: 2068909K->10709K(2075712K)] 4742017K->2686673K(4871936K), 1.1160323 secs] [Times: user=0.19 sys=0.00, real=1.13 secs]
                  262691.733: [GC [PSYoungGen: 2065941K->10683K(2075584K)] 4741905K->2689212K(4871808K), 1.4898942 secs] [Times: user=0.05 sys=0.00, real=1.50 secs]
                  262701.512: [GC [PSYoungGen: 2065888K->9341K(2076480K)] 4744418K->2690073K(4872704K), 1.3835846 secs] [Times: user=0.13 sys=0.00, real=1.39 secs]
                  262709.896: [GC [PSYoungGen: 2065601K->10718K(2076480K)] 4746334K->2693212K(4872704K), 0.7815897 secs] [Times: user=0.05 sys=0.01, real=0.78 secs]
                  262718.681: [GC [PSYoungGen: 2067028K->10199K(2076672K)] 4749523K->2695508K(4872896K), 1.2951014 secs] [Times: user=0.06 sys=0.00, real=1.30 secs]
                  262727.940: [GC [PSYoungGen: 2067345K->10112K(2076608K)] 4752654K->2698129K(4872832K), 1.6785345 secs] [Times: user=0.08 sys=0.00, real=1.69 secs]
                  262736.092: [GC [PSYoungGen: 2067264K->10886K(2077184K)] 4755281K->2702237K(4873408K), 1.1985764 secs] [Times: user=0.06 sys=0.02, real=1.20 secs]
                  262744.626: [GC [PSYoungGen: 2069382K->11366K(2077248K)] 4760733K->2705795K(4873472K), 1.4323653 secs] [Times: user=0.13 sys=0.00, real=1.44 secs]
                  262754.913: [GC [PSYoungGen: 2069912K->11945K(2077824K)] 4764342K->2708689K(4874048K), 0.8340946 secs] [Times: user=0.03 sys=0.00, real=0.84 secs]
                  262765.038: [GC [PSYoungGen: 2072283K->11669K(2077888K)] 4769028K->2711204K(4874112K), 1.3040060 secs] [Times: user=0.09 sys=0.02, real=1.31 secs]
                  262776.885: [GC [PSYoungGen: 2072021K->9940K(2079360K)] 4771556K->2711919K(4875584K), 0.8646772 secs] [Times: user=0.09 sys=0.00, real=0.88 secs]
                  262783.713: [GC [PSYoungGen: 1173115K->8659K(2079168K)] 3875094K->2713127K(4875392K), 1.4434465 secs] [Times: user=0.09 sys=0.00, real=1.45 secs]
                  262785.157: [Full GC (System) [PSYoungGen: 8659K->0K(2079168K)] [PSOldGen: 2704467K->655075K(2796224K)] 2713127K->655075K(4875392K) [PSPermGen: 41376K->40988K(42048K)], 132.6088125 secs] [Times: user=1.47 sys=0.39, real=132.58 secs]
                  262925.100: [GC [PSYoungGen: 2062280K->16892K(2068992K)] 2717355K->683188K(4865216K), 8.9491611 secs] [Times: user=0.13 sys=0.00, real=8.94 secs]
                  262971.183: [GC [PSYoungGen: 2068978K->8052K(2074624K)] 2735273K->692990K(4870848K), 7.7833800 secs]

                  When a "bad" full GC occurs, I can see that just before it the minor collections start to perform bad. Normal values for minor collections are 0,0x seconds, and before a "bad" full GC these minor collections take seconds (1, 2, ... up to 5 seconds).

                  I can show an histogram of the first 25 entries before and after the full GC:

                  BEFORE
                  num #instances #bytes class name
                  ----------------------------------------------
                  1: 226135 718455272 [I
                  2: 1905686 637239584 [C
                  3: 122775 621626112 [B
                  4: 8218 538714736 [Lcom.sun.org.apache.xpath.internal.objects.XObject;
                  5: 160974 445403288 [Ljava.lang.Object;
                  6: 1759928 70397120 java.lang.String
                  7: 334672 42838016 com.sun.org.apache.xerces.internal.dom.DeferredElementNSImpl
                  8: 103592 42699248 [[I
                  9: 227966 41945744 java.net.SocksSocketImpl
                  10: 629974 35278544 com.vpfw.proxy.cache.data.B
                  11: 636515 30552720 java.util.concurrent.ConcurrentHashMap$HashEntry
                  12: 555354 26656992 java.util.HashMap$Entry
                  13: 89059 18034280 [Ljava.util.HashMap$Entry;
                  14: 230432 16591104 com.sun.org.apache.xerces.internal.dom.DeferredTextImpl
                  15: 225401 14425664 java.lang.ref.Finalizer
                  16: 222588 14245632 java.net.SocketInputStream
                  17: 334644 13385760 com.sun.org.apache.xerces.internal.dom.AttributeMap
                  18: 386837 12378784 java.util.concurrent.ConcurrentLinkedQueue$Node
                  19: 40226 11263504 [[Ljava.lang.Object;
                  20: 21646 11246112 [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
                  21: 684613 10953808 java.lang.Object
                  22: 227958 9118320 java.net.Socket
                  23: 218591 8743640 java.net.Inet4Address
                  24: 57810 8730528 <constMethodKlass>
                  25: 230658 7381056 java.io.FileDescriptor


                  AFTER
                  num #instances #bytes class name
                  ----------------------------------------------
                  1: 91917 650028776 [B
                  2: 1767088 612264368 [C
                  3: 176313 581838552 [I
                  4: 141278 322132688 [Ljava.lang.Object;
                  5: 1762956 70518240 java.lang.String
                  6: 367122 46991616 com.sun.org.apache.xerces.internal.dom.DeferredElementNSImpl
                  7: 239506 44069104 java.net.SocksSocketImpl
                  8: 633882 35497392 com.vpfw.proxy.cache.data.B
                  9: 639797 30710256 java.util.concurrent.ConcurrentHashMap$HashEntry
                  10: 606366 29105568 java.util.HashMap$Entry
                  11: 72730 20521256 [[I
                  12: 89136 18497696 [Ljava.util.HashMap$Entry;
                  13: 252946 18212112 com.sun.org.apache.xerces.internal.dom.DeferredTextImpl
                  14: 236339 15125696 java.lang.ref.Finalizer
                  15: 233891 14969024 java.net.SocketInputStream
                  16: 366906 14676240 com.sun.org.apache.xerces.internal.dom.AttributeMap
                  17: 429913 13757216 java.util.concurrent.ConcurrentLinkedQueue$Node
                  18: 43319 12129544 [[Ljava.lang.Object;
                  19: 719233 11507728 java.lang.Object
                  20: 21649 11246408 [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
                  21: 239498 9579920 java.net.Socket
                  22: 143 9317736 [Lcom.sun.org.apache.xpath.internal.objects.XObject;
                  23: 229531 9181240 java.net.Inet4Address
                  24: 57817 8735944 <constMethodKlass>
                  25: 242443 7758176 java.io.FileDescriptor


                  The big differencies is the "com.sun.org.apache.xpath.internal.objects.XObject;".
                  Before and after, the values are:

                  4: 8218 538714736 [Lcom.sun.org.apache.xpath.internal.objects.XObject;
                  22: 143 9317736 [Lcom.sun.org.apache.xpath.internal.objects.XObject;


                  Could this the cause of the full GC that takes long time?

                  Thanks,
                  Joan.
                  • 6. Re: Full GC takes long time
                    3004
                    Hello,

                    I hope that somebody could read these posts and help me... I'm going crazy.

                    I've added the +UseParallelOldGC, and the results seem to be worse:

                    13894.170: [GC [PSYoungGen: 2076385K->10584K(2080064K)] 4833833K->2773435K(4876288K), 0.0838418 secs] [Times: user=0.14 sys=0.05, real=0.09 secs]
                    13896.833: [GC [PSYoungGen: 904106K->7771K(2080128K)] 3666957K->2777443K(4876352K), 0.1076037 secs] [Times: user=0.08 sys=0.03, real=0.11 secs]
                    13896.948: [Full GC (System) [PSYoungGen: 7771K->0K(2080128K)] [ParOldGen: 2769672K->543026K(2796224K)] 2777443K->543026K(4876352K) [PSPermGen: 38532K->38103K(98304K)], 75.1807514 secs] [Times: user=2.58 sys=1.05, real=75.17 secs]
                    13975.117: [GC [PSYoungGen: 2063872K->13692K(2077568K)] 2606898K->556719K(4873792K), 0.0235777 secs] [Times: user=0.06 sys=0.00, real=0.03 secs]
                    13981.302: [GC [PSYoungGen: 2077549K->12020K(2080768K)] 2620576K->562785K(4876992K), 0.0333660 secs] [Times: user=0.09 sys=0.00, real=0.03 secs]

                    75 seconds, and old generation 2769672K->543026K

                    Thanks,
                    Joan.
                    • 7. Re: Full GC takes long time
                      843829
                      Hi,

                      Have you tried to play with GC parameters?
                      -XX:MaxGCPauseMillis, -XX:GCTimeRatio and generation size adjustments may solve your problem.
                      • 8. Re: Full GC takes long time
                        3004
                        Hello,

                        the problem was solved tunning generation sizes.

                        Thanks,

                        Joan.
                        • 9. Re: Full GC takes long time
                          843829
                          Hi

                          Please, how did you solve this issue??

                          I got the same error in my ac4j server instance.

                          Thanks a lot
                          • 10. Re: Full GC takes long time
                            521469
                            Hello,

                            In Tomcat 6 on Java SE 6, I'm experiencing the same problem. The difference is that the first full gc takes long time, that is from 30 seconds to 111 seconds.

                            I'm going to investigate gc trace in detail.

                            Osamu
                            • 11. Re: Full GC takes long time
                              865505
                              Do u mean.... "changing the permanant genaration size have made this issue out....?"

                              This is intresting....
                              • 12. Re: Full GC takes long time
                                802316
                                [Times: user=1.09 sys=0.00, real=1.36 secs]
                                [Times: user=1.34 sys=0.64, real=155.39 secs]
                                It is worth nothing that almost the same about of time was spent doing real work, i.e. the "user" time. The first one took 1.09 s and the second 1.34s to perform the GC. This could have used multiple threads, so the actual time is likely to be shorter. The problem is that to perform a Full GC, all the other threads had to reach a safe point so they could be stopped. A lot depends on how long it takes for all your threads to reach a safepoint (depends on what the threads are doing at the time) which why would see wild variation in the time for a Full GC. e.g. do you have a thread spending a long time in a native library?

                                The first thing I would do is upgrade your JVM. Java 6 update 13 is pretty old and there have been some significant optimisations since then.

                                I would also look at the Concurrent Mark Sweep collector which may behave very differently. (It may not have this problem at all)
                                • 13. Re: Full GC takes long time
                                  521469
                                  Hi,

                                  Sorry, let me correct it. It was same problem. The second full GC only takes a long time.

                                  Osamu
                                  • 14. Re: Full GC takes long time
                                    521469
                                    Hello,

                                    The environment is below:
                                    Windows 2008 Server
                                    Java SE v6 (latest), GC is default collector(serial)
                                    Tomcat v6 (latest)

                                    Between the first GC and the second one, the application makes a huge volume of objects. So I guess that the GC's first marking of objects takes a long time.

                                    Is it correct?
                                    Can I trace the GC's marking time and sweeping time, respectively?

                                    Thanks in advance.

                                    Osamu
                                    1 2 Previous Next