i am running into a gc issue wherein we see multiple promotion failure statements in the gc log.
these promotion failure statements in the gc logs also show a CMS pause, which i believe is for compacting the heap (?), and the pause time seems to be as bad as that of a full gc.
i suspect this is due to fragmentation of the heap.
i have the following questions:
1) does promotion failure happen while copying objects from eden -> survivor or from survivor -> tenured generation?
2) is there any way to avoid fragmentation or use heap compaction with CMS collector?
3) if we use default gc parameters and collector, is parallel compaction turned on by default or do we need to explicitly turn it on? i'm using jdk 6.
4) are there any tools (preferably open source) that allow us to analyze heap segments in detail, such as which objects reside in which segment in a given heap snapshot? i have checked some of the profilers, visualvm and visualgc, but they did not show detailed segment-specific analysis of the heap apart from the size of these segments and gc activity.
platform: linux (tried both 32 and 64 bit)
jvm: sun jdk 6 server class vm
heap size: max, min 3g, young gen 512m (as we create large number of temp objects), survivor ratio: 8
i'm a bit surprised not to find a suitable forum to discuss gc topics. is there any other forum category i need to cross-post this thread on?
1) Promotion failure can occur from eden to survivor when the survivor space is not large enough to take all the survivors. I'm not sure whether you can get a failure into the tenured space (it could be possible).
2) The Parallel GC and G1 collectors support heap compaction. The latter collector is somewhat experimental.
3) The Parallel GC is the default.
4) Not AFAIK.
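For question 3, one way to check which collectors a given JVM actually selected is to ask the JVM itself. The following is a minimal sketch using the standard `java.lang.management` API; the class name `ActiveCollectors` is made up for illustration. On HotSpot the reported names are typically "PS Scavenge" and "PS MarkSweep" for the parallel collector and "ParNew" and "ConcurrentMarkSweep" for CMS, but the exact strings depend on the JVM, so treat them as a hint rather than a guarantee.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class: lists the collectors registered by the running JVM.
public class ActiveCollectors {

    /** Returns the names of the garbage collectors the running JVM exposes. */
    public static List<String> collectorNames() {
        List<String> names = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Print each collector with its cumulative collection count and time so far.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Running this with the same command-line options as the application shows what the defaults resolve to on that machine.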
The basic problem appears to be that your eden/survivor space is not large enough to ensure that the number of objects passing to the tenured space is relatively small. You get these problems when too many objects pass from eden to tenured and then need to be cleaned up. Tenured works best for long-term objects. You should also look at using less memory and shorter-lived objects, or at increasing your young gen size. A memory profiler should help you with those.
Try using a young gen size of 2g or more and a survivor ratio of 4 or less. If that works try reducing it to see at what point you start seeing failures and full gcs.
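To see what these settings mean in terms of space, here is a rough arithmetic sketch. It assumes the usual HotSpot meaning of the survivor ratio (eden size divided by the size of one survivor space, with two survivor spaces in the young gen) and ignores the alignment rounding the JVM applies, so the real numbers will differ slightly. The class and method names are made up for illustration.

```java
// Hypothetical helper: approximate young gen layout from young gen size and survivor ratio.
public class YoungGenLayout {

    private static final long MB = 1024L * 1024L;

    /** Approximate size of ONE survivor space: young / (ratio + 2). */
    public static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    /** Approximate eden size: whatever is left after the two survivor spaces. */
    public static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    private static void print(String label, long heapBytes, long youngBytes, int ratio) {
        System.out.println(label
                + ": eden~" + edenBytes(youngBytes, ratio) / MB + "m"
                + ", each survivor~" + survivorBytes(youngBytes, ratio) / MB + "m"
                + ", tenured~" + (heapBytes - youngBytes) / MB + "m");
    }

    public static void main(String[] args) {
        long heap = 3072 * MB; // 3g heap, as in the original post
        print("current (512m young, ratio 8)", heap, 512 * MB, 8);
        print("suggested (2g young, ratio 4)", heap, 2048 * MB, 4);
    }
}
```

With a fixed 3g heap, every megabyte given to the young gen comes out of the tenured space, so the second configuration leaves roughly 1g for tenured instead of roughly 2.5g.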
BTW: Your temp objects don't appear to be very short lived and appear to be moving to the tenured space. Changing the GC would not change this problem (only the impact of memory fragmentation).
The 'gc' tag is more often on the HotSpot Virtual Machine forum, but also appears across many fora. ;)
thanks peter for the reply.
i guess G1 will take some time before it is stable. AFAIK currently it may not necessarily perform better than CMS.
as far as increasing young gen size, on the contrary when i was testing using 1g earlier, i ran into double the amount of promotion failures.
so it does not look that straightforward. too high or too low won't help. allocating too much space to the young gen will eat up tenured space, and then there is a chance of hitting full gc more often than now. reaching the golden number will take some time. i have used profilers for this exercise. analyzing heap snapshots does reveal segment sizes, but only for that point in time. if i could see the object content segment by segment, it would help me do some more rational calculations on the general space required for each segment.
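A partial workaround for the point-in-time limitation is to sample the per-segment usage repeatedly from inside the JVM. The sketch below uses the standard `java.lang.management` memory pool beans; the class name, sample count and interval are made up for illustration. It only reports sizes, not which objects live in which segment, so it does not answer the original question 4. Watching how the old/tenured pool grows between samples may still give a rough idea of how much is being promoted.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: samples used bytes per heap memory pool (eden, survivor, old gen).
public class HeapPoolSampler {

    /** Returns used bytes keyed by pool name, for heap pools only. */
    public static Map<String, Long> usedBytesByHeapPool() {
        Map<String, Long> used = new LinkedHashMap<String, Long>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage != null) { // getUsage() may return null for an invalid pool
                used.put(pool.getName(), usage.getUsed());
            }
        }
        return used;
    }

    public static void main(String[] args) throws InterruptedException {
        int samples = 3;        // illustrative values, tune as needed
        long intervalMs = 500L;
        for (int i = 0; i < samples; i++) {
            System.out.println(System.currentTimeMillis() + " " + usedBytesByHeapPool());
            Thread.sleep(intervalMs);
        }
    }
}
```

Pool names vary by collector, so any post-processing should match on substrings such as "Eden", "Survivor" and "Old Gen" rather than on exact names.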
If your new gen size is not large enough, I would expect the promotions to be proportional to the size of the new gen up to the point where the new gen takes longer to fill than the promoted objects live. If the new gen is large enough, you will get next to no promoted objects because there won't be any left alive.
The G1 collector works well if fragmentation of the tenured space is an issue.
I still believe your key problem is you are creating too many medium lived objects.