we have experiencing an strage issue after increasing our heap size and nursery in Production from
-Xms2816M -Xmx2816M -Xns550M to -Xms4096M -Xmx4096M -Xns1024M
We need to resize this because a new application is being promoted, and we have tested it in Pre-PRO
The thing is that after the new size and before deploying the new app,
all the 'Old Collection Reasons' are due to 'Artifical: Cleaning up Finished Threads' and the top of the HEAP (4GB) is never reached
[gcpause][Wed Feb 27 22:29:26 2013] [OC#621] [---] 1498,525 ms (611967,663000-611969,161000) OC
[gcpause][Wed Feb 27 22:29:26 2013] [OC#621] [con] 0,004 ms (611967,663000-611967,663000) OC:PreGC
[gcpause][Wed Feb 27 22:29:26 2013] [OC#621] [pau] 1453,742 ms (611967,663000-611969,116000) OC:Main
[gcpause][Wed Feb 27 22:29:26 2013] [OC#621] [con] 43,411 ms (611969,118000-611969,161000) OC:PostGC
[memory ][Wed Feb 27 22:29:26 2013] [OC#621] 611967,663-611969,161: OC 2982855KB->1227524KB (4194304KB), 1,499 s, sum of pauses 1453,742 ms, longest pause 1453,742 ms.
After examinig the correponding JFR we haven't any issue or hotspot in the code that could justify this behaviour, in fact the same that works right with 2,8 GB HEAP
Googling hasn't given us any answer and the Support mates are requiring us some more information, but they say this is normal (something that I don't agree at all)
Any suggestion will be welcome ... thanks in advance
java version "1.5.0_26"
Java(TM) Platform, Standard Edition for Business (build 1.5.0_26-b03)
Oracle JRockit(R) (build R28.1.1-14-139783-1.5.0_26-20101206-0241-linux-x86_64, compiled mode)
HT is: not supported by the CPU, not enabled by the OS, not enabled in JRockit.
CPU: Intel Core 2 SSE SSE2 SSE3 SSSE3 SSE4.1 Core Intel64
Family: Core 2
Model: Core 2
Name: Intel(R) Xeon(R) CPU X7460 @ 2.66GHz
Supports: On-Chip FPU
Supports: Virtual Mode Extensions
Supports: Debugging Extensions
Supports: Page Size Extensions
Supports: Time Stamp Counter
Supports: Model Specific Registers
Supports: Physical Address Extension
Supports: Machine Check Exceptions
Supports: CMPXCHG8B Instruction
Supports: On-Chip APIC
Supports: Fast System Call
Supports: Memory Type Range Registers
Supports: Page Global Enable
Supports: Machine Check Architecture
Supports: Conditional Mov Instruction
Supports: Page Attribute Table
Supports: 36-bit Page Size Extension
Supports: the CLFLUSH Instruction
Supports: the Debug Trace Store feature
Supports: ACPI registers in MSR space
Supports: Intel Architecture MMX Technology
Supports: Fast Float Point Save and Restore
Supports: Streaming SIMD extensions
Supports: Streaming SIMD extensions 2
Supports: Hyper Threading
Supports: Thermal Monitor
Supports: Streaming SIMD Extensions 3
Supports: 64-bit DS Area
Supports: MONITOR/MWAIT instructions
Supports: Virtual Machine Extensions
Supports: Enhanced Intel SpeedStep technology
Supports: Thermal Monitor 2
Supports: Supplemental Streaming SIMD Extensions 3
Supports: xTPR Update Control
Supports: Perfmon and Debug Capability
Supports: Direct Cache Access
Supports: Streaming SIMD extensions 4.1
Supports: LAHF/SAHF instruction support
Supports: Intel 64 Architecture
[memory ][Wed Feb 20 20:29:57 2013] Running with 32 bit heap and compressed references supporting 32GB heap.
[memory ][Wed Feb 20 20:29:57 2013] GC mode: Garbage collection optimized for throughput, strategy: Generational Parallel Mark & Sweep.
"... and the top of the HEAP (4GB) is never reached"
You could try watching the gctrigger (is the amount of free heap space that should be available when a concurrent garbage collection starts).
You can monitor the gctrigger value by using the output of the -Xverbose:memdbg (which seen from the logging you have already enabled).
The gctrigger can be set by using the -XXgcTrigger (http://docs.oracle.com/cd/E15289_01/doc.40/e15062/optionxx.htm#BABFCBCI).
This ('Old Collection Reasons' are due to 'Artifical: Cleaning up Finished Threads') is somewhat strange, you would expect as you noticed the
old collections to be triggered by failed object allocations (as the heap is not fully utilized as you noticed).
The word 'Artificial' looks like it is triggered by some monitoring tool used by JRockit, for example, when using the flight recorder
(http://docs.oracle.com/cd/E15289_01/doc.40/e15070/introduction.htm#CACCJBBB), message like: GC reason: Artificial, description: JFR Heap Block Statistics
Rene, thanks a lot for the quick response (or may I say ACE ?)
but two points:
- as I understand GcTrigger only applis for concurrent GC, and in fact we are having genpar (parallel)
( From the doc:
The garbage collector ignores the -XXgcTrigger value when it runs both parallel mark and parallel sweep, for example if you specify -Xgc:singlepar or -Xgc:genpar on the command line. )
- about the GC cause, we have obtained that from JFR and says TEXTUALLY 'Artifical: Cleaning up Finished Threads', check that says ARTIFICAL and that mentions cleaning finished threads, we really don't see drecreasing the number of threads, it keeps as allways (with 2,8GB)
Finally, you recommend to set memdebug to inspect deeply the GC reason, isn't it ?
Right now we're not able to see what's the code that generates HEAP demand as we do usually (GC General > GC Request Trace) but we need to increase HEAP size to cover application expectations
"as I understand GcTrigger only applis for concurrent GC, and in fact we are having genpar (parallel)"
Yep, "The garbage collector ignores the -XXgcTrigger value when it runs both parallel mark and parallel sweep, for example if you specify -Xgc:singlepar or -Xgc:genpar on the command line."
Why not set a dynamic collection scheme (http://docs.oracle.com/cd/E15289_01/doc.40/e15062/optionx.htm#i1011879):
throughput - The garbage collector is optimized for application throughput. This means that the garbage collector works as effectively as possible, giving as much CPU resources to the Java threads as possible. This might, however, cause nondeterministic pauses when the garbage collector stops all Java threads for garbage collection.The throughput priority should be used when non-deterministic pauses do not impact the application's behavior.
pausetime - The garbage collector is optimized for short pauses. This means that the garbage collection works concurrently with the Java application when necessary, in order to avoid pausing the Java threads. This inflicts a slight performance overhead to the application, as the concurrent garbage collector demands more system resources (CPU time and memory) than the parallel garbage collector that is used for optimal throughput. The target pause time is by default 500 msec. To change the default pause target, see -XpauseTarget.
On an application server going for throughput gives good results, when you want to reduce the stop-the-world times it might be beneficial to use pausetime with an appropriate pausetarget.
"...we need to increase HEAP size to cover application expectations"
It should be fine to increase the heap size (we are running JRockit with 16G in certain environments, with the pausetime as the dynamic garbage collection setting), i.e.,
-Xms16g -Xmx16g -Xns1024m -Xgc:pausetime -XpauseTarget:200ms (in this case the application contains a lot of session data which is placed in the old generation hence
the choice for 16G) as long as you keep the live data below 2/3 of the heap, the collection time introduced should be near the 200millis.
http://middlewaremagic.com/weblogic/?page_id=7132 (contains a bunch of posts on JVM tuning)