The customer is using an ER release of 5.0u6.
The ER is 1.5.0_06-erdist-2006-02-01. The bug addressed by the ER was 6367204.
The hardware is:
Dell, 4 x dual-core Xeon
32GB RAM
10x RAID 10 HDD
The OS is Windows Server 2003.
The java version and configuration:
JRE v.1.5.0_06-erdist-20060201
28GB heap
GC options: -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:ParallelGCThreads=7 -XX:NewSize=128M -XX:MaxNewSize=128M
-XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing
-XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=10
-XX:CMSMarkStackSize=8M -XX:CMSMarkStackSizeMax=32M -XX:+UseLargePages
-XX:+DisableExplicitGC
The application is a custom distributed database server based on
TCP/IP and Sleepycat DBJE
The symptoms:
After running smoothly for ~1-4 days straight under
constant but light load, the ParNew GCs jump from ~150 ms every 30
seconds to 5-20 seconds out of every 30 seconds. The start of the
degenerate ParNew GCs seems to mostly (but not always) coincide with the
start of a new CMS mark phase. The general pattern is to spend 20-90%
of the time in young GC, which eventually quiesces to acceptable
levels after ~4 hours of GC pain (and frequently restarts after the next
CMS sweep).
The load was constant and unvaried from our side, so we
don't see any application-level cause for the degenerate GC performance.
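To put the pause figures above on a common scale, the sketch below converts "pause time per 30-second interval" into a fraction of wall-clock time. The class and method names are illustrative only and are not part of the customer's application.

```java
public class GcOverhead {

    /** Fraction of wall-clock time spent paused, given the pause time per interval. */
    public static double fraction(double pauseMillis, double intervalMillis) {
        return pauseMillis / intervalMillis;
    }

    public static void main(String[] args) {
        double interval = 30000.0; // reported figures are per 30-second interval

        // Healthy behaviour reported above: ~150 ms of ParNew per interval.
        System.out.println("normal:          " + 100.0 * fraction(150.0, interval) + " %");

        // Degenerate behaviour reported above: 5 to 20 seconds of ParNew per interval.
        System.out.println("degenerate, low:  " + 100.0 * fraction(5000.0, interval) + " %");
        System.out.println("degenerate, high: " + 100.0 * fraction(20000.0, interval) + " %");
    }
}
```

That is roughly 0.5% of the time in young GC when healthy, against roughly 17% to 67% during the spikes, which is broadly in line with the 20-90% range quoted for the general pattern.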
The customer ran a test with 5.0u8, and the onset of the problem seemed to be pushed out.
The time to failure went to 48 hours for the initial 5-second spike, with
another day or so before hitting the ~20-second spikes.
They were running with large pages, so they ran a test without large pages on their ER of 5.0u6. The problem seemed to have gone away, but it returned many days later.
Turning off large pages therefore also seems to extend the time to failure, but
the problem still occurs eventually.
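Since the time to failure is measured in days, it may help to track the GC share of wall-clock time continuously rather than waiting for the spikes to be noticed. The following is a minimal sketch using the standard java.lang.management API (available since 5.0). It is not taken from the customer's setup: the class name, the three-sample run, and the one-second window are assumptions for illustration, and the code would have to run inside the server JVM (for example on a background thread) to observe that JVM's collectors.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

public class GcTimeSampler {

    /** Percentage of a wall-clock window spent in GC; 0 if the window is empty. */
    public static double percent(long gcMillis, long wallMillis) {
        return wallMillis <= 0 ? 0.0 : 100.0 * gcMillis / wallMillis;
    }

    /** Accumulated collection time in ms; the API returns -1 when undefined, mapped to 0 here. */
    static long timeOf(GarbageCollectorMXBean gc) {
        return Math.max(0L, gc.getCollectionTime());
    }

    /** Prints, per collector, the share of each sampling window spent collecting. */
    public static void sample(int samples, long periodMillis) throws InterruptedException {
        List<GarbageCollectorMXBean> gcs = ManagementFactory.getGarbageCollectorMXBeans();
        long[] last = new long[gcs.size()];
        for (int i = 0; i < last.length; i++) {
            last[i] = timeOf(gcs.get(i));
        }
        long lastWall = System.currentTimeMillis();

        for (int s = 0; s < samples; s++) {
            Thread.sleep(periodMillis);
            long now = System.currentTimeMillis();
            for (int i = 0; i < last.length; i++) {
                long t = timeOf(gcs.get(i));
                // Collector names are whatever the running JVM reports.
                System.out.println(gcs.get(i).getName() + ": "
                        + percent(t - last[i], now - lastWall) + " % of last window");
                last[i] = t;
            }
            lastWall = now;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        sample(3, 1000L); // illustrative values; a 30-second window would match the figures above
    }
}
```

The young-generation collector's line is the one to compare against the ParNew figures in this report. How the concurrent collector's time is accounted may differ, so its line should be read with care.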