Bug ID: JDK-6433335 ParNewGC times spiking, eventually taking up 20 out of every 30 seconds

JDK-6433335 : ParNewGC times spiking, eventually taking up 20 out of every 30 seconds

Type: Bug
Component: hotspot
Sub-Component: gc
Affected Version: 1.4.2,5.0u8

Priority: P3
Status: Resolved
Resolution: Fixed
OS: solaris_9,windows_2003
CPU: x86,sparc

Submitted: 2006-06-02
Updated: 2011-03-17
Resolved: 2006-11-14

Versions (Unresolved/Resolved/Fixed)

The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.

Other	JDK 6	JDK 7	Other
1.4.2_14,hs10Fixed	6u1Fixed	7Fixed	hs10Fixed

Related Reports

Duplicate :	JDK-6413516 - 4 unusual very long ParNew GC
Relates :	JDK-6237967 - CMS: ability to tweak demand forecasting and coalition policy to control fragmentation
Relates :	JDK-6446077 - Long GC times after Concurrent Mode Failure
Relates :	JDK-6459113 - CMS+ParNew: wildly different ParNew pause times depending on heap shape caused by allocation spread

Description

Customer is using an ER release of 5.0U6
The ER is 1.5.0_06-erdist-2006-02-01. The bug addressed was 6367204

The hardware is:
       Dell 4xDual core Xeon
       32GB RAM
       10x RAID 10 HDD
   The OS is Server 2003.
   The java version and configuration:
       JRE v.1.5.0_06-erdist-20060201
       28GB heap
       GC options: -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:ParallelGCThreads=7 -XX:NewSize=128M -XX:MaxNewSize=128M
-XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing
-XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=10
-XX:CMSMarkStackSize=8M -XX:CMSMarkStackSizeMax=32M -XX:+UseLargePages
-XX:+DisableExplicitGC

   The application is a custom distributed database server based on
TCP/IP and Sleepycat DBJE

   The symptoms:
       After running smoothly for ~1-4 days straight with
constant but light load, the ParNew GC's jump from ~150 ms every 30
seconds to 5-20 seconds out of every 30 seconds.  The start of the
degenerate ParNew GCs seem to mostly (but not always) coincide with the
start of a new CMS mark phase.  The general pattern is to spend 20-90%
of the time in young GC, which eventually quiesces down to acceptable
levels after ~4 hours of GC pain (frequently to re-start after the next
CMS sweep).
       The load was constant and unvaried from our side, so we
don't see any application-level cause for the degenerate GC performance.


They ran a test with 5.0u8 and the problem seemed to be pushed out.
The time to failure went to 48 hours for the initial 5 second spike and
another day or so to hit the ~20 second spikes.

They were running with large pages so they ran a test without it and with their ER 5.0u6 and the problem seemed to have gone away, but returned many days later.
Turning off large pages seem to have also extended the running but eventually
they still see the problem

Comments

EVALUATION See suggested fix section. This bug fix should be back ported along with that for 6459113.
13-09-2006
SUGGESTED FIX The following fixes were put back to 7.0 (b2?). They will be ported to 6u1 and 5u10, and as appropriate 1.4.2_XX, for all of which appropriate subCR's will be created: Event: putback-to Parent workspace: /net/jano.sfbay/export/disk05/hotspot/ws/main/gc_baseline (jano.sfbay:/export/disk05/hotspot/ws/main/gc_baseline) Child workspace: /net/prt-web.sfbay/prt-workspaces/20060912150719.ysr.bot/workspace (prt-web:/net/prt-web.sfbay/prt-workspaces/20060912150719.ysr.bot/workspace) User: ysr Comment: --------------------------------------------------------- Job ID: 20060912150719.ysr.bot Original workspace: neeraja:/net/spot/scratch/ysr/bot Submitter: ysr Archived data: /net/prt-archiver.sfbay/data/archived_workspaces/main/gc_baseline/2006/20060912150719.ysr.bot/ Webrev: http://analemma.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/main/gc_baseline/2006/20060912150719.ysr.bot/workspace/webrevs/webrev-2006.09.12/index.html Fixed 6433335: ParNewGC times spiking, eventually taking up 20 out of every 30 seconds Fixed 6459113: CMS+ParNew: wildly different ParNew pause times depending on heap shape caused by allocation spread Webrev: http://analemma.sfbay/net/spot/scratch/ysr/bot/webrev (a) 6433335: ParNew's chunking of the card-table makes use of block-start calls for chunk surgery prior to assignment. These calls are in addition to the usual ones where block-start calls are used to start the scanning of the first card of a range of dirty cards. The logarithmic BOT was expected to keep the cost of such calls, well, logarithmic, in the size of the block being described, i.e. O(log n), n the size of the block. Unfortunately, as implemented, the cost was not asymptotically logarithmic, but rather logarithmic up to n = 16^3 = 4K cards = 2 MB and linear i.e. O(n) for the balance. The problem in 6433335 was that under certain kinds of heap shapes and object sizes and lifetimes, the CMS heap could end up with large free blocks. The chunking of the card-table would then run into O(n) cost of block-start calls on these large free chunks. There are several orthogonal fixes to this problem, all of which are additive. This putback accomplishes the most major of those, which is to make the BOT asymptotically logarithmic. A consequence of that change, however, is that the "block-split" operation can now become potentially costlier. The previous asymptotic cost of a split was bounded by a constant. For moderate size blocks, the cost was O(p.log s), p+s=n, n the size of the block being split, p the size of the prefix of the split and s the size of the suffix. The actual cost was however dictated by a tight loop in which a BOT entry is first read and tested and then possibly modified, thus making for a large constant in the big-O. We modified the BOT fixup operation so that we do not need to do any BOT reads at all, just writes. This won back any potential loss from the new block-split cost which is asymptotically O(p.log s). This solved the customer's performance problem with no regression in performance elsewhere. (The main concern had been that the increased asymptotic cost of splitting would increase the variance in the ParNew times due to the large hit one might take as a result of splitting a large block to replenish the free lists. Too, the initial scavenges as well as perm gen allocation would be more adversely affected, causing an increase in start-up. Indeed fixing the logBOT problem did bring up these issues, but fixing the BOT fixup won back the loss in performance.) As we mentioned above in passing, there are further possible improvements, including reducing the block-start calls for ParNew chunking. These are deferred for later (under a different bug id we will be opening for the purpose). (b) 6459113: The problem here was that when we expand the old generation we were not taking care to coalesce the newly obtained space to the end of an existing previously co-terminal free block in the heap. Depending on allocation patterns and lifetimes this could potentially cause avoidable fragmentation, defeat CMS's coalition heuristics that try to maintain a large corpus of free storage in the upper part of the generation, and because of the presence of large free blocks lower in the address space, increase block-start costs for ParNew's card-table chunking. This also fixes 6446077 reported by Shutterfly and independently by a Java forum user as well. Thanks to Mark Susko for interfacing w/customer (General Dynamics), to Poonam Bajaj for help with 5uXX instrumentation for customer and verifying fix using test case from Shuttrefly, and to Stephen "cool-as-a-cucumber" Fitch and Mark Susko for keeping a hot escalation cool, allowing us the luxury of exploring all avenues of solution including several false starts at a constant time block split operation using a non-canonical BOT representation which was not to be. Reviewed by: Jon Masamitsu, Tony Printezis; Poonam Bajaj (partial) Fix Verified: yes Verification Testing: . 6433335: testing by the customer . 6459113: . test case from customer (Cinnober) attached to bug report . also independently tested using Shutterfly's test case (by Poonam) Other testing: +/- CMS +/- VerifyBlockOffsetArray . PRT . refworkload Performance testing: . refworkload (JBB with CMS+ParNew gives a nice 20% ParNew improvement and a 5% bottom-line improvement on Niagara) . Alacrity (no change) Files: update: src/share/vm/memory/blockOffsetTable.cpp update: src/share/vm/memory/blockOffsetTable.hpp update: src/share/vm/memory/compactibleFreeListSpace.cpp update: src/share/vm/memory/compactibleFreeListSpace.hpp Examined files: 3876 Contents Summary: 4 update 3872 no action (unchanged)
13-09-2006
SUGGESTED FIX . make BOT lookup asymptotically logarithmic . speed up splitting operation by avoiding read before write
07-09-2006
WORK AROUND -XX:-UseParNewGC appears to provide a temporary workaround for this customer for avoiding the long parnew pauses. The downside is that single-threaded scavenges are slightly longer than the "good case" when using ParNew. Your milage may vary, but you'll certainly avoid the long pauses you might otherwise see either because of very large objects or very large free blocks in the old generation.
26-06-2006
EVALUATION Instrumentation so far appears to implicate the presence of very large free blocks (but the same could happen with very large objects as well) in the CMS generation affecting adversely the card scanning times for scavenges presumably on account of the extremely frequent and slow block_start() calls.
26-06-2006