JDK-6330863 : vm/gc/InfiniteList.java fails intermittently due to timeout problems
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 5.0u7,6,6u3,6u4,6u4p,7
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic,solaris,solaris_10,windows,windows_2003
  • CPU: generic,x86,sparc
  • Submitted: 2005-09-30
  • Updated: 2012-10-01
  • Resolved: 2012-03-24
The Version table provides details on the release in which this issue/RFE will be or has been addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

JDK 7: 7u4 Fixed
JDK 8: 8 Fixed
Other: hs23 Fixed
Related Reports
Duplicate :  
Duplicate :  
Duplicate :  
Duplicate :  
Duplicate :  
Duplicate :  
Relates :  
Relates :  
Problem Description 	: vm/gc/InfiniteList.java fails intermittently due to timeout problems

Tested_Java_Release   	: 1.6.0
Tested_Build          	: B52
Operating System	: Linux, Solaris X86, Windows
Test cases		: vm/gc/InfiniteList.java
Results Location      	:  http://vmsqe.sfbay/nightly/TL/results/1.6.0-auto-270/ServerVM/64BITLINUX-AMD64/mixed/SERVICE/VMJTREG_REGRESSION-TLNIGHTLY-SERVICE-ServerVM-mixed-64BITLINUX-AMD64-2005-09-28-00-51-23/analysis.html
Error Message		: See Below

result: Failed. Execution failed: Program `/var/tmp/fhsu/Work/JDK/TLNIGHTLY/SERVICE/solaris-sparc/bin/java' interrupted! (timed out?)


EVALUATION http://hg.openjdk.java.net/lambda/lambda/hotspot/rev/23c0eb012d6f

EVALUATION http://hg.openjdk.java.net/hsx/hotspot-comp/hotspot/rev/23c0eb012d6f

EVALUATION http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/23c0eb012d6f

WORK AROUND
1. Add the command-line option -XX:GCHeapFreeLimit=<N>, where <N> is a percentage from 0-100 that defines the maximum percentage of free space allowed in the heap when throwing an OutOfMemoryError because the JVM is spending "too much" time in GC. The recommended value of <N> to work around this bug is in the range 10-40; the exact value depends on the relative sizes of the young and old generations. Note that this simply causes the JVM to throw an OutOfMemoryError earlier than it currently does because the GC overhead is too high; it does not change the way the collector behaves.
2. Reduce the value of PromotedPadding to 1 or 0 (the default value is 3), using the command-line option -XX:PromotedPadding=1. PromotedPadding is used when deciding whether to perform a young GC or a full GC; a smaller value may allow young GCs to be attempted instead of full GCs as the heap becomes more full.
Both of the above are only partial workarounds; they will not solve the problem for all applications.
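The two flag-based workarounds above can be combined on one command line. A sketch of such an invocation; the heap size, limit values, and application class name here are illustrative placeholders, not taken from this report:

```shell
# Hypothetical invocation combining both partial workarounds.
# GCHeapFreeLimit=20 makes the JVM throw OutOfMemoryError sooner when GC
# overhead is high; PromotedPadding=1 makes young GCs more likely to be
# attempted instead of full GCs as the heap fills up.
java -XX:GCHeapFreeLimit=20 -XX:PromotedPadding=1 -Xmx64m YourApp
```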

EVALUATION I think this is at least partially due to the way parallel compaction deals with the young gen. When copying from a young gen space (e.g., eden or from) to the old gen, *all* live objects in the space must fit into the old gen; otherwise, none of them will be copied. This is because parallel compaction operates on 'regions' of the heap, and copying only part of a space would require extra bookkeeping (or some good luck). A scavenge, if it were done, would be able to promote some things into the old gen. Even though scavenge-before-full-gc is enabled, the scavenge code bails out because the estimated number of bytes promoted is greater than the amount of free space in the old gen (in PSScavenge::should_attempt_scavenge()).

I expected the GC overhead limit to kick in, since we are spending nearly all our time in GC and not collecting anything; we should get an OOME because we are spending too much time in GC. Update after a conversation with the implementor of the overhead limit: the overhead limit is not triggered because there is too much free space in the old generation (more than 2%).

We could modify the policy so that when the estimated bytes to promote exceed the free space in the old gen, a scavenge is skipped only if the last scavenge was not skipped for the same reason. In other (simpler) words: don't skip consecutive scavenges.
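The "don't skip consecutive scavenges" policy proposed above can be sketched in isolation. This is a hypothetical illustration, not the actual HotSpot code; the class and method names are invented for the example:

```java
// Hypothetical sketch of the proposed policy: skip a scavenge when the
// estimated promotion volume exceeds old-gen free space, but never skip
// two scavenges in a row for that reason.
public class ScavengePolicySketch {
    private boolean lastSkippedForPromotion = false;

    public boolean shouldAttemptScavenge(long estimatedPromotedBytes,
                                         long oldGenFreeBytes) {
        if (estimatedPromotedBytes > oldGenFreeBytes) {
            if (lastSkippedForPromotion) {
                // The previous scavenge was already skipped for this reason:
                // attempt this one anyway instead of falling back to another
                // (likely fruitless) full GC.
                lastSkippedForPromotion = false;
                return true;
            }
            lastSkippedForPromotion = true;
            return false;
        }
        lastSkippedForPromotion = false;
        return true;
    }

    public static void main(String[] args) {
        ScavengePolicySketch policy = new ScavengePolicySketch();
        System.out.println(policy.shouldAttemptScavenge(100, 1000));  // fits: attempt
        System.out.println(policy.shouldAttemptScavenge(2000, 1000)); // skip once
        System.out.println(policy.shouldAttemptScavenge(2000, 1000)); // no consecutive skip
    }
}
```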