JDK-4239841 : 1.3: OutOfMemoryError thrown before finalizers run
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 1.3.0,1.3.1,1.4.0
  • Priority: P4
  • Status: Closed
  • Resolution: Not an Issue
  • OS: generic,solaris_8
  • CPU: generic,sparc
  • Submitted: 1999-05-20
  • Updated: 2013-11-01
  • Resolved: 2001-09-27
Related Reports
Relates :  
Description

Name: dkC59003			Date: 05/20/99



When executing the test displayed below, HotSpot throws OutOfMemoryError 
due to poor performance of its garbage collector. Indeed, it likely should 
not throw OutOfMemoryError, when a small piece (~128Kb) of memory is requested, 
and there are tons (~40Mb) of unutilized garbage available in the heap. 

The following test tries to perform 50 cycles of allocation of all available 
memory and its deallocation. The hotspot's GC seems to fail to keep apace with 
actual memory operations, and the test usually fails after ~5th-15th cycle of 
memory allocation/deallocation. However, this test passes against classic VM 
(see details below). 

Note, that this test is inproper for compatibility testing; instead, it shows 
weakness of HotSpot's GC in comparision to classic GC:

Classic VM execution is Ok:
    >>>> java -classic memleak001
    # invocations depth: 407
    # invocations depth: 406
    # invocations depth: 407
    # invocations depth: 407
    . . .
    # invocations depth: 406
    # invocations depth: 405
    # invocations depth: 404
    The test has PASSED.

HotSpot execution fails:
    >>>> java memleak001
    # invocations depth: 477
    # invocations depth: 454
    # invocations depth: 23
    . . .
    # invocations depth: 454
    # invocations depth: 23
    # invocations depth: 477
    #
    # OOPS! Failed to allocate ~128Kb of memory while there
    #   are tons (~40Mb) of unutilized garbage in the heap.
    #
    The test has FAILED.

This test passes against JDK 1.0.2, 1.1.6, 1.2, 1.2.2, and 1.3 for Solaris/Sparc.

This test passes against JDK 1.2, 1.2.2, and 1.3 for win32, if executed without JIT, 
or both with and without JIT if executed with the following command-line:
    java -classic memleak001
However, the test fails against JDK 1.2, 1.2.2, and 1.3 for win32 having no HotSpot 
installed, if executed with JIT by the command-line:
    java memleak001

This test passes against JDK 1.1.6 for windows both with JIT and without JIT.
This test fails against JDK 1.0.2 for windows.

All latest versions of HotSpot fail on this test: 
    hotspot1_0rc1_hp-bin-solsparc-mar_12
    hotspot1_0fcs_hp-e-bin-win-apr_17
    hotspot2_0ea1_hp-a-bin-win-may_04
    hotspot_client1_3beta1_compiler1-d-bin-win-may_18

Computers used to execute the test:
1) SUNW Ultra-1, CPU UltraSPARC 167MHz, RAM 128Mb, SunOS 5.5.1
2) Pentium-II 350MHz, RAM 128Mb, Windows NT 4.0 SP3, IE 4.01 SP1

Below is the source of the test: 
    ${TESTBASE_JS}/src/nsk/stress/memory/memleak001/memleak001.java: 

/* %W% %E%
 * Copyright %D% Sun Microsystems, Inc.
 */

import java.io.PrintStream;

/**
 * This test starts recursive procedure allocating a series of large 
 * objects in order to exhaust JVM heap, and deletes all references 
 * to those objects except of references found in invocations stack. 
 *
 * <p>The test tries to provoke anticipatory finalization of those objects
 * by invoking System.runFinalization() when the recursive procedure 
 * stops to go into deeper self-invocations. The test is treated passed,
 * if there is no finalization occurs of the objects just allocated 
 * until that recursive procedure have return. (When that recursive 
 * procedure returns, references to those objects are poped out from 
 * the invocations stack, and their finalization becomes legal.)
 *
 * <p>This cycle of recursive memory allocation/deallocation is repeated 
 * 50 times in order to test quality of the garbage collector. The test 
 * must pass the first iteration of this cycle, but it may fail in later 
 * iterations due to OutOfMemoryError, because garbage collector may do 
 * not collect garbage after the first iteration of the cycle. 
 *
 * <p>Indeed, JVM specification does not demand garbage collector to really 
 * collect garbage, so that there may be no GC implemented at all! However, 
 * I believe that every JVM pretending to be high-quality VM have to avoid 
 * OutOfMemoryError when there are tons of garbage available in its heap. 
 *
 * @author Eugene I. Latkin
 */
public class memleak001 {
    static boolean testOver;	// prohibit finalize()
    static int level;		// shows recursion deeps
    
    static final int AMOUNT = 1 << 17;	// 128Kb per piece
    static final int LIMIT = 1 << 10;	// up to 128Mb total

    byte piece[];   // eat memory piece by piece
    
    memleak001 parent;
    memleak001 child = null;

    /**
     * Reference to <code>parent</code> should be used later 
     * to clean parent's reference to this child instance. 
     */
    public memleak001 (memleak001 parent, int amount) {
	this.parent = parent;
	piece = new byte [amount];
    };

    /**
     * Recursively incarnate a chain of <code>memleak001</code> 
     * instances while <code>limit</code>&gt;1 and there is enough 
     * memory to allocate new instances. Delete all references to 
     * the allocated objects except of references kept in the 
     * invocations stack, and try to provoke anticipatory
     * finalization of instances just incarnated. 
     */
    void chain (int limit) {
	memleak001.level++;
    
	if (parent != null)
	    parent.child = null;
	parent = null;
	
	try {
	    if (limit > 1) {
		int amount = piece.length;
		child = new memleak001(this,amount);
	    };
	} catch (OutOfMemoryError oome) {
	};

	if (child != null)	
	    try {
		child.chain(limit - 1);
	    } catch (StackOverflowError error) {
	    }
	else {
	    System.runFinalization();
	    testOver = true;
	};
    };
    
    /**
     * Reveal unexpected finalization.
     */
    public void finalize () {
	if (!testOver) {
	    testOver = true; // error is revealed
	    throw new MemLeak001Error(
		"Unexpected finalization of memleak001 instance, level=" 
		+ level);
	};
    };
    
    /**
     * May be thrown only by <code>finalize()</code> method.
     */
    static class MemLeak001Error extends Error {
	MemLeak001Error (String purpose) {
	    super(purpose);
	};
    };

    /**
     * Try to allocate a number of pieces of memory, so that 
     * the only reference to each of those pieces should be found 
     * in invocations stack. Assume the test is failed, if finalizer 
     * tries to utilize those pieces before invocations stack have 
     * poped-out references to them. 
     */    
    static int doRun (String args[], PrintStream out) {
	int amount = AMOUNT;
	if (args.length >= 1)
	    amount = Integer.parseInt(args[0]);
	int limit = LIMIT;
	if (args.length >= 2)
	    limit = Integer.parseInt(args[1]);

	level = 0;
	testOver = false;
	
	try {
	    new memleak001(null,amount).chain(limit);

	} catch (OutOfMemoryError oome) {
	    out.println("#");
	    out.println("# OOPS! Failed to allocate ~128Kb of memory while there");
	    out.println("#   are tons (~40Mb) of unutilized garbage in the heap.");
	    out.println("#");
	    return 2;

	} catch (MemLeak001Error ml1e) {
	    out.println("#");
	    out.println("# " + ml1e);
	    out.println("#");
	    return 2;
	};

	if (level != 1)
	    out.println("# invocations depth: " + level);

	return 0;
    };

    /**
     * Entry for JavaTest wrapper: return 0 if the test has passed,
     * and return 2 if the test has failed.
     */
    public static int run (String args[], PrintStream out) {
	int exitCode=0;

	for (int i=0; i<50; i++) {
	    exitCode = doRun(args,System.out);
	    if (exitCode != 0)
		break;
	    System.gc();
	};

	if (exitCode == 0)
	    System.out.println("The test has PASSED.");
	else
	    System.out.println("The test has FAILED.");

	return exitCode;
    };

    /**
     * Exit code is 95 if the test passed, and 97 if the test 
     * have failed (like JCK lang/vm tests).
     */
    public static void main (String args[]) {
	int exitCode = run(args,System.out);
	System.exit(exitCode + 95);
    };
}

======================================================================

Name: elR10090			Date: 03/22/2001



Alexey Gibadullin, ###@###.###


This bug also affects test from testbase_nsk:

    nsk/coverage/exceptionhandlertable/exceptionhandlertable001
    
being executed against 64-bit Merlin-b56 on Solaris Sparc in all modes.
    
======================================================================

Name: elR10090			Date: 09/26/2001


Eugene I. Latkin <###@###.###> 

The "official" name of the test memleak001 is: 
    nsk/stress/memory/memleak001

I added this info here to help the script, which every day 
generates lists of known bugs for each of the testbase_nsk 
tests. 

Unfortunately, that script cannot recognize the test name 
embraced with suffix and/or prefix as:
    ${TESTBASE_JS}/src/nsk/stress/memory/memleak001/memleak001.java

======================================================================

Comments
EVALUATION Steffen: This needs to get evaluated if it is a 1.1 or 2.0 bug. If 2.0, then lower priority. rene.schmidt@eng 1999-05-20 ---------------------------------------------------------------- This is not a P1 bug. The JLS does not specify when OutOfMemoryErrors are thrown, and there is no guarantee of prompt finalization. A generational collector is likely to finalize objects slower than a non-generational collector, since dead objects may stay around longer. We should look closer into this to see if we can mimic the classic VMs behavior more closely (for compatibility reasons), although this in unlikely to affect real user programs. steffen.grarup@eng 1999-06-25 ---------------------------------------------------------------- I just tried running the test again on HotSpot 1.0.1 and 1.3 client. It passed with both hotspot and classic, but failed for both of them with -verbose:gc was turned on. I suspect the reason is related to thread priorities (finalizers are run in separate thread). To be further investigated, but not a serious problem. steffen.grarup@eng 1999-08-03 ---------------------------------------------------------------- It seems really dubious to run the JVM totally out of memory and then expect something like the finalizer to run properly. The finalizer is just more Java code. so if you are out of memory, it is out of memory, too. I made the following change to a copy of memleak001.java void chain (int limit) { memleak001.level++; if (parent != null) parent.child = null; parent = null; + if (Runtime.getRuntime().freeMemory() < + (12*Runtime.getRuntime().totalMemory())/100) { + testOver = true; + return; + } try { if (limit > 1) { int amount = piece.length; child = new memleak001(this,amount); }; } catch (OutOfMemoryError oome) { }; to bail out before we run the memory system totally out, and that seems to run fine with JDK-1.4.0-beta_refresh-b68. The fraction seems pretty high, but remember that all those objects have to survive 2 collections before they are free memory again. (One to discover that the finalizer needs to run, and one after the finalizer runs to collect the storage. Looked at that way, 12% doesn't seem like a lot.) When run with -verbose:gc it shows an interesting periodicity: [GC 1501K->1413K(3520K), 0.0223238 secs] [GC 2310K->2310K(4464K), 0.0202821 secs] [GC 3847K->3847K(5968K), 0.0185366 secs] [GC 6408K->6408K(8528K), 0.0224432 secs] [GC 10634K->10634K(12800K), 0.0202142 secs] # invocations depth: 136 [Full GC 17551K->17551K(19904K), 0.0228133 secs] # invocations depth: 90 [Full GC 29078K->12939K(32904K), 0.0653611 secs] # invocations depth: 126 [Full GC 29077K->16270K(32904K), 0.0932736 secs] # invocations depth: 100 ... # invocations depth: 126 [Full GC 29078K->16270K(32904K), 0.0843049 secs] # invocations depth: 100 [Full GC 29078K->12940K(32904K), 0.0712777 secs] # invocations depth: 126 [Full GC 29078K->16270K(32904K), 0.0846060 secs] # invocations depth: 100 [Full GC 29078K->12940K(32904K), 0.0708868 secs] # invocations depth: 126 [Full GC 29078K->16270K(32904K), 0.0874868 secs] # invocations depth: 100 [Full GC 29078K->12940K(32904K), 0.0728164 secs] The test has PASSED. as (I think) storage moves from being referenced from the stack to being referenced from the finalizer queue, and then collected, leaving that storage available for the next pass. I've seen various other pairs of numbers (342, 112; 332, 122; 264, 190, 417, 37): there's a pattern there) What I don't understand is why if I lower the threshold to 11%, I get[GC 1501K->1413K(3520K), 0.0224108 secs] [GC 2310K->2310K(4464K), 0.0173783 secs] [GC 3847K->3847K(5968K), 0.0215580 secs] [GC 6408K->6408K(8528K), 0.0244110 secs] [GC 10634K->10634K(12800K), 0.0223132 secs] [GC 17678K->17678K(19904K), 0.0238568 secs] [GC 29461K->29461K(33120K), 0.0271127 secs] [GC 49055K->49055K(55184K), 0.0307242 secs] # invocations depth: 454 [Full GC 58278K->58277K(65344K), 0.0251140 secs] [GC 58277K->58277K(65344K), 0.0235173 secs] [Full GC 58277K->58277K(65344K), 0.0245315 secs] # # OOPS! Failed to allocate ~128Kb of memory while there # are tons (~40Mb) of unutilized garbage in the heap. # The test has FAILED. where it expands the heap out to nearly the full size (64MB) and then collection doesn't make any progress. Compare that to the 12% run, which bottoms out (throws OutOfMemoryException) at 17MB and then has enough room to continue to work (with twice as much memory for the live objects and the finalizable ones?). And sometimes it runs just fine if I have -XX:+PrintGCDetails or -verbose:gc on, and sometimes it fails with that option. It looks like a thread scheduling problem of the mutator versus the finalizer thread. The test has a bug in it too: the finalize() method throws an exception if the test isn't over, but the test isn't marked as over until well after the finalizer might get run (on OutOfMemory), or well after the test is marked as not over because we have started the next round. Oh, and that will set testOver to true even if we are in the middle of the next round but the finalize() method got delayed. Further, the exception is thrown from the finalize() to the finalizer thread's run() method, not back to the point where you are trying to catch it. That's broken. So, this is an okay stress test, but I'm not sure what it's trying to show. peter.kessler@Eng 2001-06-21 ---------------------------------------------------------------- Name: elR10090 Date: 06/22/2001 ================================================================== Eugene I. Latkin <###@###.###>, 2001-jun-23 Peter, I'm the test author; here is my clarification about what the test is trying to show. When I written a series of memory-stress tests, my task (stipulated by Satya Dodda) was to catch regression in VM quality, i.e: to reveal situations when Classic VM passes but HotSpot fails. This test just shows such bug: JDK 1.2.2 passes the test, but JDK 1.3 and 1.4 fail: >>>> /home/latkin/jdk1.2.2/bin/java -version java version "1.2.2" Classic VM (build JDK-1.2.2-W, green threads, sunwjit) >>>> time /home/latkin/jdk1.2.2/bin/java memleak001 ; echo $status # invocations depth: 407 # invocations depth: 408 # invocations depth: 404 . . . # invocations depth: 406 # invocations depth: 407 # invocations depth: 404 The test has PASSED. 41.0u 4.0s 0:46 96% 0+0k 0+0io 0pf+0w 95 >>>> time /home/latkin/jdk1.3/bin/java -showversion memleak001 ; echo $status java version "1.3.0" Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0) Java HotSpot(TM) Client VM (build 1.3.0, mixed mode) # invocations depth: 508 # # OOPS! Failed to allocate ~128Kb of memory while there # are tons (~40Mb) of unutilized garbage in the heap. # The test has FAILED. 1.0u 1.0s 0:06 33% 0+0k 0+0io 0pf+0w 97 >>>> time /home/latkin/jdk1.4/bin/java -showversion memleak001 ; echo $status java version "1.4.0-beta_refresh" Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-beta_refresh-b67) Java HotSpot(TM) Client VM (build 1.4.0-beta_refresh-b67, mixed mode) # invocations depth: 454 # # OOPS! Failed to allocate ~128Kb of memory while there # are tons (~40Mb) of unutilized garbage in the heap. # The test has FAILED. 2.0u 0.0s 0:06 32% 0+0k 0+0io 0pf+0w 97 Please, let me know if you need (or want) further details. Thanks. Eugene Latkin ================================================================== I understand this test program much better now, and I think I know what's going on. It's still the case that the test isn't testing what it means to test: that calling System.gc() includes running all the finalize() methods and freeing up the space. If Classic VM (build JDK-1.2.2-W, green threads, sunwjit) used to do that, it is welcome to, but doing that has other problems: mostly latency and responsiveness but also a false sense of the semantics of System.gc(). I modified the memleak001.java file to gather some runtime statistics on the state of memory at various interesting points in the program. My modified program is attached, as memleakpbk4.java. If you want to modify the file, be careful not to do any allocations (or println's or anything that will cause allocations). Also you can't upset the timing too much, because that's crucial, too. Here's what I think is happening: in one pass of doRun, we run the VM out of memory, catch OutOfMemoryError in chain() and return happily. Then run() calls System.gc(). That discovers a lot of finalizable objects, and queues them to have their finalize() methods run. Then we are back around the loop in run() for the next call to doRun() to run the VM out of memory. How much "free" memory the VM has for that next call of doRun() depends on how fast the finalizer queue calls the finalize() method, and how fast the finalize() method is. Internally, when the VM can't trivially satisfy an allocation, it will force a garbage collection, which will recover any objects whose finalize() methods have been called (and returned). Thus we do many collections and usually recover all the space hanging on the finalizer queue in time for the next pass of doRun(). But if the finalizer process gets behind for some reason, or the doRun() process gets ahead for some reason, then the System.gc() at the bottom of the loop will have transferred all of the memory to the finalizer queue, so when we get to allocating the object in doRun(), there isn't any free memory, so we throw an OutOfMemoryError and the test thinks it has failed. The typical failing output of my version is % java memleakpbk4 # invocations depth: 507 doRun_before oome: false free: 3333152 pending: 0 memleakpbk4 oome: true free: 280496 pending: 508 chain oome: true free: 280496 pending: 508 doRun_after oome: false free: 279328 pending: 508 # invocations depth: 507 doRun_before oome: false free: 280480 pending: 508 memleakpbk4 oome: true free: 280480 pending: 508 chain oome: true free: 280480 pending: 508 doRun_after oome: false free: 280000 pending: 508 # invocations depth: 507 (1) doRun_before oome: false free: 280480 pending: 508 (2) memleakpbk4 oome: true free: 280480 pending: 508 (3) chain oome: true free: 280480 pending: 508 (4) doRun_after oome: false free: 280000 pending: 508 # invocations depth: 507 doRun_before oome: false free: 280480 pending: 508 memleakpbk4 oome: true free: 280480 pending: 508 chain oome: true free: 280480 pending: 508 doRun_after oome: false free: 280000 pending: 508 (5) # invocations depth: 508 doRun_before oome: false free: 280480 pending: 508 (6) memleakpbk4 oome: true free: 149336 pending: 509 chain oome: true free: 149336 pending: 509 (7) doRun_after oome: false free: 148856 pending: 509 doRun catches OutOfMemoryErrorab # # failed when (8) doRun_before oome: false free: 149336 pending: 509 memleakpbk4 oome: true free: 149336 pending: 509 chain oome: false free: 149336 pending: 509 doRun_after oome: true free: 149272 pending: 64 Try sleeping ... after sleeping oome: false free: 144752 pending: 1 Call System.gc() and sleep ... after gc oome: false free: 66770376 pending: 0 The test has FAILED. in which you can see at line (1) that we enter the top of doRun() with 508 objects pending on the finalizer queue (they've been allocated but haven't had their finalize() method called). Then everything runs in a balanced enough way so that by the time memleakpbk4 runs out of memory we have run the finalizers and collected the previous run's objects, but now we have 508 more objects that have been allocated but not finalized (because we still have references to them!). So chain() gets an OutOfMemoryError and everything is fine according to the test. But then at line (5) you can see that because of alignment or promotion or something, we've found room for 508 rather than the usual 507 depth. That means that at line (6), we are heading out of doRun() with 509 finalizable objects, and maybe more importantly, at line (7) we are exiting doRun() with less free space than we "usually" have. So we head into doRun() at line (8) with not enough free space to allocate the first object. All the space is on the finalization queue, but the finalizer doesn't get a chance to run before we declare ourselves unable to allocate that object, so doRun() catches the OutOfMemoryError and the test thinks it has failed. (A problem with my output is that though it says, e.g., "free: 148856", that doesn't mean that there are 148856 contiguous bytes available, because the free space is likely in smaller chunks in the various generations of the collector.) I'm running on an Ultra60-450x2, that is, with 2 CPUs. Even though the finalizer thread should be at a higher priority than the main Java thread, the main Java thread will continue to run (that is, continue to request storage possibly generating errors) while the finalizer thread is running. The test runs the VM out of memory on one pass and is depending on the finalizer thread to have gotten through at least some of the finalizable objects before the beginning of the next pass. That's not guaranteed. ###@###.### 2001-09-27
27-09-2001