JDK-8141596 : java/util/concurrent/ScheduledThreadPoolExecutor/GCRetention.java starts failing intermittently
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.concurrent
  • Affected Version: 9
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2015-11-06
  • Updated: 2019-06-07
  • Resolved: 2017-03-04
Fix Versions
JDK 10: 10 (Fixed)
JDK 9: 9 b160 (Fixed)
Related Reports
Duplicate :  
Duplicate :  
Relates :  
Description
java/util/concurrent/ScheduledThreadPoolExecutor/GCRetention.java has recently started failing intermittently (after JDK-8134853), though very rarely.

----------System.out:(3/28)----------

Passed = 0, failed = 1

----------System.err:(23/1323)----------
java.lang.Error: references to 1/100 tasks retained ("leaked")
	at GCRetention.test(GCRetention.java:116)
	at GCRetention.instanceMain(GCRetention.java:133)
	at GCRetention.main(GCRetention.java:131)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:520)
	at com.sun.javatest.regtest.agent.MainActionHelper$SameVMRunnable.run(MainActionHelper.java:218)
	at java.lang.Thread.run(Thread.java:747)
java.lang.AssertionError: Some tests failed
	at GCRetention.instanceMain(GCRetention.java:135)
	at GCRetention.main(GCRetention.java:131)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:520)
	at com.sun.javatest.regtest.agent.MainActionHelper$SameVMRunnable.run(MainActionHelper.java:218)
	at java.lang.Thread.run(Thread.java:747)



Comments
The rampdown 1 page for JDK 9 (http://openjdk.java.net/projects/jdk9/rdp-1) states "P4-P5 bugs should, in general, be left to future releases unless they only affect documentation or tests, in which case they should be identified as such with the noreg-doc, noreg-demo, or noreg-self labels, respectively." As this change only affects tests and we prefer the tests to pass reliably, IMO this fix is in-bounds for this point in JDK 9.
24-02-2017

I believe that test stabilization bug fixes should be fine to be pushed to jdk9/dev.
24-02-2017

As a P4, this can't go into 9 now that we are in RDP1 (and soon RDP2!).
22-02-2017

http://cr.openjdk.java.net/~martin/webrevs/openjdk9/jsr166-jdk9-integration/GCRetention/
21-02-2017

Here's yet another attempt at solving the GC retention testing problem. Using a reference queue gives us something to wait for without having to poll with sleeps. We can avoid having possible retention from locals by moving new statements into the task constructor. I have no proof, but I'm going to optimistically claim this fixes the bug.

    void removeAll(ReferenceQueue<?> q, int n) throws InterruptedException {
        for (int j = n; j--> 0; ) {
            if (q.poll() == null) {
                for (;;) {
                    System.gc();
                    if (q.remove(1000) != null)
                        break;
                    System.out.printf(
                        "%d/%d unqueued references remaining%n", j, n);
                }
            }
        }
        check(q.poll() == null);
    }

    void test(String[] args) throws Throwable {
        final CustomPool pool = new CustomPool(10);
        final int size = 100;
        final ReferenceQueue<Object> q = new ReferenceQueue<>();
        final List<WeakReference<?>> refs = new ArrayList<>(size);
        final List<Future<?>> futures = new ArrayList<>(size);

        // Schedule custom tasks with strong references.
        class Task implements Runnable {
            final Object x;
            Task() {
                refs.add(new WeakReference<>(x = new Object(), q));
            }
            public void run() { System.out.println(x); }
        }
        // Give tasks added later earlier expiration, to ensure
        // multiple residents of queue head.
        for (int i = size; i--> 0; )
            futures.add(pool.schedule(new Task(), i + 1, TimeUnit.MINUTES));
        futures.forEach(future -> future.cancel(false));
        futures.clear();

        pool.purge();
        removeAll(q, size);
        for (WeakReference<?> ref : refs)
            check(ref.get() == null);

        pool.shutdown();
        // rely on test harness to handle timeout
        pool.awaitTermination(1L, TimeUnit.DAYS);
    }
21-02-2017
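
The reference-queue idea from the comment above can be tried in isolation. The following is only a minimal sketch, not the test's code: the class name ReferenceQueueWait and both helper methods are hypothetical, and it assumes that System.gc() actually triggers a collection, which the JVM does not guarantee. The referent is allocated inside a helper so that no local variable in the caller can retain it.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

public class ReferenceQueueWait {
    // Hypothetical helper: allocate the referent here so the caller
    // never holds a strong reference to it in a local variable.
    static WeakReference<Object> newWeakRef(ReferenceQueue<Object> q) {
        return new WeakReference<>(new Object(), q);
    }

    // Request a GC, then block on the queue (instead of sleeping) until a
    // cleared reference is enqueued or the attempts are used up.
    static boolean awaitEnqueued(ReferenceQueue<?> q, int attempts, long timeoutMillis)
            throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            System.gc();
            if (q.remove(timeoutMillis) != null)
                return true;
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> q = new ReferenceQueue<>();
        // Keep the WeakReference itself reachable, or it is never enqueued.
        WeakReference<Object> ref = newWeakRef(q);
        boolean enqueued = awaitEnqueued(q, 10, 1000);
        System.out.println("enqueued=" + enqueued
                           + ", cleared=" + (ref.get() == null));
    }
}
```

Blocking in q.remove(timeout) returns as soon as the reference is enqueued, so the wait is bounded by the timeout but usually much shorter than a fixed sleep.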

That might help with this problem: dropping the "final Object x" may ensure the strong reference to x is not hoisted out of the loop, where it could remain live longer than intended. The explicit nulling of x in the loop may also help on its own. Of course, I'm still speculating about the failure mode. As I can't reproduce it, I can't investigate it.
21-02-2017

Another attempt that is unlikely to help, but is good hygiene for this sort of test.

--- GCRetention.java 20 Feb 2017 23:26:14 -0000 1.5
+++ GCRetention.java 21 Feb 2017 00:28:36 -0000 1.7
@@ -61,17 +61,21 @@
         final int size = 100;
         WeakReference<?>[] refs = new WeakReference<?>[size];
         Future<?>[] futures = new Future<?>[size];
+        class UseX implements Runnable {
+            final Object x;
+            UseX(Object x) { this.x = x; }
+            public void run() { System.out.println(x); }
+        }
         for (int i = 0; i < size; i++) {
-            final Object x = new Object();
+            Object x = new Object();
             refs[i] = new WeakReference<Object>(x);
-            // Create a Runnable with a strong ref to x.
-            Runnable r = new Runnable() {
-                    public void run() { System.out.println(x); }
-                };
+            // Schedule a custom task with a strong reference to r.
             // Later tasks have earlier expiration, to ensure multiple
             // residents of queue head.
-            futures[i] = pool.schedule(r, size*2-i, TimeUnit.MINUTES);
+            futures[i] = pool.schedule(new UseX(x),
+                                       size*2-i, TimeUnit.MINUTES);
+            x = null; // inhibit strong ref on the stack
         }
         Thread.sleep(10);
         for (int i = 0; i < size; i++) {
21-02-2017

I'm looking at GCRetention.java. Next revision will have the below, which will let jtreg catch pool termination timeouts as done elsewhere. (But probably won't help this problem)

--- ScheduledThreadPoolExecutor/GCRetention.java 27 Feb 2016 21:15:57 -0000 1.4
+++ ScheduledThreadPoolExecutor/GCRetention.java 20 Feb 2017 23:21:08 -0000
@@ -7,11 +7,8 @@
 /*
  * @test
  * @summary Ensure that waiting pool threads don't retain refs to tasks.
- * @library /lib/testlibrary/
  */
 
-import static java.util.concurrent.TimeUnit.MILLISECONDS;
-
 import java.lang.ref.WeakReference;
 import java.util.concurrent.Delayed;
 import java.util.concurrent.ExecutionException;
@@ -20,10 +17,8 @@
 import java.util.concurrent.ScheduledThreadPoolExecutor;
 import java.util.concurrent.TimeUnit;
 import java.util.concurrent.TimeoutException;
-import jdk.testlibrary.Utils;
 
 public class GCRetention {
-    static final long LONG_DELAY_MS = Utils.adjustTimeout(10_000);
 
     /**
      * A custom thread pool with a custom RunnableScheduledFuture, for the
@@ -95,7 +90,7 @@
             Thread.sleep(10);
         }
         pool.shutdown();
-        pool.awaitTermination(LONG_DELAY_MS, MILLISECONDS);
+        pool.awaitTermination(1L, TimeUnit.DAYS);
         if (cleared < size)
             throw new Error(String.format
                             ("references to %d/%d tasks retained (\"leaked\")",
20-02-2017
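
The shutdown pattern adopted in the diff above can be sketched on its own. This is an illustration under stated assumptions rather than the test's code: AwaitTerminationSketch and shutdownAndAwait are hypothetical names, and the very long timeout only makes sense when an outer harness such as jtreg enforces its own time limit.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class AwaitTerminationSketch {
    // Shut the pool down and wait effectively without a local deadline.
    // A hung pool is then reported by the surrounding harness timeout
    // (an assumption of this sketch), not by a tuned local constant.
    static boolean shutdownAndAwait(ScheduledThreadPoolExecutor pool)
            throws InterruptedException {
        pool.shutdown();
        // Returns true once the pool has terminated.
        return pool.awaitTermination(1L, TimeUnit.DAYS);
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1);
        pool.execute(() -> { });   // trivial task, finishes immediately
        System.out.println("terminated=" + shutdownAndAwait(pool));
    }
}
```

Dropping the scaled LONG_DELAY_MS constant also removes the dependency on the jdk.testlibrary Utils class, as the diff shows.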

I've been unable to reproduce the failure on linux x64.
20-02-2017

Suggestion has been made to mark the test: @key intermittent
19-02-2017

Hi Martin,

The failure has been seen on Windows and Linux. There don't seem to be any special execution flags being used. Failure is always the same:

java.lang.Error: references to 1/100 tasks retained ("leaked")

I would have to suspect that the reference to x in the creation loop:

    for (int i = 0; i < size; i++) {
        final Object x = new Object();
        refs[i] = new WeakReference<Object>(x);

can live longer than would appear from the code structure. I suspect the last Object created is still strongly reachable in some cases - and we may be talking 1 in 1000 runs. I don't know what exactly would impact the behaviour here.
19-02-2017
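
The suspicion in the comment above is that a local variable can keep an object strongly reachable. The sketch below illustrates only the general mechanism and does not reproduce the intermittent failure. LocalRetentionSketch and its methods are hypothetical names, Reference.reachabilityFence requires JDK 9 or later, and the second method assumes System.gc() performs a collection, which is not guaranteed.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

public class LocalRetentionSketch {
    // While a local still refers to the object, the weak reference
    // cannot be cleared, however many collections run.
    static boolean heldWhileLocalIsLive() {
        Object x = new Object();
        WeakReference<Object> ref = new WeakReference<>(x);
        System.gc();
        boolean stillThere = ref.get() != null;
        // Keeps x strongly reachable at least up to this point.
        Reference.reachabilityFence(x);
        return stillThere;
    }

    // With no local ever holding the referent, a collection is free
    // to clear the weak reference.
    static boolean clearedWithoutLocal(int attempts) {
        WeakReference<Object> ref = new WeakReference<>(new Object());
        for (int i = 0; i < attempts && ref.get() != null; i++)
            System.gc();
        return ref.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("held while local is live: " + heldWhileLocalIsLive());
        System.out.println("cleared without local:    " + clearedWithoutLocal(20));
    }
}
```

In the failing test the open question is the reverse case: whether a local that looks dead in the source can still be treated as live by the VM at the time of the collection.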

Thanks, David. Given that I can't investigate, should someone else be Assignee?
19-02-2017

Test failures are being seen again - hence this was reopened. Unfortunately the comments describing the failures are Confidential.
19-02-2017

Why was this bug reopened without comment?
19-02-2017

This should have been resolved in JDK-8142441 (b96) and JDK-8150523 (b109), closing as dup. There's no other open issue and no failure reported. In same binaries run, the last time we saw this test failure was in b91.
26-05-2016