JDK-8200347 : CompletableFuture join hangs forever while all dependents are in completed state
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util.concurrent
  • Affected Version: 8u161
  • Priority: P4
  • Status: Closed
  • Resolution: Cannot Reproduce
  • OS: generic
  • CPU: x86_64
  • Submitted: 2018-03-27
  • Updated: 2018-04-16
  • Resolved: 2018-03-28
Related Reports
Duplicate :  
Description
FULL PRODUCT VERSION :
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)

ADDITIONAL OS VERSION INFORMATION :
produced on Microsoft Windows [Version 10.0.16299.309]
also seen on windows 7 and linux

A DESCRIPTION OF THE PROBLEM :
If you create complex dependency tree of completable futures it can sometimes fail to complete all dependents nevertheless all initial completable futures are done. Please use source code as example of such dependency tree. I tried to make it as small as possible but still reproducible in reasonable time.

After playing a while with this example I found out that it hangs in folowing case:
say we have:
CompletableFuture A,B,C,D,E
A->B,C   (B and C are dependents on A)
B->D,E   (D and E are dependents on B)

Methods A.postComplete() and B.cleanStack() can be run in parallel due to complex tree hierarchy. Method A.postComplete at some stage pushes B's dependents into A's stack, while B.cleanStack() tries to clean B's stack. Both methods are modifying 'next' pointers of B's dependents stack. This leads to A's stack can be corrupted and C will be lost from stack and never triggered. 

Possible solutions:
I think we can either remove cleanStack method and check for isLive in postComplete, or wrap nodes from B while pushing into A's stack.




STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
please run attached source code, it could take different number of iterations get blocked.
exmaple 1
...
..104861
..104862
..104863
..104864
..104865
..104866
..104867
..104868
..104869
..104870
..104871
..104872
--------------- and hang forever -----------------

example 2
...
..159161
..159162
..159163
..159164
..159165
..159166
..159167
..159168
..159169
..159170
..159171
..159172
..159173
..159174
---------------- hang ---------------

example 3
...
..277692
..277693
..277694
..277695
..277696
..277697
..277698
..277699
..277700
..277701
..277702
----------------- hang ----------

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
program should end after making all iterations in for loop
ACTUAL -
program is hang after some iterations

REPRODUCIBILITY :
This bug can be reproduced rarely.

---------- BEGIN SOURCE ----------
import java.util.ArrayList;
import java.util.concurrent.*;

public class Main {
    public static void main(String... args) {
        test();
    }

    private static void safeRandomSleep(int i) {
        try {
            Thread.sleep(ThreadLocalRandom.current().nextInt(i));
        } catch (InterruptedException e) {

        }
    }

    private static void test() {
        ExecutorService executor = Executors.newFixedThreadPool(5);

        for (int i = 0; i < 10_000_000; i++) {

            CompletableFuture<String> f1 = CompletableFuture.supplyAsync(() -> {
                safeRandomSleep(10);
                return "S";
            }, executor);

            CompletableFuture<String> f2 = CompletableFuture.supplyAsync(() -> {
                safeRandomSleep(10);
                throw new RuntimeException("123");
            }, executor);

            ArrayList<CompletableFuture<?>> ff = new ArrayList<>();
            ff.add(f1);
            ff.add(f2);

            for (int j = 0; j < 3; j++) {
                ff.add(f1.thenCombineAsync(f2, (s1, s2) -> s1 + s2, executor));
                ff.add(f1.thenApplyAsync((s) -> s, executor));
            }

            CompletableFuture[] ff1 = new CompletableFuture[8];
            ff1[0] = ff.get(4);
            ff1[1] = ff.get(7);
            ff1[2] = ff.get(6);
            ff1[3] = ff.get(3);
            ff1[4] = ff.get(2);
            ff1[5] = ff.get(1);
            ff1[6] = ff.get(0);
            ff1[7] = ff.get(5);

            CompletableFuture<Void> all = CompletableFuture.allOf(ff1);
            all.whenComplete((aVoid, throwable) -> System.out.print("."));

            safeRandomSleep(10);

            CompletableFuture[] ff2 = new CompletableFuture[8];
            ff2[0] = ff.get(6);
            ff2[1] = ff.get(0);
            ff2[2] = ff.get(5);
            ff2[3] = ff.get(7);
            ff2[4] = ff.get(2);
            ff2[5] = ff.get(1);
            ff2[6] = ff.get(3);
            ff2[7] = ff.get(4);

            CompletableFuture<Void> all2 = CompletableFuture.allOf(ff2);
            all2.whenComplete((aVoid, throwable) -> System.out.print("."));

            try {
                CompletableFuture<Void> a = all.thenApplyAsync(s -> s, executor);
                CompletableFuture<Void> b = all2.thenApplyAsync(s -> s, executor);
                a.join();
                b.join();
            } catch (Exception e) {
            }
            System.out.println(i);
        }

    }

}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Don't use complex dependents tree base on CompletableFuture features.


Comments
The test passes in the latest GA version. Closing as cannot reproduce.
28-03-2018

To reproduce the issue, run the attached test case. JDK 8u161- Fail JDK 8u172-ea - Fail JDK -10 +46 - Pass Output on JDK 8u161: .... .... ..12718 ..12719 ..12720 ..12721 ..12722 ..12723 ..12724 program hangs after this.
28-03-2018