JDK-8177443 : exception thrown from Arrays.sort comparator might lose elements
  • Type: Enhancement
  • Component: core-libs
  • Sub-Component: java.util:collections
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2017-03-20
  • Updated: 2020-06-03
Description
FULL PRODUCT VERSION :
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)


ADDITIONAL OS VERSION INFORMATION :
Ubuntu 16.04 x64 - Linux james-ubuntu 4.4.0-66-generic #87-Ubuntu SMP Fri Mar 3 15:29:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Microsoft Windows [Version 10.0.14393]


A DESCRIPTION OF THE PROBLEM :
After calling Collections.sort the collection may contain different elements. This occurs if the Comparator used throws an exception while TimSort is performing a merge (an example stack is output by the example code).

This cause the merge to fail and results in elements in the collection being duplicated while others are lost. The length of the collection is correctly maintained.

This was found while investigating a test failure in Jython. More details can be found on the ticket http://bugs.jython.org/issue2399

REGRESSION.  Last worked in version 7u80

ADDITIONAL REGRESSION INFORMATION: 
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)


STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the example code provided

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The example code should produce the following output:

After sort distinct elements: 128

ACTUAL -
The example code produces the following output:

After sort distinct elements: 126

Indicating elements have been lost while others are duplicated.

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;

public class TestSort {

    private static class Complains {
        public final int i;
        
        public Complains(int i) {
            this.i = i;
        }

        @Override
        public String toString() {
            return "Complains(" + i + ")";
        }
    }

    private static class ComplainsComparator implements Comparator<Complains> {
        @Override
        public int compare(Complains o1, Complains o2) {
            // Values chosen to fail when TimSort is in the mergeLo function
            if (o1.i == 30 && o2.i == 43) {
                System.out.println("Throwing exception");
                throw new RuntimeException();
            }
            return o1.i < o2.i ? -1 : 1;
        }
    }

    public static void main(String[] args) {
        List<Complains> complainsList= new ArrayList<>(128);

        int[] nums = new int[]{50, 84, 104, 51, 75, 15, 66, 43, 17, 70, 74, 60, 113, 91, 71, 56, 76, 24, 123, 30, 28, 69, 96, 58, 109, 36, 32, 57, 79, 46, 67, 99, 72, 93, 101, 63, 126, 48, 65, 10, 78, 12, 119, 115, 53, 11, 33, 87, 114, 118, 44, 3, 100, 19, 27, 122, 121, 39, 25, 22, 2, 54, 16, 52, 124, 61, 13, 88, 0, 47, 40, 116, 98, 81, 6, 102, 35, 31, 73, 105, 77, 5, 23, 20, 7, 117, 107, 106, 42, 1, 21, 29, 8, 82, 95, 111, 14, 26, 85, 108, 97, 45, 80, 38, 64, 41, 9, 59, 62, 127, 110, 125, 90, 92, 120, 34, 83, 4, 49, 103, 68, 94, 55, 112, 18, 37, 86, 89};

        for (int i : nums) {
            complainsList.add(new Complains(i));
        }

        System.out.println("Before sort:" + complainsList);
        System.out.println("Before sort list size: " + complainsList.size());
        System.out.println("Before sort distinct elements: " + new HashSet<>(complainsList).size());

        try {
            System.out.println("Do sort");
            Collections.sort(complainsList, new ComplainsComparator());
        }
        catch (Exception e) {
            e.printStackTrace();
        }

        System.out.println("After sort" + complainsList);
        System.out.println("After sort list size: " + complainsList.size());
        System.out.println("After sort distinct elements: " + new HashSet<>(complainsList).size());
    }
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
You can disable TimSort by setting the system property -Djava.util.Arrays.useLegacyMergeSort=true


Comments
A somewhat related issue is whether somebody inspecting the contents of the array (or list), such as the comparator, has any reasonable expectation that the elements in the array or list are the same ones that were originally in the array or list. The answer is NO, because elements are copied out to temporary arrays and then merged back in, resulting in temporary missing or duplicate elements, and this is visible from the comparator. See for example: https://stackoverflow.com/questions/62154855/comparison-method-violates-its-general-contract-elements-duplicated-in-collect It might be reasonable to document this limitation (as well as the one above) in the implementation notes for List.sort().
03-06-2020

This isn't really a regression, since it concerns the system's behavior under conditions that shouldn't occur, namely a comparator throwing an exception. It's really more of a quality-of-service issue in dealing with an error condition. It's never been specified that the contents of an array (or list) would be preserved when a comparator throws an exception. Arrays.sort() has always had the possibility that it could drop elements if the comparator it's provided throws an exception. The program below demonstrates this; it will drop elements 2-4 times out of every ten runs. This is true of both TimSort and the "legacy merge sort". Thus setting the property java.util.Arrays.useLegacyMergeSort isn't a workaround for this. Collections.sort() would leave the list's contents unchanged if its comparator throws, but this was mostly a coincidence of implementation. Prior to 8u20, it would copy the list's contents to a temporary array, sort that array, and then copy it back. In 8u20, the implementation was optimized so that ArrayList would sort in-place. This is a performance improvement, but Collections.sort() or List.sort() on an ArrayList now can potentially drop elements in this case where they wouldn't have before. It might be reasonable for Jython to copy the list into a temp array before sorting, since that's what Collections.sort() used to do. Presumably the old performance was acceptable, although it is a bit disappointing to leave the in-place optimization for ArrayList on the table. This might be a reasonable tradeoff, though, if preserving the array's contents is necessary. It might be possible to restore the "missing" elements in certain circumstances. I'm thus changing this to an RFE. ========== // code style suitable for compiling with Java 6 public class SortComparatorException { static final int N = 100000; static final Random random = new Random(); static final Comparator<Integer> cmp = new Comparator<Integer>() { public int compare(Integer i1, Integer i2) { if (random.nextInt(1000) == 0) { throw new IllegalArgumentException(); } else { return i1 < i2 ? -1 : i1 > i2 ? 1 : 0; } } }; static int doSort() { Integer[] array = new Integer[N]; for (int i = 0; i < N; i++) array[i] = i; Collections.shuffle(Arrays.asList(array), random); try { Arrays.sort(array, cmp); } catch (IllegalArgumentException e) { } return new HashSet<Integer>(Arrays.asList(array)).size(); } public static void main(String[] args) { System.setProperty("java.util.Arrays.useLegacyMergeSort", "true"); for (int i = 0; i < 10; i++) System.out.println(doSort()); } }
25-03-2017

To reproduce the issue, run the attached test case. Following are the results : JDK 7u80 - Pass JDK 8u11 - Pass JDK 8u20 - Fail JDK 8u121 - Fail JDK 9-ea + 161 - Fail Following is the extract of output on failed versions : Before sort list size: 128 Before sort distinct elements: 128 Do sort Throwing exception After sort list size: 128 After sort distinct elements: 126
23-03-2017