JDK-8072909 : TimSort fails with ArrayIndexOutOfBoundsException on worst case long arrays
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util
  • Affected Version: 7u76,8,9
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • CPU: x86
  • Submitted: 2015-02-06
  • Updated: 2015-09-29
  • Resolved: 2015-02-12
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 7 JDK 8 JDK 9
7u85Fixed 8u60Fixed 9 b51Fixed
Related Reports
Blocks :  
Relates :  
Relates :  
Relates :  
Description
FULL PRODUCT VERSION :
java version "1.7.0_75"
OpenJDK Runtime Environment (IcedTea 2.5.4) (7u75-2.5.4-1~precise1)
OpenJDK 64-Bit Server VM (build 24.75-b04, mixed mode)


ADDITIONAL OS VERSION INFORMATION :
Every OS, since this is portable Java code.

A DESCRIPTION OF THE PROBLEM :
As previously reported in bug 8011944, the function mergeCollapse in the TimSort class does not preserve the invariant runLen[i] > runLen[i+1]+runLen[i+2].  This leads to an ArrayIndexOutOfBoundsException in TimSort's pushRun method. We refer to 8011944 for more details.

The bug was marked as fixed, after an update in which the allocated length of runLen, as declared in the constructor, was increased. However, we found that the fix does not suffice, see the attached test case. 
In particular, the declared length 40 of runLen which is chosen if the length of the input array is >= 119151, does not suffice.

We provide a full analysis of the bug in a paper, which is available at:
  http://envisage-project.eu/?page_id=1412

REGRESSION.  Last worked in version 6u43

ADDITIONAL REGRESSION INFORMATION: 
Last worked  in version 6u31, according to bug report 8011944

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the attached test case as follows:
  java TestTimSort 67108864              (<=changed from paper to correct value)
or
  java -Xmx8G TestTimSort 1073741824 

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
No Exception occurs during sorting.
ACTUAL -
ArrayIndexOutOfBoundsException

ERROR MESSAGES/STACK TRACES THAT OCCUR :
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 40
	at java.util.TimSort.pushRun(TimSort.java:386)
	at java.util.TimSort.sort(TimSort.java:213)
	at java.util.TimSort.sort(TimSort.java:173)
	at java.util.Arrays.sort(Arrays.java:659)
	at TestTimSort.main(TestTimSort.java:18)


REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.util.*;

public class TestTimSort {
    private static final int MIN_MERGE = 32;
    private static final Comparator<Object> NATURAL_ORDER =
        new Comparator<Object>() {
            @SuppressWarnings("unchecked")
            public int compare(Object first, Object second) {
                return ((Comparable<Object>)first).compareTo(second);
            }
        };
    
    private final int minRun;
	private final List<Long> runs = new ArrayList<Long>();

    public static void main(String[] args) {
        int length = Integer.parseInt(args[0]);
        Arrays.sort(JDKWorstCase(length), NATURAL_ORDER);
    }
    
    private TestTimSort(int len) {
    	minRun = minRunLength(len);
    }

    private static int minRunLength(int n) {
        assert n >= 0;
        int r = 0;      // Becomes 1 if any 1 bits are shifted off
        while (n >= MIN_MERGE) {
            r |= (n & 1);
            n >>= 1;
        }
        return n + r;
    }
    
    /** 
     * Adds a sequence x_1, ..., x_n of run lengths to <code>runs</code> such that:<br>
     * 1. X = x_1 + ... + x_n <br>
     * 2. x_j >= minRun for all j <br>
     * 3. x_1 + ... + x_{j-2}  <  x_j  <  x_1 + ... + x_{j-1} for all j <br>
     * These conditions guarantee that TimSort merges all x_j's one by one
     * (resulting in X) using only merges on the second-to-last element.
     * @param X  The sum of the sequence that should be added to runs.
     */
    private void generateJDKWrongElem(long X) {
    	for(long newTotal; X >= 2*minRun+1; X = newTotal) {
    	    //Default strategy
    		newTotal = X/2 + 1;
    		
    		//Specialized strategies
    		if(3*minRun+3 <= X && X <= 4*minRun+1) {
    		    // add x_1=MIN+1, x_2=MIN, x_3=X-newTotal  to runs
    			newTotal = 2*minRun+1; 
    		} else if(5*minRun+5 <= X && X <= 6*minRun+5) {
    		    // add x_1=MIN+1, x_2=MIN, x_3=MIN+2, x_4=X-newTotal  to runs
    			newTotal = 3*minRun+3;
    		} else if(8*minRun+9 <= X && X <= 10*minRun+9) {
    		    // add x_1=MIN+1, x_2=MIN, x_3=MIN+2, x_4=2MIN+2, x_5=X-newTotal  to runs
    			newTotal = 5*minRun+5;
    		} else if(13*minRun+15 <= X && X <= 16*minRun+17) {
    		    // add x_1=MIN+1, x_2=MIN, x_3=MIN+2, x_4=2MIN+2, x_5=3MIN+4, x_6=X-newTotal  to runs
    			newTotal = 8*minRun+9;
    		}
    		runs.add(0, X-newTotal);
    	}
    	runs.add(0, X);
    }

    /** 
     * Fills <code>runs</code> with a sequence of run lengths of the form<br>
     * Y_n     x_{n,1}   x_{n,2}   ... x_{n,l_n} <br>
     * Y_{n-1} x_{n-1,1} x_{n-1,2} ... x_{n-1,l_{n-1}} <br>
     * ... <br>
     * Y_1     x_{1,1}   x_{1,2}   ... x_{1,l_1}<br>
     * The Y_i's are chosen to satisfy the invariant throughout execution,
     * but the x_{i,j}'s are merged (by <code>TimSort.mergeCollapse</code>)
     * into an X_i that violates the invariant. 
     * @param X  The sum of all run lengths that will be added to <code>runs</code>.
     */
    private void runsJDKWorstCase(int length) {
    	long runningTotal = 0, Y=minRun+4, X=minRun;
    	
    	while(runningTotal+Y+X <= length) {
            runningTotal += X + Y;
            generateJDKWrongElem(X);
    		runs.add(0,Y);
    		
    		// X_{i+1} = Y_i + x_{i,1} + 1, since runs.get(1) = x_{i,1} 
    		X = Y + runs.get(1) + 1;
    		
    		// Y_{i+1} = X_{i+1} + Y_i + 1
    		Y += X + 1;
    	}
    	
    	if(runningTotal + X <= length) {
            runningTotal += X;
            generateJDKWrongElem(X);
    	}
    	
    	runs.add(length-runningTotal);
    }

    private Integer[] createArray(int length) {
        Integer[] a = new Integer[length];
        Arrays.fill(a, 0);
        int endRun = -1;
        for(long len : runs)
            a[endRun+=len] = 1;
        a[length-1]=0;
        
        return a;
    }

    public static Integer[] JDKWorstCase(int length) {
        TestTimSort t = new TestTimSort(length);
        t.runsJDKWorstCase(length);
        return t.createArray(length);
    }

}
---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
There are two options: 
 1. In the constructor of the TimSort class, change 
        int stackLen = (len <    120  ?  5 :
                        len <   1542  ? 10 :
                        len < 119151  ? 24 : 40);
to

        int stackLen = (len <    120  ?  5 :
                        len <   1542  ? 10 :
                        len < 119151  ? 24 : 49);

 2. Change the mergeCollapse method of TimSort to:
    private void mergeCollapse() {
        while (stackSize > 1) {
            int n = stackSize - 2;
            if (     n > 0 && runLen[n-1] <= runLen[n] + runLen[n+1]
                || n-1 > 0 && runLen[n-2] <= runLen[n] + runLen[n-1]) {
                if (runLen[n - 1] < runLen[n + 1])
                    n--;
            } else if (n<0 || runLen[n] > runLen[n + 1]) {
                break; // Invariant is established
            }
            mergeAt(n);
        }
    }



Comments
fix posted for review: http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-February/031405.html
11-02-2015

Received following response from the submitter: ---------------------------------------------------------------------------------------------------- Thank you for the reply. I can reproduce the bug both with Oracle JDK 7u76 and with 8u31 (see the output below). Are you sure you're passing the right command line parameter? (execute java TestTimSort 67108864, or numbers higher than this). cdegouw@ubuntu:~/sorting/TimSort_src/benchmark$ java -version java version "1.7.0_76" Java(TM) SE Runtime Environment (build 1.7.0_76-b13) Java HotSpot(TM) 64-Bit Server VM (build 24.76-b04, mixed mode) cdegouw@ubuntu:~/sorting/TimSort_src/benchmark$ java TestTimSort 67108864 Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 40 at java.util.TimSort.pushRun(TimSort.java:386) at java.util.TimSort.sort(TimSort.java:213) at java.util.TimSort.sort(TimSort.java:173) at java.util.Arrays.sort(Arrays.java:659) at TestTimSort.main(TestTimSort.java:18) cdegouw@ubuntu:~/sorting/TimSort_src/benchmark$ ~/jdk1.8.0_31/bin/java -version java version "1.8.0_31" Java(TM) SE Runtime Environment (build 1.8.0_31-b13) Java HotSpot(TM) 64-Bit Server VM (build 25.31-b07, mixed mode) cdegouw@ubuntu:~/sorting/TimSort_src/benchmark$ ~/jdk1.8.0_31/bin/java TestTimSort 67108864 Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 40 at java.util.TimSort.pushRun(TimSort.java:413) at java.util.TimSort.sort(TimSort.java:240) at java.util.Arrays.sort(Arrays.java:1438) at TestTimSort.main(TestTimSort.java:18) Best regards, Stijn --------------------------------------------------------------------------------------------------------------- Also, tested this with latest ea versions of JDK 8u40 (b23) and 9 (b49) and could reproduce this issue. This run fine with JDK 6u43.
11-02-2015

Sent an email to the submitter requesting confirmation with Oracle JDK: ---------------------------------------------------------------------------------------------------------------------- On 2/10/2015 9:38 AM, wrote: > Hi Stijn, > > The bug reported by you is with OpenJDK IcedTea version. I couldn't reproduce this issue with Oracle JDK versions 7u76 or 8u31. > Can you recheck with Oracle JDK versions 7u76 and 8u31 and confirm back. -----------------------------------------------------------------------------------------------------------------------
10-02-2015