JDK-6428387 : array clone() much slower than Arrays.copyOf
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 6
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2006-05-20
  • Updated: 2019-04-29
  • Resolved: 2007-05-24
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 6 JDK 7 Other
6u4Fixed 7Fixed hs10Fixed
Related Reports
Relates :  
Relates :  
Relates :  
Description
If we have an array a, say of length 1,
one might expect 
a.clone() 
to be equally fast or perhaps even faster than the equivalent
Arrays.copyOf(a, 1)
or
Object[] x = new Object[1];
System.arraycopy(a, 0, x, 0, 1);

but it is in fact much slower.

------------------------------------
import java.util.*;

public class ArrayCopyMicroBenchmark {
    abstract static class Job {
	private final String name;
	Job(String name) { this.name = name; }
	String name() { return name; }
	abstract void work() throws Throwable;
    }

    private static void collectAllGarbage() {
	try {
	    for (int i = 0; i < 2; i++) {
		System.gc(); System.runFinalization(); Thread.sleep(10);
	    }
	} catch (InterruptedException e) { throw new Error(e); }
    }

    /**
     * Runs each job for long enough that all the runtime compilers
     * have had plenty of time to warm up, i.e. get around to
     * compiling everything worth compiling.
     * Returns array of average times per job per run.
     */
    private static long[] time0(Job ... jobs) throws Throwable {
	final long warmupNanos = 10L * 1000L * 1000L * 1000L;
	long[] nanoss = new long[jobs.length];
	for (int i = 0; i < jobs.length; i++) {
	    collectAllGarbage();
	    long t0 = System.nanoTime();
	    long t;
	    int j = 0;
	    do { jobs[i].work(); j++; }
	    while ((t = System.nanoTime() - t0) < warmupNanos);
	    nanoss[i] = t/j;
	}
	return nanoss;
    }

    private static void time(Job ... jobs) throws Throwable {

	long[] warmup = time0(jobs); // Warm up run
	long[] nanoss = time0(jobs); // Real timing run

	final String nameHeader = "Method";
	int nameWidth  = nameHeader.length();
	for (Job job : jobs)
	    nameWidth = Math.max(nameWidth, job.name().length());

	final String millisHeader = "Millis";
	int millisWidth  = millisHeader.length();
	for (long nanos : nanoss)
	    millisWidth =
		Math.max(millisWidth,
			 String.format("%d", nanos/(1000L * 1000L)).length());

	final String ratioHeader = "Ratio";
	int ratioWidth = ratioHeader.length();

	String format = String.format("%%-%ds %%%dd %%.3f%%n",
				      nameWidth, millisWidth);
	String headerFormat = String.format("%%-%ds %%-%ds %%-%ds%%n",
					    nameWidth, millisWidth, ratioWidth);
	System.out.printf(headerFormat, "Method", "Millis", "Ratio");

	// Print out absolute and relative times, calibrated against first job
	for (int i = 0; i < jobs.length; i++) {
	    long millis = nanoss[i]/(1000L * 1000L);
	    double ratio = (double)nanoss[i] / (double)nanoss[0];
	    System.out.printf(format, jobs[i].name(), millis, ratio);
	}
    }

    private static int intArg(String[] args, int i, int defaultValue) {
	return args.length > i ? Integer.parseInt(args[i]) : defaultValue;
    }

    private static void deoptimize(Object[] a) {
	for (Object x : a)
	    if (x == null)
		throw new Error();
    }

    public static void main(String[] args) throws Throwable {
	final int iterations = intArg(args, 0, 100000);
	final int size       = intArg(args, 1, 1000);
	final Object[] array = new Object[size];
	final Random rnd = new Random();
	for (int i = 0; i < array.length; i++)
	    array[i] = rnd.nextInt(size);

	time(
	    new Job("arraycopy") { void work() {
		Object[] a = array;
		for (int i = 0; i < iterations; i++) {
		    Object[] t = new Object[size];
		    System.arraycopy(a, 0, t, 0, size);
		    a = t;}
		deoptimize(a);}},
	    new Job("copyOf") { void work() {
		Object[] a = array;
		for (int i = 0; i < iterations; i++)
		    a = Arrays.copyOf(a, size);
		deoptimize(a);}},
	    new Job("clone") { void work() {
		Object[] a = array;
		for (int i = 0; i < iterations; i++)
		    a = a.clone();
		deoptimize(a);}},
	    new Job("loop") { void work() {
		Object[] a = array;
		for (int i = 0; i < iterations; i++) {
		    Object[] t = new Object[size];
		    for (int j = 0; j < size; j++)
			t[j] = a[j];
		    a = t;}
		deoptimize(a);}}
	    );
    }
}
------------------------------------

solaris-sparc:

$ (iterations=10000000 size=1 && for f in -client -server; do echo $f iterations=$iterations size=$size; jver 6 java $f ArrayCopyMicroBenchmark $iterations $size 1; done);
-client iterations=10000000 size=1
Method    Millis Ratio
arraycopy   1472 1.000
copyOf      2745 1.864
clone       8451 5.739
loop         854 0.580
-server iterations=10000000 size=1
Method    Millis Ratio
arraycopy    853 1.000
copyOf       953 1.117
clone       8276 9.698
loop         835 0.979

solaris-i586:

-client iterations=10000000 size=1
Method    Millis Ratio
arraycopy    642 1.000
copyOf      1106 1.723
clone       3931 6.121
loop         227 0.354
-server iterations=10000000 size=1
Method    Millis Ratio
arraycopy    260 1.000
copyOf       366 1.406
clone       4088 15.703
loop         246 0.947

I thought about whether it would even be possible for javac to compile
a call to clone() into a call to Arrays.copyOf in the usual case where
it is known at javac compile time that it is an array that is being cloned.
There might be some compatibility reason why this is not possible,
but I can't think of one, at least when -target is >= mustang

Comments
EVALUATION I've attached a more elaborate version of the microbenchmark. It performs array copying using arraycopy, hand-written loops, Object.clone, Arrays.copyOf, and Unsafe.copyMemory. Array types tested include Object[], Integer[], and malloc blocks. Copying from Object[] to Integer[] requires a checkcast on every element, and the arraycopy and copyOf intrinsics have new hand-written assembly for this, which run somewhat faster than the Java code. The relevant intrinsics are: old copy loops: arraycopy new copy loops: clone, copyOf, copyOfRange, copyMemory old non-loops: getInt_raw, putInt_raw, getClass new non-loops: getClass, getComponentType, newArray The new non-loop intrinsics allow the compiler to optimize polymorphic array construction expressions.
13-04-2007

EVALUATION The missing bit here is a compiler intrinsic for Object.clone. There is not much hope of making an allocation really fast unless the compiler can inline it. As it happens, I'm already working on a fix for this (in -server) under the rubric of 6337834, which required significant changes to allocation code shape, allowing (as a side effect) a clone intrinsic. If this is a problem in -client also, this bug could be retargeted to compiler1.
22-05-2006