United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
Bug ID: JDK-6428387 array clone() much slower than Arrays.copyOf
JDK-6428387 : array clone() much slower than Arrays.copyOf

Details
Type:
Bug
Submit Date:
2006-05-20
Status:
Resolved
Updated Date:
2010-12-03
Project Name:
JDK
Resolved Date:
2007-05-24
Component:
hotspot
OS:
generic
Sub-Component:
compiler
CPU:
generic
Priority:
P3
Resolution:
Fixed
Affected Versions:
6
Fixed Versions:
hs10 (b13)

Related Reports
Backport:
Backport:
Relates:
Relates:

Sub Tasks

Description
If we have an array a, say of length 1,
one might expect 
a.clone() 
to be equally fast or perhaps even faster than the equivalent
Arrays.copyOf(a, 1)
or
Object[] x = new Object[1];
System.arraycopy(a, 0, x, 0, 1);

but it is in fact much slower.

------------------------------------
import java.util.*;

public class ArrayCopyMicroBenchmark {
    abstract static class Job {
	private final String name;
	Job(String name) { this.name = name; }
	String name() { return name; }
	abstract void work() throws Throwable;
    }

    private static void collectAllGarbage() {
	try {
	    for (int i = 0; i < 2; i++) {
		System.gc(); System.runFinalization(); Thread.sleep(10);
	    }
	} catch (InterruptedException e) { throw new Error(e); }
    }

    /**
     * Runs each job for long enough that all the runtime compilers
     * have had plenty of time to warm up, i.e. get around to
     * compiling everything worth compiling.
     * Returns array of average times per job per run.
     */
    private static long[] time0(Job ... jobs) throws Throwable {
	final long warmupNanos = 10L * 1000L * 1000L * 1000L;
	long[] nanoss = new long[jobs.length];
	for (int i = 0; i < jobs.length; i++) {
	    collectAllGarbage();
	    long t0 = System.nanoTime();
	    long t;
	    int j = 0;
	    do { jobs[i].work(); j++; }
	    while ((t = System.nanoTime() - t0) < warmupNanos);
	    nanoss[i] = t/j;
	}
	return nanoss;
    }

    private static void time(Job ... jobs) throws Throwable {

	long[] warmup = time0(jobs); // Warm up run
	long[] nanoss = time0(jobs); // Real timing run

	final String nameHeader = "Method";
	int nameWidth  = nameHeader.length();
	for (Job job : jobs)
	    nameWidth = Math.max(nameWidth, job.name().length());

	final String millisHeader = "Millis";
	int millisWidth  = millisHeader.length();
	for (long nanos : nanoss)
	    millisWidth =
		Math.max(millisWidth,
			 String.format("%d", nanos/(1000L * 1000L)).length());

	final String ratioHeader = "Ratio";
	int ratioWidth = ratioHeader.length();

	String format = String.format("%%-%ds %%%dd %%.3f%%n",
				      nameWidth, millisWidth);
	String headerFormat = String.format("%%-%ds %%-%ds %%-%ds%%n",
					    nameWidth, millisWidth, ratioWidth);
	System.out.printf(headerFormat, "Method", "Millis", "Ratio");

	// Print out absolute and relative times, calibrated against first job
	for (int i = 0; i < jobs.length; i++) {
	    long millis = nanoss[i]/(1000L * 1000L);
	    double ratio = (double)nanoss[i] / (double)nanoss[0];
	    System.out.printf(format, jobs[i].name(), millis, ratio);
	}
    }

    private static int intArg(String[] args, int i, int defaultValue) {
	return args.length > i ? Integer.parseInt(args[i]) : defaultValue;
    }

    private static void deoptimize(Object[] a) {
	for (Object x : a)
	    if (x == null)
		throw new Error();
    }

    public static void main(String[] args) throws Throwable {
	final int iterations = intArg(args, 0, 100000);
	final int size       = intArg(args, 1, 1000);
	final Object[] array = new Object[size];
	final Random rnd = new Random();
	for (int i = 0; i < array.length; i++)
	    array[i] = rnd.nextInt(size);

	time(
	    new Job("arraycopy") { void work() {
		Object[] a = array;
		for (int i = 0; i < iterations; i++) {
		    Object[] t = new Object[size];
		    System.arraycopy(a, 0, t, 0, size);
		    a = t;}
		deoptimize(a);}},
	    new Job("copyOf") { void work() {
		Object[] a = array;
		for (int i = 0; i < iterations; i++)
		    a = Arrays.copyOf(a, size);
		deoptimize(a);}},
	    new Job("clone") { void work() {
		Object[] a = array;
		for (int i = 0; i < iterations; i++)
		    a = a.clone();
		deoptimize(a);}},
	    new Job("loop") { void work() {
		Object[] a = array;
		for (int i = 0; i < iterations; i++) {
		    Object[] t = new Object[size];
		    for (int j = 0; j < size; j++)
			t[j] = a[j];
		    a = t;}
		deoptimize(a);}}
	    );
    }
}
------------------------------------

solaris-sparc:

$ (iterations=10000000 size=1 && for f in -client -server; do echo $f iterations=$iterations size=$size; jver 6 java $f ArrayCopyMicroBenchmark $iterations $size 1; done);
-client iterations=10000000 size=1
Method    Millis Ratio
arraycopy   1472 1.000
copyOf      2745 1.864
clone       8451 5.739
loop         854 0.580
-server iterations=10000000 size=1
Method    Millis Ratio
arraycopy    853 1.000
copyOf       953 1.117
clone       8276 9.698
loop         835 0.979

solaris-i586:

-client iterations=10000000 size=1
Method    Millis Ratio
arraycopy    642 1.000
copyOf      1106 1.723
clone       3931 6.121
loop         227 0.354
-server iterations=10000000 size=1
Method    Millis Ratio
arraycopy    260 1.000
copyOf       366 1.406
clone       4088 15.703
loop         246 0.947

I thought about whether it would even be possible for javac to compile
a call to clone() into a call to Arrays.copyOf in the usual case where
it is known at javac compile time that it is an array that is being cloned.
There might be some compatibility reason why this is not possible,
but I can't think of one, at least when -target is >= mustang

                                    

Comments
EVALUATION

The missing bit here is a compiler intrinsic for Object.clone.  There is not much hope of making an allocation really fast unless the compiler can inline it.

As it happens, I'm already working on a fix for this (in -server) under the rubric of 6337834, which required significant changes to allocation code shape, allowing (as a side effect) a clone intrinsic.

If this is a problem in -client also, this bug could be retargeted to compiler1.
                                     
2006-05-22
EVALUATION

I've attached a more elaborate version of the microbenchmark.
It performs array copying using arraycopy, hand-written loops,
Object.clone, Arrays.copyOf, and Unsafe.copyMemory.
Array types tested include Object[], Integer[], and malloc blocks.

Copying from Object[] to Integer[] requires a checkcast
on every element, and the arraycopy and copyOf intrinsics
have new hand-written assembly for this, which run
somewhat faster than the Java code.

The relevant intrinsics are:
  old copy loops: arraycopy
  new copy loops: clone, copyOf, copyOfRange, copyMemory
  old non-loops: getInt_raw, putInt_raw, getClass
  new non-loops: getClass, getComponentType, newArray

The new non-loop intrinsics allow the compiler to optimize
polymorphic array construction expressions.
                                     
2007-04-13



Hardware and Software, Engineered to Work Together