Bug ID: JDK-6498658 System.arraycopy performance lags

Details
Type: Bug
Submit Date: 2006-11-29
Status: Resolved
Updated Date: 2010-05-20
Project Name: JDK
Resolved Date: 2007-03-15
Component: hotspot
OS: generic
Sub-Component: compiler
CPU: generic
Priority: P2
Resolution: Fixed
Affected Versions: 6
Fixed Versions: hs10 (b10)

Description
System.arraycopy() on char[] arrays is a bottleneck for many appserver benchmarks (specifically the jshort_disjoint_arraycopy stub). We optimized the way the appserver handles certain data; essentially, we changed a number of char[] arrays to byte[]. Because of the encoding used, the byte arrays are half as long as the char arrays, so we expected the appserver to spend less time copying them. Instead, overall performance regressed, and the time spent in System.arraycopy (now in jbyte_disjoint_arraycopy) doubled.

The attached test program shows the problem; somehow, C2 is using type information and not optimizing the system array copy. The test program has two kinds of methods. The first kind takes a specific array type:
    public void doit(byte[] b1, byte[] b2, int len) {
        System.arraycopy(b1, 0, b2, 0, len);
    }

The second kind takes a generic type:
    public void doit(Object o1, Object o2, int len) {
        System.arraycopy(o1, 0, o2, 0, len);
    }
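
The attached test program is not reproduced in this report. As a rough illustration only, a comparison of the two kinds of methods could be sketched as follows. The class name, the doitGeneric name, the timing helper, and the iteration counts are all assumptions made for this sketch, not taken from the original test, and a simple loop like this is only a crude measurement.

```java
import java.util.Arrays;

// Hypothetical micro-benchmark sketch; NOT the test program attached to the bug.
class ArrayCopyBench {
    // Typed variant: the static type of the arguments is byte[].
    static void doit(byte[] b1, byte[] b2, int len) {
        System.arraycopy(b1, 0, b2, 0, len);
    }

    // Typed variant for char[].
    static void doit(char[] c1, char[] c2, int len) {
        System.arraycopy(c1, 0, c2, 0, len);
    }

    // Generic variant: the compiler only knows the arguments are Objects.
    // Named doitGeneric here (an assumed name) so that calls are unambiguous.
    static void doitGeneric(Object o1, Object o2, int len) {
        System.arraycopy(o1, 0, o2, 0, len);
    }

    // Runs one copy action `iterations` times and returns elapsed nanoseconds.
    static long time(Runnable copy, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            copy.run();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        byte[] bytesSrc = new byte[1024];
        byte[] bytesDst = new byte[1024];
        char[] charsSrc = new char[512];
        char[] charsDst = new char[512];
        Arrays.fill(bytesSrc, (byte) 1);
        Arrays.fill(charsSrc, 'x');

        int iterations = 1_000_000; // assumed value, large enough to trigger JIT compilation

        // Repeat the measurement; early rounds mostly serve as warm-up.
        for (int round = 0; round < 5; round++) {
            long typedBytes = time(() -> doit(bytesSrc, bytesDst, 1024), iterations);
            long typedChars = time(() -> doit(charsSrc, charsDst, 512), iterations);
            long genericBytes = time(() -> doitGeneric(bytesSrc, bytesDst, 1024), iterations);
            long genericChars = time(() -> doitGeneric(charsSrc, charsDst, 512), iterations);
            System.out.println("round " + round
                    + ": typed 1024 bytes=" + typedBytes + "ns"
                    + ", typed 512 chars=" + typedChars + "ns"
                    + ", generic 1024 bytes=" + genericBytes + "ns"
                    + ", generic 512 chars=" + genericChars + "ns");
        }
    }
}
```

Every case in this sketch copies 1024 bytes, so any large gap between the byte[] and char[] timings would point at the copy stub rather than at the amount of data moved.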

On Solaris/SPARC, copying various arrays gives these timings:

Time to copy 1024 bytes (1024 bytes): 436
Time to copy 512 chars (1024 bytes): 255
Time to copy 1024 chars (2048 bytes): 439
Time to copy 256 ints (1024 bytes): 254
Time to copy 1024 ints (4096 bytes): 872
Time to copy (generic interface) 1024 bytes (1024 bytes): 340
Time to copy (generic interface) 512 chars (1024 bytes): 387
Time to copy (generic interface) 256 ints (1024 bytes): 387

The first two cases are copying the same amount of data (using a method with an explicit type defined) and hence should take the same amount of time. 

The very odd thing is that the last three cases also all copy 1024 bytes (using the Object-type interface) and take roughly the same amount of time regardless of the actual data type (but still take longer than the best cases where the type is known).

With C1, the times are roughly the same (in fact, they favor byte[] copying slightly).

I listed the OS/hardware as generic, but in fact I've observed this only on Solaris (both SPARC and x86) and Windows (i586). On Linux (i586) the performance was as expected (the time always depended on the total number of bytes copied).

                                    

Comments
SUGGESTED FIX

Optimize the arraycopy stubs for all types. I rewrote all of them
(except those for amd64, which were already well optimized).
I also added a generic arraycopy stub for the case where the arrays
are passed as Object. The stub performs dynamic checks and jumps to
a type-specific stub if the arrays are well defined, or returns -1
to take the slow path. C1 has been doing this for some time.
I also added a stack frame for the stubs on x86, since it
is helpful and does not hurt performance.
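
The control flow of such a generic stub can be modeled in plain Java. This is only a sketch of the idea described above: the actual stubs are generated machine code inside HotSpot, and the class name, method names, and exact set of checks below are assumptions for illustration.

```java
import java.lang.reflect.Array;

// Hypothetical Java model of a generic arraycopy dispatcher; the real stub is
// generated assembly inside HotSpot and is not shown in this report.
final class GenericArrayCopySketch {
    // Stand-ins for the type-specific stubs.
    static void copyBytes(byte[] src, int srcPos, byte[] dst, int dstPos, int length) {
        System.arraycopy(src, srcPos, dst, dstPos, length);
    }

    static void copyChars(char[] src, int srcPos, char[] dst, int dstPos, int length) {
        System.arraycopy(src, srcPos, dst, dstPos, length);
    }

    static void copyInts(int[] src, int srcPos, int[] dst, int dstPos, int length) {
        System.arraycopy(src, srcPos, dst, dstPos, length);
    }

    // Returns 0 if a type-specific copy was performed, or -1 to tell the
    // caller to take the slow path (which does full checking and throws).
    static int genericArrayCopy(Object src, int srcPos, Object dst, int dstPos, int length) {
        if (src == null || dst == null) {
            return -1;
        }
        Class<?> type = src.getClass();
        // Both arguments must be arrays of exactly the same type.
        if (!type.isArray() || type != dst.getClass()) {
            return -1;
        }
        // Range checks; written to avoid int overflow.
        int srcLen = Array.getLength(src);
        int dstLen = Array.getLength(dst);
        if (srcPos < 0 || dstPos < 0 || length < 0
                || srcPos > srcLen - length || dstPos > dstLen - length) {
            return -1;
        }
        // Dispatch to the stub for the dynamic element type.
        if (src instanceof byte[]) {
            copyBytes((byte[]) src, srcPos, (byte[]) dst, dstPos, length);
            return 0;
        }
        if (src instanceof char[]) {
            copyChars((char[]) src, srcPos, (char[]) dst, dstPos, length);
            return 0;
        }
        if (src instanceof int[]) {
            copyInts((int[]) src, srcPos, (int[]) dst, dstPos, length);
            return 0;
        }
        // Every other element type is left to the slow path in this sketch.
        return -1;
    }
}
```

The -1 return keeps the fast path free of exception handling: anything the dynamic checks cannot prove safe is handed back to the caller unchanged.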

Appserver performance improved by 6% on N1 with this fix.
I added results for different platforms to the bug report,
along with a modified test case.
                                     
2007-01-11
EVALUATION

We have optimized stubs only for char arraycopy.
                                     
2007-01-11
SUGGESTED FIX

Webrev: http://prt-web.sfbay.sun.com/net/prt-archiver.sfbay/data/archived_workspaces/main/c2_baseline/2007/20070214151808.kvn.6498658/workspace/webrevs/webrev-2007.02.14/index.html
                                     
2007-02-15


