JDK-6460346 : Can't serialize an object graph with more than 184,549,375 elements.
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 5.0
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • OS: solaris_10
  • CPU: sparc
  • Submitted: 2006-08-15
  • Updated: 2010-08-06
  • Resolved: 2006-08-29
Description
A DESCRIPTION OF THE REQUEST :
We are using the 64-bit JVM with very large heaps - typically around 50 GBytes. During object serialization, we have recently started encountering a class of error we had never seen before:

java.lang.OutOfMemoryError: Requested array size exceeds VM limit

This appears to be caused by the following:

ObjectOutputStream.java creates an instance of Object[] to hold data in its HandleTable class, which is a lightweight hash table. The initial size is 10, but when the array's capacity is exhausted a new array of size 2n+1 is created, where n is the old array's capacity.
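
For illustration, a sketch of that growth step (this is NOT the actual JDK source, just the rule as described above, with a made-up method name):

// Sketch of the 2n+1 growth rule described above - not the real JDK code.
// When the handle table's Object[] fills up, allocate a replacement of
// size 2n+1 and copy the old entries across.
static Object[] grow(Object[] old) {
        int newLength = (old.length << 1) + 1;    // 2n + 1
        Object[] bigger = new Object[newLength];  // throws OutOfMemoryError once
                                                  // newLength exceeds the VM's
                                                  // array-size limit
        System.arraycopy(old, 0, bigger, 0, old.length);
        return bigger;
}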

Due to this algorithm, the array size grows like this:


10
21
43
87
175
351
703
1407
2815
5631
11263
22527
45055
90111
180223
360447
720895
1441791
2883583
5767167
11534335
23068671
46137343
92274687
184549375
369098751

The maximum permitted number of elements for an instance of Object[] is 268,435,456. So when the serialization code tries to grow the capacity from 184,549,375 to 369,098,751, it crashes with an OutOfMemoryError, which is what the VM throws when asked to create an array with more elements than it permits.
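
To make the arithmetic concrete, a small stand-alone sketch (assuming only the 2n+1 rule and the 268,435,456-element limit quoted above) reproduces the sequence and shows where allocation fails:

public class GrowthDemo {
        public static void main(String[] args) {
                final long MAX_ELEMENTS = 268435456L;  // reported Object[] limit (2^28)
                long size = 10;                        // HandleTable's initial capacity
                while (size <= MAX_ELEMENTS) {
                        System.out.println(size);      // 10, 21, 43, ... 184549375
                        size = 2 * size + 1;           // the 2n+1 growth rule
                }
                System.out.println(size + " <-- exceeds the limit; allocation fails");
        }
}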

More generally, the whole collections framework is affected by this limit in what feels like a rather arbitrary manner.

For example, the default capacity of an ArrayList is 10, and the algorithm simply doubles the capacity when the current array is full. This means that an ArrayList created with the default capacity will fail with an OutOfMemoryError when the total number of elements reaches 167,772,160, whereas if the initial capacity is set to 15 the OutOfMemoryError does not occur until the total number of elements reaches 251,658,240.
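
Those two thresholds can be checked with a short sketch (assuming plain doubling and the same 268,435,456-element cap; lastSafeCapacity is an illustrative helper, not a JDK API):

public class DoublingLimit {
        // Largest capacity reachable from `initial` by repeated doubling
        // without exceeding the assumed 268,435,456-element Object[] limit.
        static long lastSafeCapacity(long initial) {
                long cap = initial;
                while (2 * cap <= 268435456L) {
                        cap *= 2;
                }
                return cap;
        }

        public static void main(String[] args) {
                System.out.println(lastSafeCapacity(10)); // 167772160
                System.out.println(lastSafeCapacity(15)); // 251658240
        }
}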

All of this seems very arbitrary, and to the best of my knowledge it is not well documented anywhere.

JUSTIFICATION :
We'd like to be able to serialize arbitrarily large object graphs.

I'm submitting this as an RFE rather than a bug because it all appears to be caused by a design choice in the implementation of HotSpot: no object can be more than 2**31 bytes long, even in a 64-bit JVM.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
I'd like serialization of a Vector or other collection to complete without an OutOfMemoryError caused by a HotSpot restriction interacting with an internal algorithmic detail of the serialization implementation.
ACTUAL -
Serialization crashes with OutOfMemoryError.

---------- BEGIN SOURCE ----------
Try compiling this bit of code. Then run it on a 64-bit JVM with lots of heap.

java -Xmx10g -Xms10g Serial 185000000

import java.io.OutputStream;
import java.io.ObjectOutputStream;
import java.util.Vector;

// A null OutputStream: discards everything written, so the test measures
// only serialization itself, not I/O.
class MyOutputStream extends OutputStream {
        public MyOutputStream() {}

        public void close() {}
        public void flush() {}
        public void write(byte[] b) {}
        public void write(byte[] b, int off, int len) {}
        public void write(int b) {}
}

public class Serial {
        public static void main(String[] args) throws Exception {
                int length = Integer.parseInt(args[0]);
                MyOutputStream mos = new MyOutputStream();
                ObjectOutputStream oos = new ObjectOutputStream(mos);

                // new Integer(i) guarantees a distinct object per element,
                // so serialization must record one handle per element.
                Vector<Integer> vector = new Vector<Integer>(15);
                System.out.println("CREATING VECTOR");
                for (int i = 0; i < length; i++) {
                        if (i % 10000000 == 0)
                                System.out.println(i);
                        vector.add(new Integer(i));
                }
                System.out.println("CREATED VECTOR");
                long startTime = System.currentTimeMillis();
                System.out.println("Beginning serialization");
                oos.writeObject(vector);  // fails once the handle table grows past the limit
                long endTime = System.currentTimeMillis();
                System.out.println("Elapsed time = " + (endTime - startTime) + " msecs");
        }
}


---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
None. Can't serialize the object graph. We need to find a mechanism for dividing the object graph into n sections where each section has fewer than 184,500,000 unique objects. I'm not quite sure how to do that...
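
One possible mechanism along those lines, not part of the original report: ObjectOutputStream.reset() discards the stream's accumulated handle table, so writing a very large collection element by element with periodic reset() calls keeps the table far below the failing size. The catch is that reset() also forgets earlier back-references, so this is only safe when elements do not share references across slice boundaries. A minimal sketch (ChunkedWriter and CHUNK are illustrative names):

import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.List;

public class ChunkedWriter {
        // Chosen well below the ~184.5 million handle danger zone.
        static final int CHUNK = 10000000;

        // Writes each element individually, clearing the handle table every
        // CHUNK elements. reset() emits a stream marker that ObjectInputStream
        // consumes transparently on the reading side.
        static void writeChunked(ObjectOutputStream oos, List items) throws IOException {
                oos.writeInt(items.size());
                int written = 0;
                for (Object item : items) {
                        oos.writeObject(item);
                        if (++written % CHUNK == 0) {
                                oos.reset();  // discard accumulated handles
                        }
                }
        }
}

The reader would call readInt() once and then readObject() in a loop; the reset markers are handled automatically.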

Comments
EVALUATION The internal array size limitation was addressed as part of 5089202 and is fixed in Java 6.
29-08-2006

EVALUATION Sorry - my previous evaluation was incorrect, as I misunderstood the numbers and thought that "maximum permitted" referred to the constraint that indices must be integer values. The submitter is correct: there is an internal HotSpot limitation that constrains the maximum size of an object to be representable by a signed 32-bit int value.
28-08-2006

EVALUATION The statement "it all appears to be caused by a design choice in the implementation of hotspot, which is that no object can be more than 2**31 bytes long, even in a 64-bit JVM" is incorrect. The Java Language Specification states that arrays are indexed by int values, and int values are 32-bit; this does not change because the VM is running on a 64-bit platform. There is an existing RFE to allow 64-bit indexing of arrays: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4880587 The specific issue with the way in which internal arrays are grown is being referred to the libraries team. I have reclassified this as a defect rather than an RFE, as it should be possible to grow such arrays to the maximum allowed size.
16-08-2006