JDK-6208166 : Deserialization fails with cyclic object graph using HashSet
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.io:serialization
  • Affected Version: 1.4.2
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: windows_2000
  • CPU: x86
  • Submitted: 2004-12-14
  • Updated: 2024-06-12
Description
FULL PRODUCT VERSION :
java version "1.4.2_04"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_04-b05)
Java HotSpot(TM) Client VM (build 1.4.2_04-b05, mixed mode)

ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows 2000 [Version 5.00.2195]

A DESCRIPTION OF THE PROBLEM :
De-serializing an instance of this class throws a NullPointerException:

    import java.io.*;
    import java.util.*;

    class MyClass implements Serializable
    {
        private String m_name = "anything";
        private Set m_set = new HashSet ();

        public MyClass ()
        {
            m_set.add (this);
        }

        public int hashCode ()
        {
            return m_name.hashCode ();
        }
    }

I think this happens because the VM attempts to de-serialize m_set first and, in doing so, calls hashCode() on each of its members. However, the instance of MyClass contained in m_set has not yet been fully de-serialized, since completing it requires de-serializing m_set first. Thus m_name is still null when hashCode() is called, causing the exception.
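The ordering described above can be observed directly with a small probe. The following is a minimal sketch, not part of the original report: the class name HashCodeProbe and its SEEN list are hypothetical, and the code assumes Java 7 or later syntax. Its hashCode() tolerates a null m_name and records the value it saw, so the state of the field during HashSet.readObject becomes visible instead of surfacing as a NullPointerException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical probe class for illustration; same shape as MyClass in the report.
class HashCodeProbe implements Serializable {
    private static final long serialVersionUID = 1L;

    // Records the value of m_name each time hashCode() runs (static, so not serialized).
    static final List<String> SEEN = new ArrayList<>();

    private String m_name = "anything";
    private Set<Object> m_set = new HashSet<>();

    HashCodeProbe() {
        m_set.add(this);
    }

    @Override
    public int hashCode() {
        SEEN.add(String.valueOf(m_name));
        // Tolerate null only so the probe can report it rather than throw.
        return m_name == null ? 0 : m_name.hashCode();
    }

    @Override
    public boolean equals(Object other) {
        return this == other;
    }

    // Serializes a fresh instance, then returns what hashCode() saw while reading it back.
    static List<String> roundTrip() throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new HashCodeProbe());
        }
        SEEN.clear(); // discard the call made by the constructor's m_set.add(this)
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            in.readObject();
        }
        return new ArrayList<>(SEEN);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip());
    }
}
```

On 1.4 and later implementations the recorded value is expected to be the string "null". Note that tolerating null in hashCode() is not a fix: the element is then filed in the set under a hash computed from incomplete state, so later lookups with the real hash would likely miss it.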

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Serialize and then de-serialize an instance of MyClass using the program provided.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
I was expecting MyClass to de-serialize completely and without error.
ACTUAL -
A NullPointerException was thrown from the line:

           return m_name.hashCode ();

m_name was null.

ERROR MESSAGES/STACK TRACES THAT OCCUR :
java.lang.NullPointerException
	at com.bentley.test.MyClass.hashCode(TestSerialization.java:67)
	at java.util.HashMap.hash(HashMap.java:261)
	at java.util.HashMap.put(HashMap.java:379)
	at java.util.HashSet.readObject(HashSet.java:277)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:324)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:838)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1746)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1845)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1769)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1646)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1274)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
	at com.bentley.test.TestSerialization.input(TestSerialization.java:46)
	at com.bentley.test.TestSerialization.main(TestSerialization.java:23)
Exception in thread "main"

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
/*--------------------------------------------------------------------------------------+
 |
 |    $RCSfile: TestSerialization.java,v $
 |   $Revision: 1.0 $
 |       $Date: 2004/08/24 14:31:30 $
 |
 |  $Copyright: (c) 2004 Bentley Systems, Incorporated. All rights reserved. $
 |
 +--------------------------------------------------------------------------------------*/

package com.bentley.test;

import java.io.*;
import java.util.*;


public class TestSerialization
{
    public static void main (String[] args) throws Exception
    {
        MyClass obj = new MyClass ();
        byte[] serialized = output (obj);
        Serializable deserialized = input (serialized);
    }

    public static byte[] output (Serializable obj) throws Exception
    {
        ByteArrayOutputStream obytes = new ByteArrayOutputStream ();
        ObjectOutputStream ostream = new ObjectOutputStream (obytes);

        ostream.writeObject (obj);
        ostream.flush ();
        byte[] bytes = obytes.toByteArray ();

        obytes.close ();
        ostream.close ();

        return bytes;
    }

    public static Serializable input (byte[] bytes) throws Exception
    {
        ByteArrayInputStream ibytes = new ByteArrayInputStream (bytes);
        ObjectInputStream istream = new ObjectInputStream (ibytes);

        Serializable obj = (Serializable)istream.readObject ();

        ibytes.close ();
        istream.close ();

        return obj;
    }
}

class MyClass implements Serializable
{
    private String m_name = "anything";
    private Set m_set = new HashSet ();

    public MyClass ()
    {
        m_set.add (this);
    }

    public int hashCode ()
    {
        return m_name.hashCode ();
    }
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
I have not found a good one yet, other than to avoid overriding the hashCode method.
###@###.### 2004-12-14 10:26:52 GMT

Comments
Reopening this bug to cover the simple circularity among objects, not involving subclassing. Note that this bug covers pathologies that have failed on all JDK releases from 1.4 onwards. This differs from JDK-8201131, which involves circularity via subclassing, which worked until a change was made in JDK 9.

A workaround for this class of failure is given by an earlier comment [1] by Peter Jones, which is to store the cyclic relationships in transient fields and to implement a readObject() method that fills in the field values and then populates the transient fields. This provides greater control over ordering.

[1]: https://bugs.openjdk.org/browse/JDK-6208166?focusedId=12334140&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12334140
12-06-2024
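As a rough illustration of the approach described in the comment above, the sketch below keeps the cyclic relationship in a transient field and repopulates it in readObject() after the ordinary fields have been set. The class name CyclicNode, its members field, and the helper methods are hypothetical, and the sketch assumes the set can be rebuilt entirely from the object's own state, which may not hold for a real application.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashSet;
import java.util.Set;

// Hypothetical class for illustration; the cyclic set is never written to the stream.
class CyclicNode implements Serializable {
    private static final long serialVersionUID = 1L;

    private String name = "anything";
    private transient Set<CyclicNode> members = new HashSet<>();

    CyclicNode() {
        members.add(this);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }

    @Override
    public boolean equals(Object other) {
        return this == other;
    }

    boolean containsSelf() {
        return members.contains(this);
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();     // fills in name first
        members = new HashSet<>();  // field initializers do not run during deserialization
        members.add(this);          // safe now: hashCode() sees a non-null name
    }

    // Serializes the node to a byte array and reads it back.
    static CyclicNode roundTrip(CyclicNode node) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(node);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (CyclicNode) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(new CyclicNode()).containsSelf());
    }
}
```

Because readObject() controls the order explicitly, the set is only touched once the fields that hashCode() depends on are in place.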

Duplicates JDK-8201131.
12-06-2024

WORK AROUND
A way to force the deserialization of the HashSet (as part of deserializing the MyClass) to be delayed until after the m_name field has been set would be to remove m_set from MyClass's serializable fields (either by declaring it transient or by adding an explicit serialPersistentFields declaration to MyClass) and instead handle its serialization/deserialization in custom writeObject/readObject methods of MyClass. For example:

    class MyClass2 implements Serializable {

        private String m_name = "anything";
        private transient Set m_set = new HashSet();

        public MyClass2() {
            m_set.add(this);
        }

        public int hashCode() {
            return m_name.hashCode();
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            out.writeObject(m_set);
        }

        private void readObject(ObjectInputStream in)
            throws ClassNotFoundException, IOException
        {
            in.defaultReadObject();
            m_set = (Set) in.readObject();
            if (m_name == null || m_set == null) {
                throw new InvalidObjectException("null");
            }
        }
    }

(This would be an incompatible change to MyClass's serialized form.)

Also, it's difficult to know exactly how this test case is a distillation of a real example in practice: if, for example, the HashSet can be entirely reconstructed by other data serialized with the object, then it need not be serialized with the object at all.
###@###.### 2004-12-15 01:07:26 GMT
15-12-2004
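The workaround also mentions an explicit serialPersistentFields declaration as an alternative to marking m_set transient. A minimal sketch of that variant follows. The class name MyClass3, the containsSelf() accessor, and the roundTrip() helper are hypothetical additions for illustration; like the transient variant, this changes the serialized form.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.HashSet;
import java.util.Set;

// Hypothetical variant of MyClass2 that uses serialPersistentFields instead of transient.
class MyClass3 implements Serializable {
    private static final long serialVersionUID = 1L;

    // Only m_name takes part in default serialization; m_set is written by hand below.
    private static final ObjectStreamField[] serialPersistentFields = {
        new ObjectStreamField("m_name", String.class)
    };

    private String m_name = "anything";
    private Set<Object> m_set = new HashSet<>();

    MyClass3() {
        m_set.add(this);
    }

    @Override
    public int hashCode() {
        return m_name.hashCode();
    }

    @Override
    public boolean equals(Object other) {
        return this == other;
    }

    boolean containsSelf() {
        return m_set.contains(this);
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();  // writes m_name only
        out.writeObject(m_set);    // the set follows as custom data
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();    // m_name is set before the set is read
        @SuppressWarnings("unchecked")
        Set<Object> set = (Set<Object>) in.readObject();
        m_set = set;
        if (m_name == null || m_set == null) {
            throw new InvalidObjectException("null");
        }
    }

    // Serializes the object to a byte array and reads it back.
    static MyClass3 roundTrip(MyClass3 obj) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (MyClass3) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(new MyClass3()).containsSelf());
    }
}
```

Either variant should give the same ordering guarantee; the choice is mostly about whether the field list is better documented at the field declaration or in one central array.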

EVALUATION
This behavior is somewhat of a variation on the problem described in 4957674; in this case, the failure is more explicit. See also: http://forum.java.sun.com/thread.jspa?forumID=62&threadID=349406

Note that the particular test case, as is, works with Sun's implementations of ObjectInputStream prior to 1.4 (i.e. it does not cause a NullPointerException), because pre-1.4 implementations, when deserializing the values of the non-primitive serializable fields of a class, would set each individual field's value after reading it (before deserializing the next value). 1.4's and later implementations set the values of all non-primitive serializable fields in a single batch, after they have all been deserialized, for improved performance.

Non-primitive serializable fields are specified to be written and read in the lexicographical order of their names. "m_name" lexicographically precedes "m_set", so with pre-1.4 implementations, the m_name field's value is correctly set while the HashSet is being deserialized, and thus the NullPointerException does not occur. Of course, this success is probably coincidental: it was not the intention that serializable fields would be named according to the order in which their relative deserialization is desired. Renaming the m_name field to "zm_name" causes the test case to fail with pre-1.4 implementations as well.

In general, the deserialization of a graph of objects with referential circularities can be tricky (as in 4957674). See Workaround for a way to avoid the problem in this case (with a different serialized form for MyClass). Also, sometimes it is helpful to employ the ObjectInputStream.registerValidation method to delay an action of reconstitution until the deserialized graph is otherwise more complete.
###@###.### 2004-12-15 01:07:26 GMT
15-12-2004
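One possible shape for the registerValidation approach mentioned in the evaluation is sketched below. It is an illustration under assumptions, not the evaluator's code: the class name DeferredNode, its pending field, and the choice to write the set's elements as an Object[] are all hypothetical. The idea is that reading an array does not call hashCode() on its elements, so the HashSet can be rebuilt in validateObject(), which ObjectInputStream invokes after the whole graph has been read and before the top-level readObject() returns.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectInputValidation;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical class for illustration; hashing is deferred to a validation callback.
class DeferredNode implements Serializable, ObjectInputValidation {
    private static final long serialVersionUID = 1L;

    private String name = "anything";
    private transient Set<Object> members = new HashSet<>();
    private transient Object[] pending; // elements read from the stream, not yet hashed

    DeferredNode() {
        members.add(this);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }

    @Override
    public boolean equals(Object other) {
        return this == other;
    }

    boolean containsSelf() {
        return members.contains(this);
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeObject(members.toArray()); // plain array: no hashing needed to read it back
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        pending = (Object[]) in.readObject();
        in.registerValidation(this, 0);     // run validateObject() once the graph is complete
    }

    @Override
    public void validateObject() throws InvalidObjectException {
        if (name == null || pending == null) {
            throw new InvalidObjectException("incomplete state");
        }
        members = new HashSet<>(Arrays.asList(pending)); // hashCode() is called only here
        pending = null;
    }

    // Serializes the node to a byte array and reads it back.
    static DeferredNode roundTrip(DeferredNode node) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(node);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (DeferredNode) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(new DeferredNode()).containsSelf());
    }
}
```

For this single-object cycle the deferral is more than is strictly needed, since name is already set once defaultReadObject() returns. It is more likely to pay off when the set's elements are other objects that may themselves still be mid-deserialization.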