JDK-4957674 : (coll) Hash entries placed into wrong buckets during deserialization
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.util:collections
  • Affected Version: 1.4.2,5.0
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: linux,solaris_8
  • CPU: generic,x86
  • Submitted: 2003-11-20
  • Updated: 2013-10-07
Description

Name: rmT116609			Date: 11/20/2003


FULL PRODUCT VERSION :
java version "1.4.2"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2-b28)
Java HotSpot(TM) Client VM (build 1.4.2-b28, mixed mode)


A DESCRIPTION OF THE PROBLEM :
When deserializing a HashMap, the readObject() method reads the key-value pairs and re-hashes the map by calling hashCode() on the keys.

But if the keys' implementation of hashCode() depends on some internal variable of the key, and if that variable has not yet been deserialized at that moment, then hashCode() will give the wrong result, sending the key into the wrong hash bucket.


STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
The attached source code gives an example in which, after deserialization, a HashMap contains a single entry, with a non-null value.  The entry gets put into the wrong bucket.  The non-null value can be seen with hashMap.values() but hashMap.get(key) always returns null (hence the final assertion fails).

A workaround in this particular example is to move the variable x from class SuperY down to class ClassY, in which case the assertion succeeds.  As written, the superclass's variable (x) is always deserialized before the subclass's variable (id).  If x were declared in ClassY instead, default serialization would read the primitive field id before the object reference x, so id would already hold its value by the time the HashMap is re-hashed.
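The field ordering that this workaround relies on can be inspected directly. The sketch below is illustrative only: FieldOrderDemo, Owner and Key are hypothetical names, not part of the attached test. It assumes that ObjectStreamClass.getFields() reports a class's serializable fields in the order used by default serialization, with primitive fields sorted ahead of object references.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;

// Hypothetical demo class; the names here are not from the original report.
class FieldOrderDemo {

    static class Owner implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Both fields live in the same class, as in the suggested workaround.
    static class Key implements Serializable {
        private static final long serialVersionUID = 1L;
        Owner x;      // object reference, declared first in the source
        long id = 7;  // primitive, declared second in the source
    }

    // Returns the names of the serializable fields in the order reported
    // by the serialization class descriptor.
    static String[] serialFieldNames(Class<?> c) {
        ObjectStreamField[] fields = ObjectStreamClass.lookup(c).getFields();
        String[] names = new String[fields.length];
        for (int i = 0; i < fields.length; i++) {
            names[i] = fields[i].getName();
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : serialFieldNames(Key.class)) {
            System.out.println(name);
        }
    }
}
```

If the primitive field is listed before the reference field despite the declaration order, that matches the behavior described above: id is restored before x is read. Note that this ordering applies only within a single class; fields of a superclass are still restored before those of a subclass.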


EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The assertions should not fail.  The hashCode() method should never be called at a time when it returns 0; the hash code of the y object should always be 7.
ACTUAL -
During deserialization, hashCode() temporarily returns 0, causing the key to go into the wrong bucket.

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.util.*;
import java.io.*;

public class Test {
  public static void main(String[] a) throws IOException, ClassNotFoundException {
    ClassX x = new ClassX();
    ClassY y = (ClassY) x.hash.keySet().iterator().next();

    // The HashMap maps the key y to a non-null value:
    assert(y.x.hash.get(y).equals("NON-NULL"));

    // Serialize y and then reconstruct it from the serialized stream
    y = (ClassY) serializeAndDeserialize(y);

    // The hash contains exactly one key, namely y
    assert(y.x.hash.size()==1);
    assert(y.x.hash.keySet().iterator().next()==y);
    
    // The hash contains exactly one value, which is non-null
    assert(y.x.hash.values().iterator().next().equals("NON-NULL"));
    
    // However, attempting to get(y) fails,
    // because it has been put into the wrong bucket
    assert(y.x.hash.get(y)!=null);   // FAILS

  }

  private static Object serializeAndDeserialize(ClassY y) throws IOException, ClassNotFoundException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ObjectOutputStream oos = new ObjectOutputStream(baos);
    oos.writeObject(y);
    oos.flush();
    baos.flush();
    byte[] result = baos.toByteArray();
    oos.close();
    ByteArrayInputStream bais = new ByteArrayInputStream(result);
    ObjectInputStream ois = new ObjectInputStream(bais);
    Object o = ois.readObject();
    ois.close();
    return o;
  }
    
}



  class ClassX implements Serializable {
    public final HashMap hash = new HashMap();
  
    public ClassX() {
      hash.put(new ClassY(this), "NON-NULL");
    }
  }
  
  class SuperY implements Serializable {
    public ClassX x;
  }
  
  class ClassY extends SuperY {
    private long id=7;
  
    public ClassY(ClassX x) {
      this.x = x;
    }
  
    public int hashCode() {
      // This method should always return 7, but observe
      // that during deserialization it is called by hashMap.readObject() and
      // at that time returns zero
      System.out.println("Hashcode should be 7.  Right now it is: "+id);
      return (int) id;
    }
  
    public boolean equals(Object o) {
      // Guard the cast so that comparing against other types returns false
      return (o instanceof ClassY) && id == ((ClassY) o).id;
    }
  }

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
(1) Do not serialize HashMaps that are part of a circular object graph.
(2) Use the default implementation of hashCode() inherited from Object.  (This is, of course, not generally possible or desirable.)
(3) For maps containing keys which are known not to change their hashCode as a side effect of deserialization (which implies that these objects do not use the default hashCode), re-implement the HashMap class using a readObject() which does not cause a re-hashing of the map, nor invoke the keys' hashCode() methods.
(Incident Review ID: 194129) 
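
Another possible workaround, not proposed in the original submission, is to let the map be deserialized as usual and then re-hash it once the whole object graph has been read. The sketch below is a minimal illustration under stated assumptions: the class names (RehashAfterDeserialization, Holder, KeyBase, Key) are hypothetical stand-ins for ClassX, SuperY and ClassY, and it relies on ObjectInputStream.registerValidation(), whose callbacks run after the top-level readObject() call has finished reading the graph. It also assumes the class owning the map can be given a private readObject() method.

```java
import java.io.*;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo; names are illustrative, not from the original report.
class RehashAfterDeserialization {

    /** Stand-in for ClassX: owns the map and repairs it after the graph is read. */
    static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        final Map<Object, String> hash = new HashMap<>();

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            // The callback is deferred until the entire graph has been read,
            // so the keys' fields are fully restored when rehash() runs.
            in.registerValidation(this::rehash, 0);
        }

        private void rehash() {
            // Copying calls hashCode() again on every key.
            Map<Object, String> copy = new HashMap<>(hash);
            hash.clear();
            hash.putAll(copy);
        }
    }

    /** Stand-in for SuperY: its field is restored before the subclass's id. */
    static class KeyBase implements Serializable {
        private static final long serialVersionUID = 1L;
        Holder holder;
    }

    /** Stand-in for ClassY: hashCode() depends on a field restored late. */
    static class Key extends KeyBase {
        private static final long serialVersionUID = 1L;
        private long id = 7;

        Key(Holder holder) { this.holder = holder; }

        @Override public int hashCode() { return (int) id; }

        @Override public boolean equals(Object o) {
            return (o instanceof Key) && ((Key) o).id == id;
        }
    }

    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    /** Serializes a key that sits inside a circular graph and looks it up afterwards. */
    static boolean lookupWorksAfterRoundTrip() {
        try {
            Holder holder = new Holder();
            Key key = new Key(holder);
            holder.hash.put(key, "NON-NULL");
            Key copy = (Key) roundTrip(key);
            return "NON-NULL".equals(copy.holder.hash.get(copy));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("lookup works: " + lookupWorksAfterRoundTrip());
    }
}
```

This approach only repairs maps whose owning class registers the callback, and the map remains inconsistent for any code that runs during deserialization itself, so it is a mitigation for the client program rather than a fix for HashMap.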
======================================================================

###@###.### wrote the following simpler, and perhaps more compelling, program demonstrating the issue:

import java.util.*;
import java.io.*;

/** A simple class that delegates some operations to a set. */
class A implements Serializable {
    private Set set = new HashSet();
    A() {}

    public void add(Object o) {
        set.add(o);
    }
    public boolean test(Object o) {
        return set.contains(o);
    }
    public void check() {
        for (Object o : set)
            if (!set.contains(o))
                throw new Error();
    }
}

/** A slightly more complicated version.  This one assigns a nonce to
 *  each instance; the nonce defines an equivalence relation among Bs.
 */
class B extends A implements Serializable {
    private int nonce;
    B(int nonce) {
        this.nonce = nonce;
    }
    public int hashCode() {
        return nonce;
    }
    public boolean equals(Object other) {
        return (other instanceof B) && ((B)other).nonce == nonce;
    }
}

class Main {
    static public Object deepCopy(Object oldObj) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(bos);
            oos.writeObject(oldObj);
            oos.flush();
            ByteArrayInputStream bin = new ByteArrayInputStream(bos.toByteArray());
            ObjectInputStream ois = new ObjectInputStream(bin);
            return ois.readObject();
        } catch(Exception e) {
            throw new IllegalArgumentException(e);
        }
    }

    static B makeB() {
        B b = new B(12);
        b.add(b);
        return b;
    }

    public static void main(String[] args) {
        B b1 = makeB();
        b1.check();

        B b2 = (B)deepCopy(b1);
        b2.check();
    }
}

###@###.### 2004-02-08

Comments
EVALUATION This is a fairly complex issue. While it is clear there is a problem, it is not clear where the problem lies. Possibilities include (1) HashMap (as well as HashSet and Hashtable), (2) the client program, and (3) the serialization system. If the answer is (1) or (2), a set of guidelines should be formulated for writing readObject (and readResolve) methods that avoid this difficulty. Needless to say, this issue merits further study. ###@###.### 2004-02-08