JDK-6809470 : RFE: Change System.identityHashCode() implementation to return a unique value for distinct object
  • Type: Enhancement
  • Component: core-libs
  • Sub-Component: java.util
  • Affected Version: 6u14
  • Priority: P4
  • Status: Closed
  • Resolution: Not an Issue
  • OS: generic
  • CPU: generic
  • Submitted: 2009-02-24
  • Updated: 2011-07-25
  • Resolved: 2011-07-25
Description
J2SE Version (please include all output from java -version flag):
1.5.0_15, 1.6.0u14


Does this problem occur on J2SE 1.3, 1.4.x or 1.5?  Yes / No (pick one)
Yes

Bug Description:

System.identityHashCode() returns the same value for different objects

Steps to Reproduce (be specific):

Create about 100k objects of the same type and same content.
Java 1.5.0_15 for 100k objects: 174 collisions
Java 1.5.0_15 for 10k objects: 2 collisions
Java 1.5.0_15 for 1M objects: 14558 collisions
Java 1.6.0_12 for 100k objects: 173 collisions
Java 1.6.0_12 for 10k objects: 2 collisions
Java 1.6.0_12 for 1M objects: 14555 collisions

Problem: The function name "System._identity_HashCode()" implies
          that the returned value is unique (per JVM), but it is not.

RFE: Change the System.identityHashCode() implementation to return a unique
     value for each distinct object.


Code to reproduce:

import java.util.HashSet;
import java.util.Set;

public class CheckSystemIdentity {
    public static void main(String[] args) {
        // Identity hash codes seen so far.
        Set<Integer> hashes = new HashSet<Integer>(1024);

        int colls = 0;
        for (int n = 0; n < 100000; n++) {
            // new Integer(...) rather than Integer.valueOf(...): valueOf may return a
            // cached instance, and the test needs a distinct object on every iteration.
            Integer obj = new Integer(88);
            int ihash = System.identityHashCode(obj);
            // Set.add returns false when the value is already present.
            if (!hashes.add(ihash)) {
                System.err.println("System.identityHashCode() collision!");
                colls++;
            }
        }

        System.out.println("created 100000 different objects - "
            + colls + " times with the same value for System.identityHashCode()");
    }
}
more information from the submitter:

Some additional information:

Our current problem is (was ;-) ):

We need to create a large number of objects. These objects are of the same type (class) and have the same "content" (null/0 values).
The API above does not know about the current implementation of the objects, but it has to sort sortable objects and simply maintain non-sortable ones.
We had the problem that when the user created 9999 objects, the UI reported that 999x objects (x = any digit between 2 and 9) were in the set, while the collection behind it held 9999 objects. Yeah - it's confusing.
We used System.identityHashCode() to distinguish individual objects. We did not expect that System.identityHashCode() could return the same hash code for different objects.
System.identityHashCode() looked great because we did not have any other unique key for those objects.
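
For readers in the same situation, one possible way to tell such objects apart without relying on hash codes is to key on reference identity with java.util.IdentityHashMap. The sketch below is not part of the original report; the class name ObjectIds and its methods are hypothetical and only illustrate the idea of handing out a counter-based id per object.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical helper: assigns a distinct id to each object, compared by reference (==). */
class ObjectIds {
    // IdentityHashMap compares keys with ==, so equal-content objects stay separate entries.
    private final Map<Object, Long> ids = new IdentityHashMap<Object, Long>();
    private long next = 0;

    /** Returns the id already assigned to obj, or assigns the next free one. */
    synchronized long idOf(Object obj) {
        Long id = ids.get(obj);
        if (id == null) {
            id = next++;
            ids.put(obj, id);
        }
        return id;
    }

    /** Number of distinct objects registered so far. */
    synchronized int size() {
        return ids.size();
    }

    public static void main(String[] args) {
        ObjectIds registry = new ObjectIds();
        for (int n = 0; n < 9999; n++) {
            // new String(...) creates a distinct object each time, all with equal content.
            registry.idOf(new String("same"));
        }
        System.out.println("distinct objects registered: " + registry.size());
    }
}
```

Note that this registry holds strong references to every registered object, so they cannot be garbage collected while the registry is reachable. Whether that trade-off is acceptable depends on the application.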

If you cannot fix it, then please state clearly in the Object.hashCode() and System.identityHashCode() javadoc comments that the probability of these functions returning the same hash code for different objects is about 2 per 1000.

It would be better if the default Object.hashCode() implementation returned a unique value per JVM, or at least unique per class. This would limit the probability of "hash code collisions" to 1:4,294,967,296. If the default Object.toString() returned a truly per-JVM unique value, that would be great, since one could then calculate a "long" hash if necessary.
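
As a rough check on the collision figures discussed here, the following sketch estimates how many collisions to expect if hash codes were drawn uniformly at random from m possible values (the birthday problem). The uniform-random model is an assumption for illustration, not a description of how any particular JVM generates identity hash codes, and the class and method names are hypothetical.

```java
/** Hypothetical helper: birthday-problem estimate under a uniform-random hash assumption. */
class CollisionEstimate {
    /**
     * Expected number of objects (out of n) whose hash repeats one seen earlier,
     * for m equally likely hash values: n - m * (1 - (1 - 1/m)^n).
     * For n much smaller than m this is close to n^2 / (2m).
     */
    static double expectedCollisions(long n, double m) {
        // 1 - (1 - 1/m)^n computed as -expm1(n * log1p(-1/m)) for numerical stability.
        double distinct = -m * Math.expm1(n * Math.log1p(-1.0 / m));
        return n - distinct;
    }

    public static void main(String[] args) {
        double m = 4294967296.0; // 2^32, the full range of an int
        long[] counts = {10000L, 100000L, 1000000L};
        for (long n : counts) {
            System.out.println(n + " objects: expected collisions "
                + expectedCollisions(n, m));
        }
    }
}
```

Under this model, even a hash spread evenly over all 2^32 int values would be expected to produce on the order of a hundred collisions among one million objects, so a 32-bit value can only be collision-free if it is assigned by something like a counter rather than by hashing.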

Another benefit would be that hash tables could be used more efficiently.

Comments
EVALUATION Identity hash code values are based upon the heap pointer address. In the provided example you are generating a lot of garbage objects (new Integer(88)). These garbage objects are readily reclaimed and their address space is reused. The collisions result from address space reuse. If the original object remains live (not GCed), then you will not encounter this problem.
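
A reader who wants to check the evaluation on their own JVM could use a variant of the test that keeps every object reachable, as sketched below. This is an illustrative adaptation, not code from the report: the class name CheckLiveIdentity and the method countCollisions are hypothetical. The outcome depends on the JVM version and on how it generates identity hash codes, so no result is asserted here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical variant of the reproduction test that keeps all objects live. */
class CheckLiveIdentity {
    /** Creates count objects, retains them all, and counts repeated identity hash codes. */
    static int countCollisions(int count) {
        // Holding a reference to every object prevents it from being garbage collected
        // for the duration of the loop.
        List<Object> live = new ArrayList<Object>(count);
        Set<Integer> hashes = new HashSet<Integer>();
        int colls = 0;
        for (int n = 0; n < count; n++) {
            Object obj = new Object();
            live.add(obj);
            // Set.add returns false when this hash code was already seen.
            if (!hashes.add(System.identityHashCode(obj))) {
                colls++;
            }
        }
        return colls;
    }

    public static void main(String[] args) {
        int count = 100000;
        System.out.println("created " + count + " live objects - "
            + countCollisions(count) + " identity hash code collisions");
    }
}
```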
25-07-2011