FULL PRODUCT VERSION :
java version "1.6.0"
Java(TM) SE Runtime Environment (build 1.6.0-b105)
Java HotSpot(TM) Client VM (build 1.6.0-b105, mixed mode, sharing)
ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows XP [Version 5.1.2600]
A DESCRIPTION OF THE PROBLEM :
Please look at the implementation of Arrays.hashCode(long a[]) method. The basic loop is the following:
for (long element : a) {
int elementHash = (int)(element ^ (element >>> 32));
result = 31 * result + elementHash;
}
It's obvious that if low and high words in the element are equal, then elementHash will be zero. There is, at least, two very popular integers producing this result: 0 and -1. For both values, the expression "(int)(element^(element>>>32))" returns 0.
This problem really occurred in one of my applications. By default, long[] arrays are filled by zero, and -1 is used sometimes as usual or signal value. If I create new long array[], calculate hash code, then assign -1 to some elements and calculate hash code again, I see the same result. I think it is a bug: the probability of same hash codes for different arrays is too high and can lead to serious problems in algorithms that often assign -1 to elements of long[] arrays.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Please compile and run the attached test.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Different hash codes for the tested arrays.
ACTUAL -
Hash code is not changed while any long[] array modifications where 0 element is replaced with -1 or vice versa.
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
import java.util.Arrays;
public class Test {
public static void main(String[] args) {
long[] a = new long[10];
for (int k = 0; k < a.length; k++)
System.out.print(a[k] + " ");
System.out.println("hashCode = " + Arrays.hashCode(a));
a[0] = -1;
for (int k = 0; k < a.length; k++)
System.out.print(a[k] + " ");
System.out.println("hashCode = " + Arrays.hashCode(a));
a[5] = -1;
for (int k = 0; k < a.length; k++)
System.out.print(a[k] + " ");
System.out.println("hashCode = " + Arrays.hashCode(a));
}
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
You could use crc32 native algorithm for hash code calculation for any arrays. Measuring shows that your java.util.zip.CRC32.update(byte[] b) method has almost the same performance as Arrays.hashCode. Now I've created own methods for calculation array hash codes, that convert all non-byte primitive types to byte arrays and then call CRC32.update. I think you could implement more efficient solution if you offer native methods for arrays of all primitive types.