JDK-6855215 : Calculation error (NaN) after about 1500 calculations
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 6u14
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: windows_xp
  • CPU: x86
  • Submitted: 2009-06-26
  • Updated: 2011-03-08
  • Resolved: 2011-03-08
  • Fix Version (JDK 6): 6u18
  • Fix Version (JDK 7): 7
  • Fix Version (Other): hs16
Description
FULL PRODUCT VERSION :
java version "1.6.0_14"
Java(TM) SE Runtime Environment (build 1.6.0_14-b08)
Java HotSpot(TM) Client VM (build 14.0-b16, mixed mode, sharing)

FULL OS VERSION :
Microsoft Windows XP [Version 5.1.2600]

EXTRA RELEVANT SYSTEM CONFIGURATION :
processor: AMD Athlon XP 2500+

A DESCRIPTION OF THE PROBLEM :
After about 1500 passes, the calculation in the attached code fails: m and b variables become NaN. Once this happens, all subsequent calculations using instances of the class also fail (a HOTSPOT bug?).

There are no exceptions thrown, but "m" and "b" variables are wrongly
calculated and assigned NaN(s).
Once this happens, all subsequent calculations of  "m" and "b" variables
also evaluate to NaN, even with different CalcError instances and different
calcMapping(...) parameters.
This happens after a certain number of calculations (about 1500 on our
computer).
Please note that this bug occurs EVERY TIME we run the attached
CalcError test.
On JDK 1.4.2_13 (the other one we also use to test our applications) there
is no such error and everything works well.

I also tested the program on another (Intel based) computer with JRE
1.6.0_13 and 1.6.0_14 and everything worked well.

So this problem only occurs on our AMD Athlon XP 2500+ system with 1.6.0_14,
but it happens consistently every time we run the test. We've been using
this machine for a long time and haven't experienced any other problems with
it.

THE PROBLEM WAS REPRODUCIBLE WITH -Xint FLAG: No

THE PROBLEM WAS REPRODUCIBLE WITH -server FLAG: No

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Compile and then run the attached code using default VM flags in JRE, for example:
java CalcError

EXPECTED VERSUS ACTUAL BEHAVIOR :
Expected: consistent calculation results.
Actual: After about 1500 calculations, all subsequent calculations result in a NaN.
REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
class CalcError {
	private double m;
	private double b;

	public static double log10(double x) {
		return Math.log(x) / Math.log(10);
	}

	void calcMapping(double xmin, double xmax, double ymin, double ymax) {
		if ((ymax != ymin) && (xmax != xmin) && (xmax > 0) && (xmin > 0)) {
			m = (ymax - ymin) / (log10(xmax) - log10(xmin));
			b = (log10(xmin) * ymax - log10(xmax) * ymin)
					/ (log10(xmin) - log10(xmax));
		} else {
			m = 1;
			b = 0;
		}
		System.out.println("m=" + m + ", b=" + b);
	}

	public static void main(String[] args) {
		final int LOOP = 1600;
		CalcError c = new CalcError();
		for (int i = 0; i < LOOP; i++) {
		    System.out.print("[" + i + "]: ");
		    c.calcMapping(91, 121, 177, 34);
		}
	}
}
---------- END SOURCE ----------
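
The test above has to be checked by reading 1600 lines of output. A self-checking variant can report the first failing pass directly. The following is a minimal sketch, not part of the original submission: the class name CalcErrorCheck and the method firstNaNPass are hypothetical, while the arithmetic and the calcMapping(91, 121, 177, 34) arguments are copied from CalcError.

```java
// Hypothetical self-checking variant of CalcError (names are assumptions).
class CalcErrorCheck {
    private double m;
    private double b;

    static double log10(double x) {
        return Math.log(x) / Math.log(10);
    }

    // Same arithmetic as CalcError.calcMapping, without the per-pass printing.
    void calcMapping(double xmin, double xmax, double ymin, double ymax) {
        if ((ymax != ymin) && (xmax != xmin) && (xmax > 0) && (xmin > 0)) {
            m = (ymax - ymin) / (log10(xmax) - log10(xmin));
            b = (log10(xmin) * ymax - log10(xmax) * ymin)
                    / (log10(xmin) - log10(xmax));
        } else {
            m = 1;
            b = 0;
        }
    }

    // Returns the index of the first pass on which m or b is NaN, or -1 if none.
    static int firstNaNPass(int passes) {
        CalcErrorCheck c = new CalcErrorCheck();
        for (int i = 0; i < passes; i++) {
            c.calcMapping(91, 121, 177, 34);
            if (Double.isNaN(c.m) || Double.isNaN(c.b)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int first = firstNaNPass(1600);
        System.out.println(first < 0
                ? "no NaN in 1600 passes"
                : "first NaN at pass " + first);
    }
}
```

On a JVM without the defect, firstNaNPass should return -1. On the affected configuration described above, it would be expected to return a value near 1500, although that has not been verified with this variant.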

CUSTOMER SUBMITTED WORKAROUND :
Use -Xint or -server flag.

Release Regression From : 6u13
The above release value was the last known release where this 
bug was not reproducible. Since then there has been a regression.

Comments
EVALUATION http://hg.openjdk.java.net/hsx/hsx16/master/rev/d19a5a05d449
04-10-2009

EVALUATION http://hg.openjdk.java.net/hsx/hsx16/baseline/rev/d19a5a05d449
30-09-2009

EVALUATION http://hg.openjdk.java.net/jdk7/hotspot-comp/hotspot/rev/ff1a29907b6c
01-09-2009

EVALUATION I can reproduce this all the way back to 1.6.0 when running with -Xbatch. Without -Xbatch you may never run the compiled version of the code. This appears to be a day-one bug in the handling of log and log10 when using the x86 FPU instead of SSE. The code emitted for log/log10 requires an extra FPU stack slot as a temporary, and that slot wasn't being accounted for. This led to FPU stack overflows during the calculation, which caused the NaNs. The fix is to record temporaries for log and log10 in the same way it's done for tan/sin/cos.
29-06-2009
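
The evaluation above names both log and log10, while the submitted test only calls Math.log. The following is a rough sketch of a check that exercises both Math.log and Math.log10 in a loop and counts NaN results. The class name LogIntrinsicCheck, the method countNaNs, the input range, and the pass count of 20000 are all assumptions made for illustration. Whether this particular loop triggers the defect on an affected build has not been verified.

```java
// Hypothetical check covering both Math.log and Math.log10 (names are assumptions).
class LogIntrinsicCheck {

    // Counts NaN results produced by expressions built on Math.log and Math.log10.
    static int countNaNs(int passes) {
        int nans = 0;
        for (int i = 1; i <= passes; i++) {
            // Inputs stay positive, so neither call should ever return NaN.
            double x = 90.0 + (i % 32);
            double viaLog = Math.log(x) / Math.log(10)
                    - Math.log(x + 1) / Math.log(10);
            double viaLog10 = Math.log10(x) - Math.log10(x + 1);
            if (Double.isNaN(viaLog)) {
                nans++;
            }
            if (Double.isNaN(viaLog10)) {
                nans++;
            }
        }
        return nans;
    }

    public static void main(String[] args) {
        // 20000 passes is an arbitrary choice, intended to be enough
        // for the method to be compiled.
        System.out.println("NaN results: " + countNaNs(20000));
    }
}
```

As the evaluation notes, running with -Xbatch (for example, java -Xbatch LogIntrinsicCheck on the client VM) makes it more likely that the compiled version of the code is actually executed. A correct JVM should report zero NaN results.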