JDK-4857011 : Performance regression in trigonometric functions (very slow StrictMath)
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.lang
  • Affected Version: 1.4.0,1.4.2
  • Priority: P3
  • Status: Closed
  • Resolution: Not an Issue
  • OS: windows_2000
  • CPU: x86
  • Submitted: 2003-05-01
  • Updated: 2003-05-02
  • Resolved: 2003-05-02
Description

Name: rmT116609			Date: 05/01/2003


FULL PRODUCT VERSION :
JDK 1.4 and later

java version "1.4.2-beta"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2-beta-b19)
Java HotSpot(TM) Client VM (build 1.4.2-beta-b19, mixed mode)


FULL OS VERSION :
Windows 2000

A DESCRIPTION OF THE PROBLEM :
The Math.cos and Math.sin procedures are several times SLOWER in JDK 1.4 than JDK 1.3. I'm sure this has to do with the adoption of the StrictMath library. Performance should not get drastically worse when you upgrade your JDK from 1.3 to 1.4.

Similar bugs have been reported, but they are listed as CLOSED. What does this mean? Is it going to be fixed in a future JDK release? Is it not going to be fixed?

Is the old (fast) Math library available somewhere as a JAR file? Can it be used in JDK 1.4 applications?

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
System.out.println("Benchmark test..." + new java.util.Date());
double d = 0.5;
for (int i=0; i < 10000000; i++)
{
  d = (d * i) / 3.1415926;
  d = Math.sin(d);
  d = Math.cos(d);
  d = Math.sqrt(d) * 3.1415926;
}
System.out.println("Benchmark done. Time is " + new java.util.Date());
System.out.println("D=" + d);


EXPECTED VERSUS ACTUAL BEHAVIOR :
In JDK 1.3, 1.3.1_08 this loop takes 3 seconds

In JDK 1.4, 1.4.2-beta this loop takes 21 seconds

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
public class Benchmark {

public static void main(String[] args)
{
System.out.println("Benchmark test..." + new java.util.Date());
double d = 0.5;
for (int i=0; i < 10000000; i++)
{
  d = (d * i) / 3.1415926;
  d = Math.sin(d);
  d = Math.cos(d);
  d = Math.sqrt(d) * 3.1415926;
}
System.out.println("Benchmark done. Time is " + new java.util.Date());
System.out.println("D=" + d);

}

}

---------- END SOURCE ----------

Release Regression From : 1.3.1_08
The above release value was the last known release where this 
bug was known to work. Since then there has been a regression.

(Review ID: 185149) 
======================================================================

Name: rmT116609			Date: 05/01/2003


A DESCRIPTION OF THE REQUEST :
My numeric calculation intensive program runs 8 times slower with 1.4 as opposed to 1.3.1. I am doing Math.sin(), Math.cos(), and Math.sqrt(). I realize that you made StrictMath the default math library--is there a way to run the faster 1.3.1 math routines?

JUSTIFICATION :
The scientific community wants fast math functions with JDK 1.4.

EXPECTED VERSUS ACTUAL BEHAVIOR :
Math.sin(), Math.cos(), and Math.sqrt() to be as fast under JDK 1.4 as JDK 1.3
Math.sin(), Math.cos(), and Math.sqrt() are much slower under JDK 1.4 than under JDK 1.3

---------- BEGIN SOURCE ----------
public class Benchmark
{
  public static void main(String[] args)
  {
    System.out.println("Benchmark test..." + new java.util.Date());
    double d = 0.5;
    for (int i=0; i < 20000000; i++)
    {
      d = (d * i) / 3.1415926;
      d = Math.sin(d);
      d = Math.cos(d);
      d = Math.sqrt(d) * 3.1415926;
    }
    System.out.println("Benchmark done. Time is " + new java.util.Date());
    System.out.println("D=" + d);
  }
}

---------- END SOURCE ----------
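
Both submitted benchmarks time the loop by printing java.util.Date before and after, and neither checks the computed value. A minimal sketch of a slightly tighter harness follows. The class and method names (TimedBenchmark, runMath, runStrictMath) are hypothetical and not part of either report. It assumes System.currentTimeMillis() is precise enough for runs lasting several seconds, and it calls StrictMath directly in a second loop so the two libraries can be compared on the same JVM.

```java
// Hypothetical harness; the names below are illustrative, not from the report.
class TimedBenchmark {

    // Same loop body as the submitted Benchmark, parameterized by iteration count.
    static double runMath(int iterations) {
        double d = 0.5;
        for (int i = 0; i < iterations; i++) {
            d = (d * i) / 3.1415926;
            d = Math.sin(d);
            d = Math.cos(d);
            d = Math.sqrt(d) * 3.1415926;
        }
        return d;
    }

    // Identical loop, but calling StrictMath explicitly for comparison.
    static double runStrictMath(int iterations) {
        double d = 0.5;
        for (int i = 0; i < iterations; i++) {
            d = (d * i) / 3.1415926;
            d = StrictMath.sin(d);
            d = StrictMath.cos(d);
            d = StrictMath.sqrt(d) * 3.1415926;
        }
        return d;
    }

    public static void main(String[] args) {
        int n = 10000000;

        // Short warm-up pass so the JIT has a chance to compile both loops.
        runMath(n / 10);
        runStrictMath(n / 10);

        long t0 = System.currentTimeMillis();
        double a = runMath(n);
        long t1 = System.currentTimeMillis();
        double b = runStrictMath(n);
        long t2 = System.currentTimeMillis();

        System.out.println("Math:       " + (t1 - t0) + " ms, D=" + a);
        System.out.println("StrictMath: " + (t2 - t1) + " ms, D=" + b);
    }
}
```

Because cos(sin(x)) always lies between cos(1) and 1, the value of D after any full iteration should stay between roughly 2.3 and 3.1415926; a result outside that range would suggest the math routines returned something unreasonable.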

(Review ID: 184982)
======================================================================

Comments
EVALUATION

We are aware of the issue raised by this bug; it is the same issue raised in 4807358, the "almabench" bug. Both this bug and 4807358 are symptoms of the same problem: the sin/cos routines in the client compiler under 1.3.1 did *not* comply with the Math.{sin/cos} specification. As explained below, the way they did not comply could be the source of a large amount of error. As a consequence of fixing this compliance problem (bug 4345903), the x86 sin/cos routines in 1.4 and later are slower than in 1.3.1 client.

The x87 FPU has fsin and fcos instructions to accelerate the computation of sine and cosine. However, these instructions have a number of limitations. First, they only return sensible results over a limited range of values, +/- 2^63. Java's sin/cos functions are defined over the full double range, roughly +/- 2^1023. Second, even within the +/- 2^63 range, nearly all fsin/fcos implementations perform faulty *argument reduction*; consequently, the results can be very wrong outside of a narrow range of +/- pi/4.

Many math library routines work by mapping any argument into an "equivalent" argument in a fixed, narrow range; this process is called argument reduction. Since sine and cosine are periodic, the basic idea behind the mapping is clear: subtract out multiples of 2*pi. The actual details are a little more sophisticated; in the case of sin/cos, arguments are mapped into [-pi/4, pi/4] using sign symmetries between sin/cos (for a good current reference on such matters see Jean-Michel Muller's book "Elementary Functions"). The number pi is transcendental; it has no finite binary representation. Around 1982, several years after the original x87 design, techniques for doing argument reduction as if by an exact value of pi were developed (see http://www.validlab.com/arg.pdf for a discussion of the techniques).
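
The mapping into [-pi/4, pi/4] described above can be sketched in a few lines. This is an illustration only, not the JDK or fdlibm code: the class ReductionSketch and its methods are hypothetical, the reduction deliberately uses the 53-bit double value Math.PI (so it is itself a "pi with a fixed number of bits" scheme), and the kernels are plain truncated Taylor series that are only adequate on the reduced range.

```java
// Illustrative sketch; not the JDK or fdlibm implementation.
class ReductionSketch {

    // Truncated Taylor series for sin, adequate only for |r| <= pi/4.
    static double sinKernel(double r) {
        double r2 = r * r;
        return r * (1 - r2 / 6 * (1 - r2 / 20 * (1 - r2 / 42
                 * (1 - r2 / 72 * (1 - r2 / 110)))));
    }

    // Truncated Taylor series for cos, adequate only for |r| <= pi/4.
    static double cosKernel(double r) {
        double r2 = r * r;
        return 1 - r2 / 2 * (1 - r2 / 12 * (1 - r2 / 30
                 * (1 - r2 / 56 * (1 - r2 / 90))));
    }

    // Naive sine: reduce by multiples of pi/2, then pick a kernel by quadrant.
    // The reduction uses Math.PI, a 53-bit approximation of pi, so the reduced
    // argument picks up an error that grows with the size of x.
    static double naiveSin(double x) {
        double halfPi = Math.PI / 2;
        long k = Math.round(x / halfPi);   // nearest multiple of pi/2
        double r = x - k * halfPi;         // roughly in [-pi/4, pi/4]
        switch ((int) (k & 3)) {           // quadrant, also correct for negative k
            case 0:  return sinKernel(r);
            case 1:  return cosKernel(r);
            case 2:  return -sinKernel(r);
            default: return -cosKernel(r);
        }
    }

    public static void main(String[] args) {
        double[] inputs = { 1.0, 100.0, 1e6, 1e10, 1e15 };
        for (int i = 0; i < inputs.length; i++) {
            double x = inputs[i];
            System.out.println("x=" + x
                    + " naive=" + naiveSin(x)
                    + " StrictMath=" + StrictMath.sin(x));
        }
    }
}
```

For small arguments the two columns should agree to many digits; as x grows, the naive column can be expected to drift away from StrictMath.sin, which is the effect the evaluation attributes to limited-precision argument reduction.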
The fsin/fcos instructions use an older technique: pretend pi has a fixed number of bits; most x87 implementations use a 66-bit approximation of pi for argument reduction. If the sine/cosine functions are viewed as a spring, using a limited-precision approximation to pi amounts to slightly compressing the spring. Even for moderately large arguments, the difference between the compressed and uncompressed versions of sine/cosine can be very, very large, affecting even the sign of the result.

People are often unnerved when they discover that 1.0/10.0 cannot be exactly represented as a binary floating-point number. The exact value of the floating-point result of dividing 1 by 10 is 0.1000000000000000055511151231..., which is accurate to 17 decimal places. In floating-point parlance, this is accurate to 1/2 ulp (unit in the last place), which is the most one can ask from a single floating-point operation.

The value of sine for the floating-point number Math.PI is around

1.2246467991473532E-16

while the computed value for the algorithm used in Java 1.3.1 is

1.2246063538223773E-16
      ^

In other words, the returned result is only accurate to about 5 digits instead of the full 15-17 digit precision of double. Instead of a 1/2 ulp or 1 ulp error, the error is about 1.64e11 ulps, over *one hundred billion* ulps.

Does this magnitude of error matter to your application? It might, or it might not. However, in case it does matter, the Java specification requires that an accurate answer be returned. If you don't care what is returned, why do you care how fast it is returned? Rhetorically, sin could always return 0 and cos could always return 1; they would be very fast but not too useful as sine and cosine functions. (Neither the almabench code nor the code submitted in this bug actually examines the results to verify they are sensible.)

On a related note, the sine and cosine of large values, even as large as Double.MAX_VALUE, are perfectly well-defined mathematically.
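
The two figures quoted for the sine of Math.PI can be checked with exact integer arithmetic. Since Math.PI lies just below pi, sin(Math.PI) is very nearly pi - Math.PI, and a reduction that uses a truncated pi effectively returns (truncated pi) - Math.PI instead. The sketch below relies on the standard hexadecimal expansion pi = 3.243F6A88 85A308D3 13198A2E 03707344... and assumes that the "66-bit approximation" means 2 integer bits plus 64 fractional bits; the class PiBitsSketch and its method are hypothetical names.

```java
import java.math.BigInteger;

// Illustrative arithmetic only; names are hypothetical.
class PiBitsSketch {

    // First 128 fractional bits of pi (hex digits after "3.").
    static final BigInteger PI_FRACTION_128 =
            new BigInteger("243F6A8885A308D313198A2E03707344", 16);

    // Math.PI is exactly 3.243F6A8885A30 in hex: 52 fractional bits,
    // shifted here so that it is expressed in units of 2^-128.
    static final BigInteger MATH_PI_FRACTION_128 =
            new BigInteger("243F6A8885A30", 16).shiftLeft(128 - 52);

    // Returns (pi truncated to fractionBits fractional bits) - Math.PI as a double.
    static double truncatedPiMinusMathPi(int fractionBits) {
        int dropped = 128 - fractionBits;
        BigInteger truncated = PI_FRACTION_128.shiftRight(dropped).shiftLeft(dropped);
        BigInteger diff = truncated.subtract(MATH_PI_FRACTION_128);
        // diff is in units of 2^-128; scale back after rounding to double.
        return Math.scalb(diff.doubleValue(), -128);
    }

    public static void main(String[] args) {
        double with66BitPi = truncatedPiMinusMathPi(64);   // 2 + 64 = 66 bits of pi
        double nearlyExact = truncatedPiMinusMathPi(128);  // far more bits than a double holds
        double errorInUlps = (nearlyExact - with66BitPi) / Math.ulp(nearlyExact);

        System.out.println("66-bit pi reduction: " + with66BitPi);
        System.out.println("128-bit pi reduction: " + nearlyExact);
        System.out.println("StrictMath.sin(Math.PI): " + StrictMath.sin(Math.PI));
        System.out.println("error in ulps: " + errorInUlps);
    }
}
```

Under these assumptions the 64-fractional-bit case works out to exactly 2259 * 2^-64, which agrees with the 1.3.1 figure quoted above, and the gap between the two values divided by Math.ulp gives the quoted error of about 1.64e11 ulps.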
There is no mathematical justification for returning arbitrary sin/cos results for large values.

In 1.4, an effort was made to take advantage of the fsin/fcos hardware as much as we could and still meet the spec. However, "large" arguments outside of around pi/4 require a separate argument reduction step, which can be the bulk of the cost.

Closing as not a bug.

###@###.### 2003-05-01
01-05-2003