Bug ID: JDK-6506405 Math.abs(float) is slow

Type: Enhancement
Component: core-libs
Sub-Component: java.lang
Affected Version: 6

Priority: P5
Status: Resolved
Resolution: Fixed
OS: windows_2000
CPU: x86

Submitted: 2006-12-19
Updated: 2021-07-19
Resolved: 2021-07-14

JDK 18
18 b06 Fixed

A DESCRIPTION OF THE REQUEST :
The routine Math.abs(float) is unnecessarily slow. This is related to my prior bug 5108893, which reported that Math.abs was slow. That bug was fixed, but apparently only for doubles, as Math.abs(float) is still much slower than it should be. At this point it is actually faster to convert a float to a double, call Math.abs(double), and convert the result back to a float than it is to call Math.abs(float) directly.

The fix should be simple: apply the same optimization that was used to fix bug 5108893 to similarly intrinsify Math.abs(float).
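
For illustration, one way to compute a float absolute value without a branch is to clear the IEEE 754 sign bit. The sketch below is an assumption about what a branch-free version could look like in plain Java. It is not the HotSpot intrinsic itself, and the names AbsBitsSketch and absViaBits are hypothetical.

```java
// Hypothetical helper class; neither the class nor the method name comes from the JDK.
final class AbsBitsSketch {

    // Clears bit 31 (the IEEE 754 sign bit) and leaves exponent and mantissa untouched.
    static float absViaBits(float a) {
        return Float.intBitsToFloat(Float.floatToRawIntBits(a) & 0x7fffffff);
    }

    public static void main(String[] args) {
        float[] samples = { -1.5f, 0.25f, -0.0f, Float.NEGATIVE_INFINITY };
        for (float s : samples) {
            // Print the bit-mask result next to the library result for comparison.
            System.out.println(s + " -> " + absViaBits(s) + " (Math.abs: " + Math.abs(s) + ")");
        }
    }
}
```

Because only the sign bit changes, negative zero maps to positive zero and a NaN input stays a NaN.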

JUSTIFICATION :
Math.abs is very commonly used, and there is no reason Math.abs(float) should be significantly slower than Math.abs(double). Speeding it up will help any code that uses Math.abs(float).

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The cost of calling Math.abs(float) should be about the same as performing an add, as it now is for Math.abs(double). Instead it is much more expensive, and calling

float a, b;

b = (float) Math.abs((double)a);

is actually faster than

b = Math.abs(a);

as shown in the timings below.
ACTUAL -
java version "1.6.0"
Java(TM) SE Runtime Environment (build 1.6.0-b105)
Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)

avg    1.5 ns   total 6.10E-1 s  for assign             (~ 2.6 cycles)
avg    3.8 ns   total 1.50E0 s   for add                (~ 6.4 cycles)
avg   15.9 ns   total 6.36E0 s   for Math.abs()         (~ 27.0 cycles)
avg    8.8 ns   total 3.50E0 s   for (float)Math.abs((double))  (~ 14.9 cycles)
avg    1.5 ns   total 5.94E-1 s  for assign             (~ 2.5 cycles)
avg    1.7 ns   total 6.71E-1 s  for add                (~ 2.9 cycles)
avg   16.0 ns   total 6.39E0 s   for Math.abs()         (~ 27.2 cycles)
avg    8.7 ns   total 3.50E0 s   for (float)Math.abs((double))  (~ 14.9 cycles)

---------- BEGIN SOURCE ----------

import java.text.DecimalFormat;
import java.util.Random;

/** Test that shows that Math.abs on floats is much slower than a floating-point add,
 * even though it should have about the same cost.
 *
 * @author  bjw  Bruce Walter, Cornell Program of Computer Graphics 2004
 */
public class AbsFloatTest {
  //target total number of repetitions of the operation
  public static final int opTargetRepetitions = 400000000;
  //size of arrays that are operated on
  public static final int arraySize = 10000;
  //number of times we need to process each array to reach total target count
  public static final int reps = opTargetRepetitions/arraySize;
  //pretty print the output numbers to make them easier to read
  public static final DecimalFormat decForm = new DecimalFormat("###0.0");
  public static final DecimalFormat sciForm = new DecimalFormat("0.00E0");
  //my processor is a 1.7GHz Xeon (actually it is a dual processor, but this test is single threaded)
  public static final double ghzProcSpeed = 1.7; //my processor is 1.7GHz
    
  public static void runTimingTest(TestOp op, float result[], float src[], boolean print) {
    long time = System.currentTimeMillis();
    for(int i=0;i<reps;i++) {
      op.performOp(result,src);
    }
    time = System.currentTimeMillis() - time;
    double denom = 1000000.0/(reps*src.length);
    if (print) {
      String ps = decForm.format(time*denom);
      while (ps.length()<6) ps = " "+ps;
      ps = "avg "+ps+" ns   total "+sciForm.format(time/1000.0)+" s";
      while (ps.length()<32) ps += " ";
      ps = ps+" for "+op.toString();
      while (ps.length()<50) ps += " ";
      System.out.println(ps+"\t(~ "+decForm.format(time*denom*ghzProcSpeed)+" cycles)");
    }
  }
    
  public static void main(String[] args) throws InterruptedException {
    float src[] = new float[arraySize];
    float result[] = new float[arraySize];
    Random ran = new Random(5232482349538L);
    //set the src array to be random values between -1 and 1 (but excluding zero)
    for(int i=0;i<src.length;i++) {
      do {
        src[i] = 2*ran.nextFloat() - 1.0f;
      } while (src[i] == 0);
    }
    TestOp tests[] = { new AssignOp(), new AddOp(), new AbsOp(), new AbsViaDoubleOp()};
    //warm up hotspot
    for(int i=0;i<tests.length;i++) {
      runTimingTest(tests[i],result,src,false);
    }
    //now run the real tests and print the timings
    for(int i=0;i<tests.length;i++) {
      runTimingTest(tests[i],result,src,true);
    }
    //do it again to show the timings are reasonably consistent
    for(int i=0;i<tests.length;i++) {
      runTimingTest(tests[i],result,src,true);
    }
  }
  
  public abstract static class TestOp {
    public abstract void performOp(float result[], float src[]);
  }
  
  public static class AssignOp extends TestOp {
    public String toString() { return "assign"; }
    public void performOp(float result[], float src[]) {
      for(int i=0;i<src.length;i++) {
        result[i] = src[i];
      }
    }
  }
  
  public static class AddOp extends TestOp {
    public String toString() { return "add"; }
    public void performOp(float result[], float src[]) {
      for(int i=0;i<src.length;i++) {
        result[i] = 0.143f+src[i];
      }
    }
  }
  
  public static class AbsOp extends TestOp {
    public String toString() { return "Math.abs()"; }
    public void performOp(float result[], float src[]) {
      for(int i=0;i<src.length;i++) {
        result[i] = Math.abs(src[i]);
      }
    }
  }
    
  public static class AbsViaDoubleOp extends TestOp {
    public String toString() { return "(float)Math.abs((double))"; }
    public void performOp(float result[], float src[]) {
      for(int i=0;i<src.length;i++) {
        result[i] = (float)Math.abs((double)src[i]);
      }
    }
  }
 
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Casting to double, calling Math.abs(double), and casting back to float is currently somewhat faster than calling Math.abs(float), but it makes for somewhat ugly code.

Changeset: c0d4efff
Author: Brian Burkhalter <bpb@openjdk.org>
Date: 2021-07-14 15:50:51 +0000
URL: https://git.openjdk.java.net/jdk/commit/c0d4efff3c7b853cd663726b668d49d01e0f8ee0
14-07-2021
Math.abs(float) is now marked for an intrinsic: @IntrinsicCandidate public static float abs(float a) {}
07-07-2021
The attached JMH class AbsBench gives the following results on JDK10-internal:

OEL 7
Benchmark                      Mode  Cnt   Score    Error  Units
AbsBench.absFloat              avgt   10  35.334 ±  0.051  ns/op
AbsBench.absFloatViaAbsDouble  avgt   10  28.582 ±  0.013  ns/op

Ubuntu 16.04 VM
Benchmark                      Mode  Cnt   Score    Error  Units
AbsBench.absFloat              avgt   10   3.264 ±  0.057  ns/op
AbsBench.absFloatViaAbsDouble  avgt   10   3.327 ±  0.032  ns/op

macOS
Benchmark                      Mode  Cnt   Score    Error  Units
AbsBench.absFloat              avgt   10   3.285 ±  0.048  ns/op
AbsBench.absFloatViaAbsDouble  avgt   10   3.378 ±  0.040  ns/op

Windows VM
Benchmark                      Mode  Cnt   Score    Error  Units
AbsBench.absFloat              avgt   10   3.258 ±  0.248  ns/op
AbsBench.absFloatViaAbsDouble  avgt   10   3.333 ±  0.052  ns/op

In three out of four cases the two rounding pathways show a minimal difference, with the more direct float-only pathway appearing slightly faster. This issue could likely be resolved as Cannot Reproduce or Not An Issue.
28-07-2017
I recommend verifying whether or not this problem still exists. Math.abs(double) is marked @HotSpotIntrinsicCandidate but Math.abs(float) is not. However, the non-intrinsic execution of the abs method is just a compare followed by either a return or a subtract and return, so it shouldn't be that slow in either case.
23-08-2016
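
The branch-based form described in the comment above (a compare, then either a return or a subtract and return) can be sketched as follows. This is written from that description rather than copied from the JDK sources, and the names AbsBranchSketch and absBranch are hypothetical.

```java
// Hypothetical helper class illustrating a compare-and-subtract absolute value.
final class AbsBranchSketch {

    // For values <= 0 return (0 - a); otherwise return a unchanged.
    // Subtracting from 0.0f also turns -0.0f into +0.0f, and NaN fails the
    // comparison, so it is returned as is.
    static float absBranch(float a) {
        return (a <= 0.0f) ? 0.0f - a : a;
    }

    public static void main(String[] args) {
        float[] samples = { -2.0f, 3.5f, -0.0f, Float.NaN };
        for (float s : samples) {
            System.out.println(s + " -> " + absBranch(s));
        }
    }
}
```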
EVALUATION A reasonable request; will investigate.
20-12-2006

Relates :	JDK-5108893 - Math.abs() is slow
Relates :	JDK-6296690 - Math.round() is extremely slow
Relates :	JDK-8270476 - Make floating-point test infrastructure more lambda and method reference friendly