JDK-8189230 : JDK method:java.lang.Integer.numberOfLeadingZeros(int) can be optimized
  • Type: Enhancement
  • Component: core-libs
  • Sub-Component: java.lang
  • Affected Version: 9
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2017-10-12
  • Updated: 2025-02-07
  • Resolved: 2018-03-15
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 11
11 b05Fixed
Related Reports
Duplicate :  
Description
A DESCRIPTION OF THE REQUEST :
the origin code is :

public static int numberOfLeadingZeros(int i) {
    // HD, Figure 5-6
    if (i == 0)
        return 32;
    int n = 1;
    if (i >>> 16 == 0) { n += 16; i <<= 16; }
    if (i >>> 24 == 0) { n +=  8; i <<=  8; }
    if (i >>> 28 == 0) { n +=  4; i <<=  4; }
    if (i >>> 30 == 0) { n +=  2; i <<=  2; }
    n -= i >>> 31;
    return n;
}
I think it can be optimized ,should added following condition:

if (i < 0)
        return 0;
the fully optimized code is :

public static int numberOfLeadingZeros(int i) {
    // HD, Figure 5-6
    if (i == 0)
        return 32;
    if (i < 0)
        return 0;
    int n = 1;
    if (i >>> 16 == 0) { n += 16; i <<= 16; }
    if (i >>> 24 == 0) { n +=  8; i <<=  8; }
    if (i >>> 28 == 0) { n +=  4; i <<=  4; }
    if (i >>> 30 == 0) { n +=  2; i <<=  2; }
    n -= i >>> 31;
    return n;
}

JUSTIFICATION :
In some condition, the intrinsic function may not called, the default java implementation will be run. then the original code will be slower than the optimized code.



Comments
URL: http://hg.openjdk.java.net/jdk/jdk/rev/3c0a12972165 User: bpb Date: 2018-03-15 15:17:21 +0000
15-03-2018

The measurements observed for the attached JMH benchmark indicate that a performance improvement of 20% to 33% may be obtained via this optimization for the case of non-intrinsic operation. For purposes of the benchmark, the {Integer,Long}.numberOfLeadingZeros() methods were not invoked directly but instead copied to the benchmark source which removes the effect of an extra method invocation for those cases and suppresses the effect of any intrinsic optimizations which might be present on the platform. Sample measurements for JDK 9.0.1 GA on macOS 10.9.5 and JDK 1.8.0_151 GA Linux (Xubuntu 16.04) are given here: --- macOS --- Benchmark Mode Cnt Score Error Units LeadingZerosBenchmark.testNegativeJDK thrpt 20 272.645 ± 1.006 ops/us LeadingZerosBenchmark.testNegativeLongJDK thrpt 20 251.185 ± 0.820 ops/us LeadingZerosBenchmark.testNegativeLongOpt2 thrpt 20 327.871 ± 0.804 ops/us LeadingZerosBenchmark.testNegativeOpt thrpt 20 363.008 ± 3.764 ops/us LeadingZerosBenchmark.testNegativeOpt2 thrpt 20 362.534 ± 3.253 ops/us LeadingZerosBenchmark.testPositiveOpt thrpt 20 241.165 ± 9.267 ops/us LeadingZerosBenchmark.testPositiveOpt2 thrpt 20 255.472 ± 7.938 ops/us LeadingZerosBenchmark.testZero thrpt 20 424.838 ± 1.804 ops/us LeadingZerosBenchmark.testZero2 thrpt 20 425.054 ± 1.278 ops/us --- Linux --- Benchmark Mode Cnt Score Error Units LeadingZerosBenchmark.testNegativeJDK thrpt 20 82.850 ± 0.049 ops/us LeadingZerosBenchmark.testNegativeLongJDK thrpt 20 76.328 ± 5.239 ops/us LeadingZerosBenchmark.testNegativeLongOpt2 thrpt 20 99.375 ± 0.079 ops/us LeadingZerosBenchmark.testNegativeOpt thrpt 20 99.022 ± 8.706 ops/us LeadingZerosBenchmark.testNegativeOpt2 thrpt 20 99.351 ± 8.948 ops/us LeadingZerosBenchmark.testPositiveOpt thrpt 20 78.660 ± 3.614 ops/us LeadingZerosBenchmark.testPositiveOpt2 thrpt 20 78.894 ± 2.080 ops/us LeadingZerosBenchmark.testZero thrpt 20 116.959 ± 0.071 ops/us LeadingZerosBenchmark.testZero2 thrpt 20 116.896 ± 0.130 ops/us
13-03-2018

As general comments for this situation, jmh should be used to validate microbench results. If a method has intrinsics on platforms of interest, there is reduced incentive to optimizing the non-intrinsic implementation.
27-10-2017

Additional Information from submitter: : I optimized the Test code to make it more convincible: public class TestForNumberOfLeadingZeros { public static void main(String[] args) { long startTimeOfOptMethod = System.currentTimeMillis(); for(int i=-1;i>Integer.MIN_VALUE;i--) { optNumberOfLeadingZeros(i); } System.out.println("optimized method took time in millis:" + (System.currentTimeMillis() - startTimeOfOptMethod)); long startTimeOfOriginMethod = System.currentTimeMillis(); for(int i=-1;i>Integer.MIN_VALUE;i--) { Integer.numberOfLeadingZeros(i); } System.out.println("origin method took time in millis:" + (System.currentTimeMillis() - startTimeOfOriginMethod)); } public static int optNumberOfLeadingZeros(int i) { // HD, Figure 5-6 if (i == 0) return 32; if (i < 0) return 0; int n = 1; if (i >>> 16 == 0) { n += 16; i <<= 16; } if (i >>> 24 == 0) { n += 8; i <<= 8; } if (i >>> 28 == 0) { n += 4; i <<= 4; } if (i >>> 30 == 0) { n += 2; i <<= 2; } n -= i >>> 31; return n; } Disable intrinsic method _numberOfLeadingZeros_i when run the test: java -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_numberOfLeadingZeros_i TestForNumberOfLeadingZeros here is the test result in my ownMacBook: optimized method took time in millis:567 origin method took time in millis:3392
23-10-2017

Better performance should be proved of course. When benchmarking the routine, it may make sense to combine the checks (i == 0) and (i < 0) to avoid one branch for the positive numbers: if (i <= 0) return i == 0 ? 32 : 0;
12-10-2017