JDK-8242108 : Performance regression after fix for JDK-8229496
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 11.0.6-oracle,13.0.2,14,15
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2020-03-22
  • Updated: 2020-12-08
  • Resolved: 2020-04-22
Release  Version        Status
JDK 11   11.0.8-oracle  Fixed
JDK 13   13.0.4         Fixed
JDK 15   15 b20         Fixed
Description
ADDITIONAL SYSTEM INFORMATION :
CPU: Intel 8700k
RAM: 64GB
Win10 x64 1909 / 18363.720

VM version: JDK 14, OpenJDK 64-Bit Server VM, 14+36-1461
VM version: JDK 13.0.2, OpenJDK 64-Bit Server VM, 13.0.2+8


A DESCRIPTION OF THE PROBLEM :
Running on OpenJDK 14 reduces NumberFormat.format throughput by roughly 5 to 15% compared with OpenJDK 13.0.2.

JMH Benchmark results:

# JMH version: 1.23
# VM version: JDK 14, OpenJDK 64-Bit Server VM, 14+36-1461
# VM invoker: C:\Program Files\Java\ojdk-14\bin\java.exe
# VM options: -Dfile.encoding=UTF-8 --module-path=E:\Profile\eclipse\jdk11\de.sph.benchmark3\target\classes;C:\Users\Stefa\.m2\repository\org\openjdk\jmh\jmh-core\1.23\jmh-core-1.23.jar -Djdk.module.main=de.sph.benchmark3
# Warmup: 5 iterations, 5 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: de.sph.benchmark3.DefFormatterBench.testDefNumberFormatter

Benchmark                                   (value)   Mode  Cnt      Score     Error   Units
DefFormatterBench.testDefNumberFormatter       1.23  thrpt    5   6020,498 ± 315,527  ops/ms
DefFormatterBench.testDefNumberFormatter       1.49  thrpt    5   5981,606 ± 185,040  ops/ms
DefFormatterBench.testDefNumberFormatter       1.80  thrpt    5   6894,794 ± 252,732  ops/ms
DefFormatterBench.testDefNumberFormatter        1.7  thrpt    5   6760,995 ± 251,076  ops/ms
DefFormatterBench.testDefNumberFormatter        0.0  thrpt    5  10398,507 ± 225,129  ops/ms
DefFormatterBench.testDefNumberFormatter      -1.49  thrpt    5   5572,113 ±  47,249  ops/ms
DefFormatterBench.testDefNumberFormatter      -1.50  thrpt    5   6223,956 ± 140,064  ops/ms
DefFormatterBench.testDefNumberFormatter  9999.9123  thrpt    5   3534,608 ±  40,504  ops/ms
DefFormatterBench.testDefNumberFormatter      1.494  thrpt    5   5314,769 ±  23,540  ops/ms
DefFormatterBench.testDefNumberFormatter      1.495  thrpt    5   5503,352 ±  29,010  ops/ms
DefFormatterBench.testDefNumberFormatter       1.03  thrpt    5   6094,022 ± 404,048  ops/ms
DefFormatterBench.testDefNumberFormatter     25.996  thrpt    5   4653,838 ±  72,346  ops/ms
DefFormatterBench.testDefNumberFormatter    -25.996  thrpt    5   4408,712 ±  33,991  ops/ms

-------------------
# JMH version: 1.23
# VM version: JDK 13.0.2, OpenJDK 64-Bit Server VM, 13.0.2+8
# VM invoker: C:\Program Files\Java\ojdk-13.0.2\bin\java.exe
# VM options: -Dfile.encoding=UTF-8 --module-path=E:\Profile\eclipse\jdk11\de.sph.benchmark3\target\classes;C:\Users\Stefa\.m2\repository\org\openjdk\jmh\jmh-core\1.23\jmh-core-1.23.jar -Djdk.module.main=de.sph.benchmark3
# Warmup: 5 iterations, 5 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: de.sph.benchmark3.DefFormatterBench.testDefNumberFormatter

Benchmark                                   (value)   Mode  Cnt      Score      Error   Units
DefFormatterBench.testDefNumberFormatter       1.23  thrpt    5   6758,121 ±   22,820  ops/ms
DefFormatterBench.testDefNumberFormatter       1.49  thrpt    5   6696,395 ±  649,326  ops/ms
DefFormatterBench.testDefNumberFormatter       1.80  thrpt    5   7526,417 ±   70,981  ops/ms
DefFormatterBench.testDefNumberFormatter        1.7  thrpt    5   7216,894 ±  751,324  ops/ms
DefFormatterBench.testDefNumberFormatter        0.0  thrpt    5  10974,451 ± 1662,661  ops/ms
DefFormatterBench.testDefNumberFormatter      -1.49  thrpt    5   6236,206 ±  451,509  ops/ms
DefFormatterBench.testDefNumberFormatter      -1.50  thrpt    5   6669,078 ±  485,808  ops/ms
DefFormatterBench.testDefNumberFormatter  9999.9123  thrpt    5   4156,953 ±   25,838  ops/ms
DefFormatterBench.testDefNumberFormatter      1.494  thrpt    5   6126,089 ±   61,211  ops/ms
DefFormatterBench.testDefNumberFormatter      1.495  thrpt    5   6025,925 ±  447,987  ops/ms
DefFormatterBench.testDefNumberFormatter       1.03  thrpt    5   6916,877 ±   26,122  ops/ms
DefFormatterBench.testDefNumberFormatter     25.996  thrpt    5   5363,537 ±   16,783  ops/ms
DefFormatterBench.testDefNumberFormatter    -25.996  thrpt    5   4931,153 ±   35,624  ops/ms
-------------------------------------------
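To relate the two result tables above to the reported penalty, the relative throughput loss can be computed per row. The following is a minimal sketch, not part of the original report: the class name ThroughputDrop and the helper dropPercent are hypothetical, and the scores are copied from four rows of the JDK 13.0.2 and JDK 14 tables.

```java
import java.util.Locale;

public class ThroughputDrop {

    // Relative throughput loss of "after" versus "before", in percent.
    static double dropPercent(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        // Scores in ops/ms, copied from the JMH tables above (illustrative subset).
        String[] values = { "1.23", "0.0", "9999.9123", "1.494" };
        double[] jdk13 = { 6758.121, 10974.451, 4156.953, 6126.089 };
        double[] jdk14 = { 6020.498, 10398.507, 3534.608, 5314.769 };

        for (int i = 0; i < values.length; i++) {
            System.out.printf(Locale.ROOT, "%-10s %5.1f%%%n",
                    values[i], dropPercent(jdk13[i], jdk14[i]));
        }
    }
}
```

For these four rows the computed loss falls between about 5% and 15%. Some rows in the tables have wide error margins (for example the 0.0 row on JDK 13.0.2), so the per-row figures should be read as approximate.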
Test code:
package de.sph.benchmark3;

import java.text.NumberFormat;
import java.util.Locale;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Warmup(iterations = 5, time = 5, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 10, timeUnit = TimeUnit.SECONDS)
@Fork(1)
@State(Scope.Benchmark)
public class DefFormatterBench {

	@Param({ "1.23", "1.49", "1.80", "1.7", "0.0", "-1.49", "-1.50", "9999.9123", "1.494", "1.495", "1.03", "25.996",
			"-25.996" })
	public double value;

	private DefNumerFormat dnf = new DefNumerFormat();

	@Benchmark
	public void testDefNumberFormatter(final Blackhole blackhole) {
		blackhole.consume(this.dnf.format(this.value));
	}

	public static void main(String... args) throws Exception {

		Options opts = new OptionsBuilder().include(DefFormatterBench.class.getSimpleName()).shouldDoGC(true).build();

		new Runner(opts).run();
	}

	private static class DefNumerFormat {
		private final NumberFormat n;

		public DefNumerFormat() {
			this.n = NumberFormat.getInstance(Locale.ENGLISH);
			this.n.setMaximumFractionDigits(2);
			this.n.setMinimumFractionDigits(2);
			this.n.setGroupingUsed(false);
		}

		public String format(final double d) {
			return this.n.format(d);
		}
	}
}
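For context on what the benchmark measures, the sketch below shows the strings produced by a formatter configured the same way as in the test code (English locale, exactly two fraction digits, no grouping). It is an illustration only: the class name FormatDemo and the helper format2 are hypothetical and not part of the report.

```java
import java.text.NumberFormat;
import java.util.Locale;

public class FormatDemo {

    // Formats with the same settings as the benchmark's formatter.
    static String format2(double d) {
        NumberFormat n = NumberFormat.getInstance(Locale.ENGLISH);
        n.setMaximumFractionDigits(2);
        n.setMinimumFractionDigits(2);
        n.setGroupingUsed(false);
        return n.format(d);
    }

    public static void main(String[] args) {
        // A few of the @Param values used by the benchmark.
        double[] samples = { 1.23, 1.7, 0.0, -1.50, 9999.9123, 25.996 };
        for (double d : samples) {
            System.out.println(d + " -> " + format2(d));
        }
    }
}
```

With grouping disabled, 9999.9123 is formatted as 9999.91 rather than 9,999.91, and values with fewer than two fraction digits are padded with zeros (1.7 becomes 1.70).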




Comments
Fix request (13u): I would like to backport this fix to 13u for parity with 11u. The original change applies cleanly.
09-06-2020

JDK 11 backport request: I would like to have this change in OpenJDK 11 as well, because the issue is present there too. The patch applies cleanly.
04-05-2020

URL: https://hg.openjdk.java.net/jdk/jdk/rev/573076e3c2d0
User: thartmann
Date: 2020-04-22 14:23:15 +0000
22-04-2020

The fix for 8229496 [1] triggers a performance regression with NumberFormat.format(). The problem is the additional control dependency on a CastII/LL, which restricts optimizations because _carry_dependency is set (this was necessary because we cannot represent non-zero integer/long values in C2's type system).

While investigating, I noticed that Roland's fix for JDK-8241900 [2] fixes the exact same problem but in a more elegant way, avoiding an impact on performance. I'm therefore proposing to back out the original fix for JDK-8229496, leaving the regression test in and also adding a microbenchmark.

I've verified that this solves the performance regression (4547 ops/ms vs. 5048 ops/ms on my machine):
http://cr.openjdk.java.net/~thartmann/8242108/webrev.00/

[1] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-August/034865.html
[2] https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/037778.html
20-04-2020
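As a source-level illustration of the control dependency discussed in the comment above, the sketch below shows a division that is safe only because a preceding check rules out a zero divisor, so a JIT compiler has to keep the division dependent on that check. This is an assumption made for illustration: it is not the regression test from the fix, it is not claimed to reproduce either issue, and the class and method names are hypothetical.

```java
public class GuardedDivision {

    // Sums dividend / d for every non-zero divisor d.
    // The division must not be scheduled before the zero check that guards it.
    static int sumQuotients(int[] divisors, int dividend) {
        int sum = 0;
        for (int d : divisors) {
            if (d != 0) {
                sum += dividend / d;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] divisors = { 1, 0, 2, 5 };
        System.out.println(sumQuotients(divisors, 10));
    }
}
```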

ILW = Performance regression after fix for JDK-8229496, only shows up with targeted microbenchmark, no known workaround = MMH = P3
07-04-2020

OS: Windows 10. Versions checked: JDK 14, JDK 13.0.2, JDK 11.0.6. The end results look similar in all versions.
03-04-2020