JDK-8302976 : C2 Intrinsification of Float.floatToFloat16 and Float.float16ToFloat Yields Different Result than the Interpreter
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 20,21
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • CPU: x86,aarch64
  • Submitted: 2023-02-21
  • Updated: 2024-01-26
  • Resolved: 2023-03-09
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 20 JDK 21
20.0.2Fixed 21 b14Fixed
Related Reports
Relates :  
Relates :  
Relates :  
Relates :  
Sub Tasks
JDK-8303035 :  
Description
The following code snippet prints twice 31745 with the interpreter, and 32257 31745 with -Xcomp -XX:-TieredCompilation. This means, C2 will eliminate both intrinsifications if they are within the same compilation unit (the latter print), and return the same result as the interpreter. But if it applies the intrinsifications respectively in different compilation units and employ the underlying vcvtps2ph and vcvtph2ps instructions (the former print), then it will return a different value. 


public class Foo {
  public static short bar(float f) {
    return Float.floatToFloat16(f);
  }
  public static void main(String[] args) {
    System.out.println(bar(Float.float16ToFloat((short) 31745)));
    System.out.println(Float.floatToFloat16(Float.float16ToFloat((short) 31745)));
  }
} 

Comments
A pull request was submitted for review. URL: https://git.openjdk.org/jdk20u/pull/21 Date: 2023-03-21 21:46:36 +0000
21-03-2023

jdk20u backport request This fixed inconsistency in Float.floatToFloat16 and Float.float16ToFloat results found in JDK 20 but is was late to push it there. Risk is medium since changes are not small. But it was running in JDK 21 for 2 weeks already without any issues. New tests were added to make sure the fix is correct. JDK 20u PR: https://github.com/openjdk/jdk20u/pull/21
21-03-2023

Changeset: 8cfd74f7 Author: Vladimir Kozlov <kvn@openjdk.org> Date: 2023-03-09 03:26:38 +0000 URL: https://git.openjdk.org/jdk/commit/8cfd74f76afc9e5d50c52104fef9974784718dd4
09-03-2023

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/12869 Date: 2023-03-03 21:41:35 +0000
06-03-2023

Thanks a lot [~kvn] [~darcy]. The sharedRuntime:hf2f and f2hf implementations will also need to be platform dependent for NaN handling.
25-02-2023

The test in question contains the following comment: /* * The behavior tested below is an implementation property not * required by the specification. It would be acceptable for this * information to not be preserved (as long as a NaN is returned) if, * say, a intrinsified version using native hardware instructions * behaved differently. * * If that is the case, this test should be modified to disable * intrinsics or to otherwise not run on platforms with an differently * behaving intrinsic. */ The test has already been updated to limit the platforms on which it is run: * @requires (os.arch != "x86" & os.arch != "i386") | vm.opt.UseSSE == "null" | vm.opt.UseSSE > 0
25-02-2023

I will let [~darcy] say what he think about this.
25-02-2023

Okay, I got implementation which executes the same HW instructions in Interpreter, C1 and C2. So results will be consistent but they are different from what Java implementation and corresponding tests expect - roundtrip equality for float16.
25-02-2023

That is also because of the QNaN bit and needs similar change on line 84: From: if (s != s2) { To: if ((s & ~0x0200) != (s2 & ~0x0200)) { // ignore QNaN bit
25-02-2023

I also ran https://github.com/openjdk/jdk/blob/master/test/jdk/java/lang/Float/Binary16ConversionNaN.java [~darcy] wrote and it failed too: Roundtrip failure on NaN value 7c01 got back 7e01 Roundtrip failure on NaN value fc01 got back fe01 Roundtrip failure on NaN value 7c02 got back 7e02 Roundtrip failure on NaN value fc02 got back fe02 ... Exception in thread "main" java.lang.RuntimeException: 1022 errors at Binary16ConversionNaN.main(Binary16ConversionNaN.java:51) It does not fail with Java implemetation - run with -Xint without my code.
25-02-2023

[~kvn] The QNaN bit difference is not an issue and similar thing happens today with float->double conversion and back. I mentioned this in my comment earlier that the behavior of adding the QNaN bit is consistent across float16/float/double data types. To show this, I translated the TestAll.java for float->double in file TestAllFloat.java. We see similar handling that the results for NaN differ by the QNaN bit in TestAllFloat.java as well. In my thoughts the TestAll.java needs to be fixed as TestAllFix.java to be the correct test.
25-02-2023

Is this is why we have special stub to fix value after conversion between float and integer values? : https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/macroAssembler_x86.cpp#L9424
25-02-2023

I implemented usage of vcvtps2ph/vcvtph2ps instructions in Interpreter and run with TestAll.java. All NaN values failed: Inconsistent result for Float.floatToFloat16(-1023): -511 != -1023 (NaN) Inconsistent result for Float.floatToFloat16(-1022): -510 != -1022 (NaN) Inconsistent result for Float.floatToFloat16(-1021): -509 != -1021 (NaN) Inconsistent result for Float.floatToFloat16(-1020): -508 != -1020 (NaN) ... Inconsistent result for Float.floatToFloat16(31745): 32257 != 31745 (NaN) Inconsistent result for Float.floatToFloat16(31746): 32258 != 31746 (NaN) Inconsistent result for Float.floatToFloat16(31747): 32259 != 31747 (NaN) Inconsistent result for Float.floatToFloat16(31748): 32260 != 31748 (NaN) [~sviswanathan] The test execution simply push short value through vcvtph2ps->vcvtps2ph instructions and got back different result if intermittent float value is NaN. The difference is 512 as this bug stated. I really don't want to check for special NaN cases in assembler code and adjust result for that. Do you know other solutions on assembler level?
25-02-2023

Similar issue from the past: JDK-8076373
24-02-2023

I took it.
23-02-2023

[~dlong], right, good catch. We need a more sophisticated test.
23-02-2023

[~thartmann], I was wrong, we can test all the input values for float16ToFloat(), but not so easily for floatToFloat16(), with 2^32 inputs to test. I posted a counter example here that demonstrates the problem: https://github.com/openjdk/jdk/pull/12704#issuecomment-1441020334
23-02-2023

[~dlong]: > It looks like the problem for this test is NaN values, but there could be other values that cause problems. I think we need a test for these intrinsics that covers all the interesting NaN, subnormal, 0, -0 values. Or better yet, all values. There's only 65536 to test, right? Yes, we should do that. I attached a corresponding test (TestAll.java). Both Float.floatToFloat16 and Float.float16ToFloat are affected. Input values in the ranges [-1023, -513] and [31745, 32255] ([0xfc01, 0xfdff], [0x7c01, 0x7dff]) corresponding to NaN return inconsistent values.
22-02-2023

ILW = Wrong result of C2 intrinsic, new Float.floatToFloat16 method added by JDK-8289551 (intrinsified by JDK-8289552), -XX:DisableIntrinsic=_floatToFloat16 = HMM = P2
22-02-2023

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/12704 Date: 2023-02-22 02:08:27 +0000
22-02-2023

[~thartmann], [~kvn], [~dlong], [~darcy] This is specific to NaN handling. The HW instructions generate QNaNs and not SNaNs for floating point instructions. This happens across double, float, and float16 data types. The most significant bit of mantissa is set to 1 for QNaNs. For Float16 data type (1 sign bit, 5 exponent bits, 10 mantissa bits): 0x7E01 would be a QNaN and 0x7C01 would be a SNaN. https://en.wikipedia.org/wiki/NaN#Encoding As per my understanding we need to change the java/lang/float.java and the corresponding shared runtime constant expression evaluation to generate QNaN. I can send out a PR fixing this. Please let me know your thoughts.
22-02-2023

[~dlong] I agree with you. In any case, I have submitted a PR for the mainline. https://github.com/openjdk/jdk/pull/12704
22-02-2023

[~sviswanathan], I'm not sure the behavior is a bug. The javadoc for Float.float16ToFloat() only says: "If the argument is a NaN, the result is a NaN." It does not say what kind of NaN, or even that the same NaN will be returned for the same input. This is in contrast to floatToIntBits(), which returns a canonical NaN value.
22-02-2023

I got the same result on aarch64. It looks like RISC-V also implements this intrinsic.
22-02-2023

I think the problem is the C++ and Java implementations return a different NaN value than the HW instruction. https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/Float.java#L1096-L1106 HW instruction returns a NaN value, but not the same one as the reference code. If we need to be bit-for-bit the same, then it implies we can't use the HW instruction without a runtime check for NaN first (when the value is not a constant).
22-02-2023

I wish we don't use runtime function for that but simple JIT compile Java method. Or always use HW instruction. Note, runtime function is used only when HW instructions are present and intrinsic is enabled. Otherwise we JIT compile Java method.
21-02-2023

Note, in this bug case NaN constant is used as argument and HW instruction is NOT used. Instead `SharedRuntime::hf2f()` runtime function is called to get value: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/convertnode.cpp#L239
21-02-2023

Yes, 31745 is 0x7C01 NaN value. So HW instruction does not handle NaN value correctly? Or there is other bug.
21-02-2023

It looks like the problem for this test is NaN values, but there could be other values that cause problems. I think we need a test for these intrinsics that covers all the interesting NaN, subnormal, 0, -0 values. Or better yet, all values. There's only 65536 to test, right?
21-02-2023

Smita, could you please have a look?
21-02-2023

Thanks for the report, [~yzheng]. I attached a reproducer that does not require any command line options.
21-02-2023