JDK-8276453 : Undefined behavior in C1 LIR_OprDesc causes SEGV in fastdebug build
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 6,7,8,9,10,11,12,13,14,15,16,17,18
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2021-11-02
  • Updated: 2024-02-06
  • Resolved: 2021-11-12
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 18
18 b24Fixed
Related Reports
Relates :  
Relates :  
Relates :  
Relates :  
Description
Building HotSpot with LLVM 13 fails due to SEGV at  FrameMap::java_calling_convention(GrowableArray<BasicType> const*, bool). See attached log.

The root cause is the following code in LIR_OprDesc:
  // Conversion
  intptr_t value() const                         { return (intptr_t) this; }

This pattern is a source of undefined behavior in C++. This is similar to the issue with markOop/markOopDesc that was fixed in JDK-8229258.

In this case, the undefined behavior happens with:
  bool check_value_mask(intptr_t mask, intptr_t masked_value) const {
    return (value() & mask) == masked_value;
  }
Compiler can make various optimizations if it believes "this" pointer is aligned. The LLVM optimization that tripped this is https://reviews.llvm.org/rG16d03818412415c56efcd482d18c0cbdf712524c , which is a roll-forward of https://reviews.llvm.org/D99790.

Properly fixing this requires wide refactoring of the use cases of LIR_OprDesc.

Colleagues James Y Knight and [~rasbold] have contributed a workaround patch with some hacks to reduce the patch size. I will publish this partial fix to collect initial feedback, and see if we could proceed with the patch, or if there are volunteers to pick up this work.
Comments
I just didn't know if the issue is reproducible (or fixed) for ARM until I tried. And now when I've fixed ARM build for Clang compiler I do not see any reason why not to incoporate the fix to Open JDK to provide one more toolkit (currently the only recommended way to build ARM is GCC cross-build)
29-04-2022

Ok, I don't have experience for cross-build for ARM-32, and our internal JDK does not support ARM-32. I'm not sure we have to solve the "cross-build for ARM-32 with Clang-13+" problem. Is it OK that as long as this change does not break cross-build for ARM-32 with GCC, we are fine? Is there a real use case that requires using LLVM/Clang for cross-build for ARM-32?
28-04-2022

You're telling about Aarch64, but I meant ARM-32. OpenJDK covers them with absolutely different architecture-specific snipsets. I'm going to test all the architectures with native builds on Travis CI, that provides all required environments except ARM-32, and that is why I was tied up with ARM cross-build.
28-04-2022

My cross-build setup: https://github.com/caoman/openjdk_scripts/blob/master/aarch64_cross_build_with_llvm.md.
28-04-2022

I have set up cross-build for 11u for ARM with LLVM-13.0.1. I have to directly set autoconf variables CC, CXX, BUILD_CC, BUILD_CXX, AR, BUILD_AR, NM, BUILD_NM and a few others, so they use clang, clang++, and llvm's toolchain such as llvm-ar, llvm-nm. OpenJDK's configure rule does not seem to directly support for cross-build with Clang. I did not run into the ARM Assembler syntax problem. However, I ran into another build failure that is fixed by JDK-8229258. The error is: # Internal Error (/usr/local/google/home/manc/ws/jdk11udev/src/hotspot/share/memory/virtualspace.cpp:632), pid=2084863, tid=2084869 # assert(markOopDesc::encode_pointer_as_mark(_base)->decode_pointer() == _base) failed: area must be distinguishable from marks for mark-sweep We backported JDK-8229258 to our internal JDK a while ago, but it is not yet backported to 11u. In order for 11u to build with LLVM-13+, it needs to at least backport this change and JDK-8229258. Let me share my cross-build setup for ARM with LLVM-13.0.1 somewhere, and hopefully it can resolve the cross-build issues you see.
28-04-2022

Thanks for the clarification! At the time I was working on this change, I only tested building with LLVM-13 for x86 locally. So LLVM-13+ for x86->ARM cross-build was not tested. However, recently our internal JDK (based on JDK 11) has started supporting x86->ARM cross-build with LLVM, and it is built with bleeding-edge LLVM-15 (currently based on Git commit 5c116d50e42f93ef4fa9a9719121378ec6b99b69 on 03/25/2022). So let me check if I can do the x86->ARM cross-build for JDK 11u with LLVM-13 or LLVM-15+.
26-04-2022

Please correct me if I'm wrong, but the changes fix LLVM-13+ builds, so at the first we must ensure that the problem is solved, and we can do it only by building the code with Clang-13+, and pre-submit tests cannot help here, cuz they use either GGC or Clang-11/12 for macOS while the issue was raised against LLVM-13. Then the changes touch specific aarch64/arm/ppc/s390/x86 sources. To me that means that changes are to be verified with LLVM-13+ build for each supported architecture, and the regression must be done also for each supported architecture including ARM (pre-submit tests do not check ARM at all) with default toolkit. I tried to reproduce the issue for ARM by cross-build x86->ARM with Clang-13, and faced several issues. That why I asked you if you'd seen them. It looks like Clang build is not supported for ARM. That does not relate to your changes, that is the baseline problem, and I'm going to add such support. BTW, the issue is not reproducible for ARM.
26-04-2022

Sorry I don't get it. Do you mean the changes to .hpp and .cpp files under "src/hotspot/cpu/arm/" and "src/hotspot/cpu/aarch64/" are causing the ARM assembler syntax problem? For the original commit into JDK 18, I relied on the Github's pre-submit tests to check for ARM builds: https://github.com/openjdk/jdk/pull/6221/checks?check_run_id=4200833982. It passed for "hs arm" build and "hs aarch64" build. I haven't manually checked when backporting it to JDK 11 for ARM though.
21-04-2022

Hi! Did your check the changes for ARM? I'm trying to verify the backport, but even the baseline fails thanks to the fact that GCC and Clang use different ARM Assembler syntax. For some unknown reason Clang swears at valid 's' suffix for ADD{cond}{S} instructions. On the other hand GGC uses very strange STR/LDR{cond}H syntax. ARM Assembler Guide tells it should STRH/LDRH{cond}, and Clang uses exactly this reading.
21-04-2022

No problem or objection with the backport from our side. We intentionally left some TODOs in this change so it is smaller and easier to backport.
06-04-2022

Hi! I'm going to backport this one. Any comments, remarks, objections? Thank you
06-04-2022

Changeset: 8c5f0304 Author: Man Cao <manc@openjdk.org> Date: 2021-11-12 22:34:10 +0000 URL: https://git.openjdk.java.net/jdk/commit/8c5f03049196e66a4f8411bdd853b287134e7ce5
12-11-2021

ILW = Crash in C1 when building, only known with LLVM 13, no workaround = HLH = P2
03-11-2021