JDK-8340212 : -Xshare:off -XX:CompressedClassSpaceBaseAddress=0x40001000000 crashes on macos-aarch64
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 24
  • Priority: P3
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2024-09-16
  • Updated: 2024-11-12
JDK 24: Unresolved
Description
I found the following while trying to reproduce something else.

$ ../build/fastdebug/images/jdk/bin/java -Xshare:off -Xlog:metaspace*=info,gc+heap+coops=debug -XX:CompressedClassSpaceBaseAddress=0x40001000000 -version

Gives:
#  Internal Error (/Users/stefank/git/alt2/open/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp:5095), pid=13014, tid=10243
#  guarantee((shifted_base & 0xffff0000ffffffff) == 0) failed: compressed class base bad alignment

V  [libjvm.dylib+0xcd6c18]  MacroAssembler::klass_decode_mode()+0x19c
V  [libjvm.dylib+0xcd6eb0]  MacroAssembler::decode_klass_not_null(Register, Register)+0x48
V  [libjvm.dylib+0xcd4d38]  MacroAssembler::load_klass(Register, Register)+0x168
V  [libjvm.dylib+0x829af4]  InterpreterMacroAssembler::profile_obj_type(Register, Address const&)+0x240
V  [libjvm.dylib+0x82ab6c]  InterpreterMacroAssembler::profile_return_type(Register, Register, Register)+0x3a8
V  [libjvm.dylib+0x10486cc]  TemplateInterpreterGenerator::generate_return_entry_for(TosState, int, unsigned long)+0x200

It's unclear to me whether this can only happen with the CompressedClassSpaceBaseAddress flag, which is a "development" flag and not available in release builds, or whether we would also hit this failure if the OS ever handed back this address.
Comments
ProblemList entries were added by JEP 450:
+# Fails with +UseCompactObjectHeaders on aarch64
+runtime/cds/appcds/SharedBaseAddress.java 8340212 linux-aarch64,macosx-aarch64
+runtime/cds/SharedBaseAddress.java 8340212 linux-aarch64,macosx-aarch64
12-11-2024

A pull request was submitted for review.
Branch: master
URL: https://git.openjdk.org/jdk/pull/21695
Date: 2024-10-24 20:49:45 +0000
24-10-2024

Yes, I was planning to send out this PR post-Lilliput. That patch problem-lists the CDS tests that failed. I'm happy to take any simplifications you have. I refactored klass_decode_mode() to return whether it is going to succeed or not. It's kind of awkward.
21-10-2024

[~coleenp] Yes, thanks, this is what I had in mind. Do you plan this for post-Lilliput? I may have small suggestions for simplifying, but I like this.
21-10-2024

> The right way to handle this would be to determine the klass decode mode at VM init, as part of the normal class space setup. Then we could bail out with an exit instead of asserting.
This is what I did. There is also a bug with SharedBaseAddress that hits the same assert, and that case is less dubious than using the CompressedClassSpaceBaseAddress flag. Asserting the first time we decode a klass pointer is too late. This is failing in normal UseCompactObjectHeaders runs, so it needs to be fixed. It is a bug.
Here's my draft PR: https://github.com/openjdk/jdk/pull/21610
Actually, this commit: https://github.com/openjdk/jdk/pull/21610/commits/891126888effe7fbd075db1a300fed37f743f810
21-10-2024

> The right way to handle this would be to determine the klass decode mode at VM init, as part of the normal class space setup. Then we could bail out with an exit instead of asserting.
Right. That's what I would prefer, if possible.
21-10-2024

[~stefank] I agree, it wastes time. What about improving the assertion messages in "MacroAssembler::klass_decode_mode" instead? That would help for both CompressedClassSpaceBaseAddress and invalid runtime addresses (the latter case should be astronomically improbable).
21-10-2024

[~coleenp] "CompressedClassSpaceBaseAddress is a debug testing option and expects to exit if mapping fails, so it should also test whether the shift also is compatible with aarch64 encoding."
That was not my intention when I introduced the switch. Its sole purpose was to be able to feed any address to narrow Klass initialization to observe its behavior. It's a sharp-edged option, and assertions when passing in invalid addresses are expected. If we add sanity checks to this switch, we need to mimic the behavior of initialization on aarch64, in effect duplicating that code. But I agree that the assertion message is bad; it could be improved.
21-10-2024

Not cross-checking the value wastes time for developers who are not familiar with the flag. I would have much preferred an earlier exit with an explanation that the suggested address isn't valid.
21-10-2024

[~coleenp] [~stefank] I don't think this is a bug. On aarch64, not every address can be used as an encoding base, because we did not implement a fallback to plain addition. Somewhat simplified, the base needs to have zeros in the lower 35 bits (for non-Lilliput). CompressedClassSpaceBaseAddress does not guarantee that; it's a debug-only test switch you can use to pass any address in, and that is by design. It does not cross-check the value, nor should it. It is possible to pass an invalid value that would not work. At runtime, we take care to set the encoding base at a valid address for aarch64.
21-10-2024

[~stefank] "I would have much preferred an earlier exit with an explanation that the suggested address isn't valid."
Note that a lot of that code long predates CompressedClassSpaceBaseAddress. MacroAssembler::klass_decode_mode is determined lazily, after VM initialization. The only thing one can do then is to assert, which we do. The right way to handle this would be to determine the klass decode mode at VM init, as part of the normal class space setup. Then we could bail out with an exit instead of asserting.
But making the assertion message better (beyond "this is not a valid encoding base address") is a bit of a challenge too, since such a description would be intricate and lengthy ("must be either < 4G or constitute a valid EOR immediate and be aligned to 32G, or be < 256TB and be aligned to 32G" ...). The usefulness of such a message is questionable, and it would be prone to becoming obsolete if the code is touched.
21-10-2024
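
The proposal in the comment above (pick the decode mode during VM init and exit cleanly if none applies) could look roughly like the following. This is a minimal sketch, not the code from either pull request: `DecodeMode`, `select_decode_mode` and `init_decode_mode` are hypothetical names, the `eor_ok` parameter stands in for the "valid EOR immediate and suitably aligned" test that is not reproduced here, and treating only a zero base as the trivial case is a simplification.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Illustrative mode names; only the MOVK mode is named in this report.
enum class DecodeMode { Zero, Xor, Movk, Invalid };

// Hypothetical helper, not HotSpot code. `eor_ok` is an assumption: it stands
// in for the logical-immediate and alignment test, which is not modelled here.
inline DecodeMode select_decode_mode(uint64_t base, int shift, bool eor_ok) {
  if (base == 0) return DecodeMode::Zero;  // simplified trivial case
  if (eor_ok) return DecodeMode::Xor;
  const uint64_t shifted_base = base >> shift;
  // Same condition as the guarantee quoted in this report.
  if ((shifted_base & 0xffff0000ffffffffULL) == 0) return DecodeMode::Movk;
  return DecodeMode::Invalid;
}

// Hypothetical init-time step: report the problem and return false so the
// caller can exit, instead of failing a guarantee at first decode.
inline bool init_decode_mode(uint64_t base, int shift, bool eor_ok,
                             DecodeMode* out) {
  const DecodeMode mode = select_decode_mode(base, shift, eor_ok);
  if (mode == DecodeMode::Invalid) {
    std::fprintf(stderr,
                 "Encoding base 0x%llx (shift %d) cannot be used for "
                 "narrow Klass decoding\n",
                 static_cast<unsigned long long>(base), shift);
    return false;
  }
  *out = mode;
  return true;
}
```

With the base and shift from this report (0x40001000000 and 0) and `eor_ok` set to false, `init_decode_mode` returns false and prints a message, which is the early-exit behavior asked for earlier in the thread.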

We should problem list the test runtime/cds/appcds/SharedBaseAddress.java for the Lilliput integration until this is fixed.
16-10-2024

This bug, with -Xshare:off and -XX:CompressedClassSpaceBaseAddress, and the one found with the tested Lilliput patch, which uses -Xshare:on and dumps with -XX:SharedBaseAddress, are similar problems in similar code in different places. In both cases the compressed class space is mapped at the given address, ignoring that on AARCH64 the encoding scheme requires the base address and shift to be usable with the MOVK instruction. If the mapping succeeds at the given address, it won't call CompressedKlassPointers::reserve_address_space_for_compressed_classes.
CompressedClassSpaceBaseAddress is a debug testing option and expects to exit if mapping fails, so it should also test whether the shift is compatible with the aarch64 encoding. For SharedBaseAddress, the mapping should check the shift and fall back to the OS mapping if the mapping is incompatible with aarch64.
This MOVK optimization was put in for a linked issue, JDK-8234794, by [~ngasson]. I still haven't found all of this code.
14-10-2024
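
To illustrate why a MOVK-based decode constrains the base, here is a small emulation. It is a sketch under assumptions: `movk_lsl32`, `decode_with_movk` and `decode_with_add` are hypothetical helpers, and the decode sequence (insert the upper half of the shifted base with MOVK, then shift left) is a simplified model rather than the generated HotSpot code. MOVK replaces one 16-bit field of a register and leaves the other bits alone, so the result only equals "base plus narrow value" when the shifted base has no bits outside bits 47..32.

```cpp
#include <cassert>
#include <cstdint>

// Emulates AArch64 "MOVK Xd, #imm16, LSL #32": replaces bits 47..32 of the
// register with imm16 and keeps every other bit unchanged.
inline uint64_t movk_lsl32(uint64_t reg, uint16_t imm16) {
  const uint64_t field = 0xffffULL << 32;
  return (reg & ~field) | (static_cast<uint64_t>(imm16) << 32);
}

// Hypothetical model of a MOVK-style decode of a 32-bit narrow klass value.
inline uint64_t decode_with_movk(uint32_t narrow, uint64_t base, int shift) {
  const uint64_t shifted_base = base >> shift;
  const uint64_t merged =
      movk_lsl32(narrow, static_cast<uint16_t>(shifted_base >> 32));
  return merged << shift;
}

// Reference decode by plain addition, for comparison.
inline uint64_t decode_with_add(uint32_t narrow, uint64_t base, int shift) {
  return base + (static_cast<uint64_t>(narrow) << shift);
}
```

For a base such as 0x40000000000 both functions agree, while for the base from this report, 0x40001000000, the MOVK model silently drops the 0x01000000 part of the base and the two results differ.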

Call stack obtained using slowdebug build:

* thread #3, stop reason = signal SIGSTOP
    frame #0: 0x000000019aa8c704 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x000000019aac3c28 libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x000000019a9d1ae8 libsystem_c.dylib`abort + 180
    frame #3: 0x0000000105ed8c68 libjvm.dylib`os::abort(dump_core=true, siginfo=0x0000000000000000, context=0x0000000000000000) at os_posix.cpp:2126:5
    frame #4: 0x00000001061f6b68 libjvm.dylib`VMError::report_and_die(id=-536870912, message="guarantee((shifted_base & 0xffff0000ffffffff) == 0) failed", detail_fmt="compressed class base bad alignment", detail_args=" s<m\U00000001", thread=0x000000012f008e10, pc=0x0000000000000000, siginfo=0x0000000000000000, context=0x0000000000000000, filename=".../open/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp", lineno=5095, size=0) at vmError.cpp:1946:5
    frame #5: 0x00000001061f6c20 libjvm.dylib`VMError::report_and_die(thread=0x000000012f008e10, context=0x0000000000000000, filename=".../open/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp", lineno=5095, message="guarantee((shifted_base & 0xffff0000ffffffff) == 0) failed", detail_fmt="compressed class base bad alignment", detail_args=" s<m\U00000001") at vmError.cpp:1611:3
    frame #6: 0x000000010557cbf0 libjvm.dylib`report_vm_error(file=".../open/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp", line=5095, error_msg="guarantee((shifted_base & 0xffff0000ffffffff) == 0) failed", detail_fmt="compressed class base bad alignment") at debug.cpp:193:3
    frame #7: 0x0000000105d775bc libjvm.dylib`MacroAssembler::klass_decode_mode(this=0x000000012f00e420) at macroAssembler_aarch64.cpp:5094:3
    frame #8: 0x0000000105d778c8 libjvm.dylib`MacroAssembler::decode_klass_not_null(this=0x000000012f00e420, dst=(_encoding = 2), src=(_encoding = 2)) at macroAssembler_aarch64.cpp:5140:11
    frame #9: 0x0000000105d75bf8 libjvm.dylib`MacroAssembler::decode_klass_not_null(this=0x000000012f00e420, r=(_encoding = 2)) at macroAssembler_aarch64.cpp:5179:3
    frame #10: 0x0000000105d75b40 libjvm.dylib`MacroAssembler::load_klass(this=0x000000012f00e420, dst=(_encoding = 2), src=(_encoding = 2)) at macroAssembler_aarch64.cpp:4844:5
    frame #11: 0x0000000105938904 libjvm.dylib`InterpreterMacroAssembler::profile_obj_type(this=0x000000012f00e420, obj=(_encoding = 2), mdo_addr=0x000000016d3c7a70) at interp_masm_aarch64.cpp:1545:3
    frame #12: 0x00000001059398b4 libjvm.dylib`InterpreterMacroAssembler::profile_return_type(this=0x000000012f00e420, mdp=(_encoding = 1), ret=(_encoding = 0), tmp=(_encoding = 2)) at interp_masm_aarch64.cpp:1693:5
  * frame #13: 0x000000010610b8b0 libjvm.dylib`TemplateInterpreterGenerator::generate_return_entry_for(this=0x000000016d3cea30, state=atos, step=1, index_size=2) at templateInterpreterGenerator_aarch64.cpp:479:8
    frame #14: 0x0000000106107224 libjvm.dylib`TemplateInterpreterGenerator::generate_all(this=0x000000016d3cea30) at templateInterpreterGenerator.cpp:86:20
    frame #15: 0x0000000106106f60 libjvm.dylib`TemplateInterpreterGenerator::TemplateInterpreterGenerator(this=0x000000016d3cea30) at templateInterpreterGenerator.cpp:40:3
    frame #16: 0x00000001061085f8 libjvm.dylib`TemplateInterpreterGenerator::TemplateInterpreterGenerator(this=0x000000016d3cea30) at templateInterpreterGenerator.cpp:37:94
    frame #17: 0x0000000106105034 libjvm.dylib`TemplateInterpreter::initialize_code() at templateInterpreter.cpp:67:34
    frame #18: 0x000000010593c234 libjvm.dylib`interpreter_init_code() at interpreter.cpp:142:3
    frame #19: 0x000000010590a444 libjvm.dylib`init_globals2() at init.cpp:164:3
    frame #20: 0x000000010614b5ec libjvm.dylib`Threads::create_vm(args=0x000000016d3cef48, canTryAgain=0x000000016d3cee8b) at threads.cpp:570:12
    frame #21: 0x0000000105a69f64 libjvm.dylib`JNI_CreateJavaVM_inner(vm=0x000000016d3cef40, penv=0x000000016d3cef38, args=0x000000016d3cef48) at jni.cpp:3596:12
    frame #22: 0x0000000105a69eb8 libjvm.dylib`::JNI_CreateJavaVM(vm=0x000000016d3cef40, penv=0x000000016d3cef38, args=0x000000016d3cef48) at jni.cpp:3687:14
    frame #23: 0x0000000102d4e570 libjli.dylib`JavaMain [inlined] InitializeJVM(pvm=0x000000016d3cef40, penv=0x000000016d3cef38) at java.c:1490:9 [opt]
    frame #24: 0x0000000102d4e4bc libjli.dylib`JavaMain(_args=<unavailable>) at java.c:488:10 [opt]
    frame #25: 0x0000000102d5160c libjli.dylib`ThreadJavaMain(args=<unavailable>) at java_md_macosx.m:687:29 [opt]
    frame #26: 0x000000019aac3fa8 libsystem_pthread.dylib`_pthread_start + 148

(lldb) fr sel 7
frame #7: 0x0000000105d775bc libjvm.dylib`MacroAssembler::klass_decode_mode(this=0x000000012f00e420) at macroAssembler_aarch64.cpp:5094:3
   5091
   5092   const uint64_t shifted_base =
   5093     (uint64_t)CompressedKlassPointers::base() >> CompressedKlassPointers::shift();
-> 5094   guarantee((shifted_base & 0xffff0000ffffffff) == 0,
   5095             "compressed class base bad alignment");
   5096
   5097   return (_klass_decode_mode = KlassDecodeMovk);
(lldb) p/x shifted_base
(const uint64_t) $9 = 0x0000040001000000
(lldb) p CompressedKlassPointers::_base
(address) $10 = 0x0000040001000000 ""
(lldb) p CompressedKlassPointers::_shift
(int) $11 = 0
07-10-2024
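
The values printed by lldb above are enough to redo the failing check by hand. The sketch below applies the mask from the guarantee to the reported base and shift; `passes_movk_guarantee` and `offending_bits` are hypothetical helper names introduced for illustration only. The mask 0xffff0000ffffffff keeps bits 63..48 and bits 31..0, so only bits 47..32 of the shifted base may be set.

```cpp
#include <cassert>
#include <cstdint>

// Mask copied from the guarantee in the crash report.
constexpr uint64_t kGuaranteeMask = 0xffff0000ffffffffULL;

// Hypothetical helper: the bits of the shifted base that violate the guarantee.
constexpr uint64_t offending_bits(uint64_t base, int shift) {
  return (base >> shift) & kGuaranteeMask;
}

// Hypothetical helper: mirrors the guarantee condition itself.
constexpr bool passes_movk_guarantee(uint64_t base, int shift) {
  return offending_bits(base, shift) == 0;
}

// Base and shift as printed by lldb in this report.
static_assert(!passes_movk_guarantee(0x0000040001000000ULL, 0),
              "reported base fails the guarantee");
static_assert(offending_bits(0x0000040001000000ULL, 0) == 0x01000000ULL,
              "bit 24 of the base is what trips the check");
```

In other words, 0x0400 in bits 47..32 is acceptable, but the extra 0x01000000 in the low 32 bits is not. A base of 0x40000000000 with the same shift would pass this particular check.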

I cannot reproduce a similar bug with this symptom that we found in testing the patch for https://github.com/openjdk/jdk/pull/20677 with current JDK code. I can reproduce the original bug report that requires -Xshare:off.
04-10-2024

>>> Failures in this run are apparently the same as this bug:
>> Have you verified that?
> I have not. You said it was related in slack and I dumped this here to have this information when I have a chance to investigate it.
Well I didn't say that they were guaranteed to be related.
01-10-2024

>> Failures in this run are apparently the same as this bug:
> Have you verified that?
I have not. You said it was related in slack and I dumped this here to have this information when I have a chance to investigate it.
01-10-2024

I saw the same crash with JDK 22.0.2 and 23.0.1 fastdebug builds on macos-aarch64.
17-09-2024

I tried JDK 22 and 23 fastdebug builds on linux-x64 and didn't see the crash.
17-09-2024

I've not tested this with JDK 23 and earlier.
16-09-2024