JDK-8358183 : [JVMCI] crash accessing nmethod::jvmci_name in CodeCache::aggregate
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 25
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2025-05-30
  • Updated: 2025-07-15
  • Resolved: 2025-07-02
  • Fix Versions: JDK 25: 25 (Fixed); JDK 26: 26 b05 (Fixed)
Related Reports
Causes :  
Description
Single crash seen:

C  [libc.so.6+0x15cae0]  __strlen_avx2+0xc0
V  [libjvm.so+0xacd22b]  CodeCache::aggregate(outputStream*, unsigned long)+0x5b  (codeCache.cpp:1878)
V  [libjvm.so+0xb29a8a]  CompileBroker::print_heapinfo(outputStream*, char const*, unsigned long)+0x8ea  (compileBroker.cpp:2867)

The only use of strlen in this path is this JVMCI-specific code: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/code/codeHeapState.cpp#L740

According to the log, a number of nmethods are being flushed; flushing calls CodeBlob::purge, which does this:

  if (_mutable_data != blob_end()) {
    os::free(_mutable_data);
    _mutable_data = blob_end(); // Valid not null address
  }

Since the JVMCI name is stored in the mutable section, this breaks jvmci_name(), resulting in a crash. I'm not sure whether the problem lies with flushing itself or with CodeHeapState::get_cbType not safely handling nmethods that are being flushed.
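
To make the lifetime problem easier to see, here is a small standalone model (plain C++, not HotSpot code) of the situation described above: the name accessor computes a pointer relative to _mutable_data, purge() frees that memory and repoints _mutable_data at blob_end(), and a later strlen() on the returned pointer is the reported crash. All the Model* names are illustrative; only the identifiers already mentioned in this report are assumed to exist in HotSpot.

// Standalone model of the hazard (illustrative only, not HotSpot code).
#include <cstdio>
#include <cstdlib>

struct ModelNMethod {
  char*  blob;              // stands in for the blob's own storage
  size_t blob_size;
  char*  mutable_data;      // models nmethod::_mutable_data
  size_t mutable_data_size; // models nmethod::_mutable_data_size

  char* blob_end() const { return blob + blob_size; }

  // Models the CodeBlob::purge() snippet quoted in the description.
  void purge() {
    if (mutable_data != blob_end()) {
      free(mutable_data);
      mutable_data = blob_end();  // valid, not-null address, but no longer the data
    }
  }

  // Models a name accessor like nmethod::jvmci_name(): it just computes an
  // address inside the mutable section and is never told about purge().
  const char* jvmci_name() const { return mutable_data; }
};

int main() {
  ModelNMethod nm;
  nm.blob_size = 64;
  nm.blob = static_cast<char*>(malloc(nm.blob_size));
  nm.mutable_data_size = 32;
  nm.mutable_data = static_cast<char*>(malloc(nm.mutable_data_size));
  snprintf(nm.mutable_data, nm.mutable_data_size, "Demo.method()V");

  printf("before purge, name = %s\n", nm.jvmci_name());
  nm.purge();
  // After purge the accessor still returns a pointer, but it no longer points
  // at a terminated string; calling strlen() on it is the reported crash, so
  // the model only prints the address.
  printf("after purge, stale name pointer = %p\n", (void*)nm.jvmci_name());

  free(nm.blob);
  return 0;
}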
Comments
Thanks [~bulasevich]! And thanks for clarifying, [~kvn] and [~dholmes].
11-07-2025

> and the jdk25u stabilization branch was created to accept bug fixes. Is that correct?

For clarity, the jdk25 stabilization branch was created to accept bug fixes. In addition, a jdk25u repository was created to accept fixes held back until 25u1.
11-07-2025

A pull request was submitted for review. Branch: jdk25 URL: https://git.openjdk.org/jdk/pull/26248 Date: 2025-07-10 17:40:20 +0000
10-07-2025

[~bulasevich] You can still push the fix into the JDK 25 branch during RDP 1 since it is P3: https://openjdk.org/jeps/3#rdp-1 Just go to the JDK 26 changeset and issue the `/backport :jdk25` command to create a PR for it.
10-07-2025

[~thartmann] Sorry, I’m not sure I understand your comment. Could you clarify what you mean by “JDK 25” in this context? With Rampdown Phase One, the mainline has advanced to JDK 26, and the jdk25u stabilization branch was created to accept bug fixes. Is that correct?
10-07-2025

[~bulasevich] I think the backport should go to JDK 25 and not JDK 25u, right?
10-07-2025

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk25u/pull/13 Date: 2025-07-04 15:06:23 +0000
04-07-2025

Changeset: 74822ce1 Branch: master Author: Boris Ulasevich <bulasevich@openjdk.org> Date: 2025-07-02 21:15:46 +0000 URL: https://git.openjdk.org/jdk/commit/74822ce12acaf9816aa49b75ab5817ced3710776
02-07-2025

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk/pull/25608 Date: 2025-06-03 06:39:18 +0000
04-06-2025

I can't really confirm that it fixes the problem by running the test as we've only ever seen one crash like this since JDK-8343789 went in over 3 weeks ago. The test itself is a stress test which is run somewhat regularly. So I can run some testing to confirm that it doesn't cause any other problems but I won't be able to confirm that it fixes the issue. I'll launch some testing but I think you can put that fix up for review.
03-06-2025

ILW = Crash during printing of diagnostic info, intermittent with JVMCI and -XX:+PrintCodeHeapAnalytics or the corresponding jcmd, no known workaround other than disabling printing = HLM = P3
03-06-2025

Tom, I’ve submitted a draft PR here: https://github.com/openjdk/jdk/pull/25608/files

It zeroes out the three size fields in nmethod::purge():

  _mutable_data_size = 0;
  _relocation_size = 0;
  _metadata_size = 0;

This ensures that after a purge:
- jvmci_data_size() returns 0
- jvmci_nmethod_data() returns nullptr
- CompileBroker::print_heapinfo() skips the JVMCI name printout (no invalid _metadata dereference)

I have tried running a long-lived JVMCI VM task with constant polling via jcmd to trigger the original crash, but I haven’t been able to reproduce it. Could you check whether my change fixes the print_heapinfo() crash?
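
For readers following along, here is a compact standalone model (illustrative only, not the actual changeset) of the effect described above, assuming jvmci_data_size() is derived from the three zeroed fields as this comment implies:

// Standalone model of the proposed fix (illustrative only, not HotSpot code).
#include <cstdio>

struct ModelNMethod {
  int mutable_data_size = 32;  // models _mutable_data_size
  int relocation_size   = 8;   // models _relocation_size
  int metadata_size     = 8;   // models _metadata_size

  // Assumed derivation: the JVMCI data is whatever part of the mutable
  // section is not relocations or metadata.
  int jvmci_data_size() const {
    return mutable_data_size - relocation_size - metadata_size;
  }

  // Models jvmci_nmethod_data(): nothing to hand out once the size is zero.
  const char* jvmci_nmethod_data() const {
    return jvmci_data_size() == 0 ? nullptr : "Demo.method()V";
  }

  // Models the change described above: nmethod::purge() zeroes the fields.
  void purge() {
    mutable_data_size = 0;
    relocation_size   = 0;
    metadata_size     = 0;
  }
};

int main() {
  ModelNMethod nm;
  printf("before purge: %s\n", nm.jvmci_nmethod_data());
  nm.purge();
  // This null check is the short-circuit that lets the heap-analytics
  // printout skip the JVMCI name instead of touching freed memory.
  const char* data = nm.jvmci_nmethod_data();
  printf("after purge:  %s\n", data != nullptr ? data : "(null, name printout skipped)");
  return 0;
}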
03-06-2025

It's a single crash from internal closed testing for which we have a core file, so I don't think it's easy to reproduce. It requires one thread to be in the middle of ClassUnloadingContext::purge_and_free_nmethods and another to be in the jcmd that's printing the heap analytics, while running with JVMCI as the JIT. Here are the relevant sanitized stack traces:

#19 0x00007fd058aa8ae0 in __strlen_avx2 () from /ren24/lib64/libc.so.6
#20 0x00007fd056fc94a0 in CodeHeapState::aggregate (out=out@entry=0x7fd05430cca0, heap=0x7fd050111c10, granularity=<optimized out>, granularity@entry=4096) at /workspace/open/src/hotspot/share/code/codeHeapState.cpp:740
#21 0x00007fd056fba22b in CodeCache::aggregate (out=out@entry=0x7fd05430cca0, granularity=granularity@entry=4096) at /workspace/open/src/hotspot/share/code/codeCache.cpp:1878
#22 0x00007fd057016a8a in CompileBroker::print_heapinfo (out=0x7fd05430cca0, function=<optimized out>, granularity=4096) at /workspace/open/src/hotspot/share/compiler/compileBroker.cpp:2867
#23 0x00007fd0571aee9b in DCmd::Executor::execute (this=this@entry=0x7fd05430cbc0, command=command@entry=0x7fd05023b8d0, __the_thread__=__the_thread__@entry=0x7fd05023aee0) at /workspace/open/src/hotspot/share/services/diagnosticFramework.cpp:421
#24 0x00007fd056c81e24 in Executor::execute (this=0x7fd05430cbc0, command=0x7fd05023b8d0, __the_thread__=0x7fd05023aee0) at /workspace/open/src/hotspot/share/services/attachListener.cpp:394
#25 0x00007fd0571b30fe in DCmd::Executor::parse_and_execute (this=this@entry=0x7fd05430cbc0, cmdline=<optimized out>, delim=delim@entry=32 ' ', __the_thread__=__the_thread__@entry=0x7fd05023aee0) at /workspace/open/src/hotspot/share/services/diagnosticFramework.cpp:414
#26 0x00007fd056c8031b in jcmd (op=<optimized out>, out=0x7fd05430cca0) at /workspace/open/src/hotspot/share/services/attachListener.cpp:398
#27 0x00007fd056c82d08 in AttachListenerThread::thread_entry (thread=<optimized out>, __the_thread__=<optimized out>) at /workspace/open/src/hotspot/share/services/attachListener.cpp:639
#28 0x00007fd0575674db in JavaThread::thread_main_inner (this=0x7fd05023aee0) at /workspace/open/src/hotspot/share/runtime/javaThread.hpp:589
#29 0x00007fd057f969b6 in Thread::call_run (this=this@entry=0x7fd05023aee0) at /workspace/open/src/hotspot/share/runtime/thread.cpp:224
#30 0x00007fd057c257d8 in thread_native_entry (thread=0x7fd05023aee0) at /workspace/open/src/hotspot/os/linux/os_linux.cpp:870

Thread 8 (LWP 2245091): warning: Section `.reg-xstate/2245091' in core file too small.
#0 0x00007fd0589d2a90 in __lll_lock_wait () from /ren24/lib64/libc.so.6
#1 0x00007fd0589d8c52 in pthread_mutex_lock@@GLIBC_2.2.5 () from /ren24/lib64/libc.so.6
#2 0x00007fd057b79dfd in PlatformMutex::lock (this=0x7fd0588f1858 <mutex_init()::CodeCache_lock_storage+8>) at /workspace/open/src/hotspot/os/posix/os_posix.inline.hpp:43
#3 Mutex::lock_without_safepoint_check (this=<optimized out>, self=<optimized out>) at /workspace/open/src/hotspot/share/runtime/mutex.cpp:147
#4 Mutex::lock_without_safepoint_check (this=0x7fd0588f1850 <mutex_init()::CodeCache_lock_storage>) at /workspace/open/src/hotspot/share/runtime/mutex.cpp:153
#5 0x00007fd057b96efc in MutexLockerImpl::MutexLockerImpl (this=<synthetic pointer>, mutex=0x7fd0588f1850 <mutex_init()::CodeCache_lock_storage>, flag=Mutex::SafepointCheckFlag::_no_safepoint_check_flag) at /workspace/open/src/hotspot/share/runtime/mutexLocker.hpp:199
#6 MutexLocker::MutexLocker (this=<synthetic pointer>, mutex=0x7fd0588f1850 <mutex_init()::CodeCache_lock_storage>, flag=Mutex::SafepointCheckFlag::_no_safepoint_check_flag) at /workspace/open/src/hotspot/share/runtime/mutexLocker.hpp:234
#7 nmethod::purge (this=0x7fd0387d5408, unregister_nmethod=true) at /workspace/open/src/hotspot/share/code/nmethod.cpp:2140
#8 0x00007fd056f96f85 in ClassUnloadingContext::purge_nmethods (this=this@entry=0x7fd0554a7af0) at /workspace/open/src/hotspot/share/gc/shared/classUnloadingContext.cpp:117
#9 0x00007fd058164648 in ClassUnloadingContext::purge_and_free_nmethods (this=<optimized out>) at /workspace/open/src/hotspot/share/gc/shared/classUnloadingContext.hpp:77
#10 ZNMethod::purge () at /workspace/open/src/hotspot/share/gc/z/zNMethod.cpp:421
#11 0x00007fd0581c749d in ZUnload::purge (this=this@entry=0x7fd05011c160) at /workspace/open/src/hotspot/share/gc/z/zUnload.cpp:163
#12 0x00007fd058127dc3 in ZGenerationOld::process_non_strong_references (this=this@entry=0x7fd05011a840) at /workspace/open/src/hotspot/share/gc/z/zGeneration.cpp:1342
#13 0x00007fd05812b046 in ZGenerationOld::concurrent_process_non_strong_references (this=0x7fd05011a840) at /workspace/open/src/hotspot/share/gc/z/zGeneration.cpp:1099
#14 ZGenerationOld::collect (this=0x7fd05011a840, timer=timer@entry=0x7fd05014b008) at /workspace/open/src/hotspot/share/gc/z/zGeneration.cpp:1008
#15 0x00007fd058121b61 in ZDriverMajor::collect_old (this=0x7fd05014aba0) at /workspace/open/src/hotspot/share/gc/z/zGeneration.inline.hpp:71
#16 ZDriverMajor::gc (this=this@entry=0x7fd05014aba0, request=...) at /workspace/open/src/hotspot/share/gc/z/zDriver.cpp:451
#17 0x00007fd058121cca in ZDriverMajor::run_thread (this=0x7fd05014aba0) at /workspace/open/src/hotspot/share/gc/z/zDriver.cpp:476
#18 0x00007fd0581be473 in ZThread::run_service (this=0x7fd05014aba0) at /workspace/open/src/hotspot/share/gc/z/zThread.cpp:28
#19 0x00007fd0570437fb in ConcurrentGCThread::run (this=0x7fd05014aba0) at /workspace/open/src/hotspot/share/gc/shared/concurrentGCThread.cpp:47
#20 0x00007fd057f969b6 in Thread::call_run (this=this@entry=0x7fd05014aba0) at /workspace/open/src/hotspot/share/runtime/thread.cpp:224
#21 0x00007fd057c257d8 in thread_native_entry (thread=0x7fd05014aba0) at /workspace/open/src/hotspot/os/linux/os_linux.cpp:870
#22 0x00007fd0589d57f2 in start_thread () from /ren24/lib64/libc.so.6
#23 0x00007fd058a5a880 in clone3 () from /ren24/lib64/libc.so.6

We'll just have to reason about how nmethod should protect itself when methods that require access to the now-freed mutable data are called.
We could make nmethod::jvmci_name return null if the mutable data has been freed, but it's probably better if jvmci_nmethod_data itself returns null in that case. It's not clear to me whether the other things stored in the mutable section should be protected by asserts that they haven't been freed. This problem would have been more obvious if jvmci_nmethod_data had an assert like:

  assert(jvmci_data_size() == 0 || _mutable_data != blob_end(), "JVMCI data has already been freed");

I suspect that access to the JVMCI nmethod data after purging occurs in more cases like this but doesn't crash because the nmethod isn't the last blob in the mapped portion of the code heap. This crash happens precisely because the bytes after the JVMCI nmethod data are in unmapped space.
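
One possible shape for that guard, sketched as a standalone model (illustrative only, not a HotSpot patch): the accessor treats "_mutable_data repointed at blob_end()" as "already purged" and returns null, with the suggested assert kept alongside to make stale accesses obvious in debug builds.

// Standalone model of the suggested guard (illustrative only, not HotSpot code).
#include <cassert>
#include <cstdio>

struct ModelNMethod {
  const char* mutable_data;    // models nmethod::_mutable_data
  const char* blob_end_addr;   // models CodeBlob::blob_end()
  int         jvmci_data_size; // models nmethod::jvmci_data_size()

  // Returns null once the mutable section holding the JVMCI data is gone,
  // so callers such as jvmci_name() never see a dangling pointer.
  const char* jvmci_nmethod_data() const {
    if (jvmci_data_size == 0 || mutable_data == blob_end_addr) {
      return nullptr;
    }
    // The assert proposed above, in model form; it is redundant here because
    // of the guard, but kept to mirror the suggestion in the comment.
    assert(mutable_data != blob_end_addr && "JVMCI data has already been freed");
    return mutable_data;
  }
};

int main() {
  char blob_tail;  // stands in for the address returned by blob_end()
  ModelNMethod nm{"Demo.method()V", &blob_tail, 16};
  printf("live nmethod:   %s\n", nm.jvmci_nmethod_data());

  nm.mutable_data = &blob_tail;  // models CodeBlob::purge() repointing _mutable_data
  const char* data = nm.jvmci_nmethod_data();
  printf("purged nmethod: %s\n", data != nullptr ? data : "(null, nothing printed)");
  return 0;
}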
02-06-2025

Thanks for filing this. A few clarifying questions to help reproduce and analyze the issue:
- Do you have any clue how to reproduce the bug? Is there a specific workload or test that triggers it reliably?
- Did you enable the -XX:+PrintCodeHeapAnalytics VM option, or use jcmd <pid> Compiler.CodeHeap_Analytics to force CompileBroker::print_heapinfo?
- Was this observed with a JVMCI build or a stock C2-only build?
- How stable is the reproduction? Does it occur consistently, intermittently, or only after long uptimes?
02-06-2025