JDK-8358725 : RunThese30M: assert(nm->insts_contains_inclusive(original_pc)) failed: original PC must be in the main code section of the compiled method (or must be immediately following it)
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 25
  • Priority: P3
  • Status: Open
  • Resolution: Unresolved
  • OS: linux
  • CPU: x86_64
  • Submitted: 2025-06-05
  • Updated: 2025-11-26
Related Reports: JDK-8351028 (closed as duplicate)
Description
The following test failed in the JDK25 CI:

applications/runthese/RunThese30M.java

Here's a snippet from the log file:

[2025-06-05T15:19:05.538134385Z] Gathering output for process 32035
[2025-06-05T15:19:11.724201934Z] Waiting for completion for process 32035
[2025-06-05T15:19:11.724347782Z] Waiting for completion finished for process 32035
[stress.process.out] #
[stress.process.out] # A fatal error has been detected by the Java Runtime Environment:
[stress.process.out] #
[stress.process.out] #  Internal Error (/opt/mach5/mesos/work_dir/slaves/d2398cde-9325-49c3-b030-8961a4f0a253-S121083/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/05e968ad-8bba-478a-b05e-31e05d88d04b/runs/bd4451c0-744f-425d-8c59-e0b08306739d/workspace/open/src/hotspot/cpu/x86/frame_x86.cpp:461), pid=28961, tid=32196
[stress.process.out] #  assert(nm->insts_contains_inclusive(original_pc)) failed: original PC must be in the main code section of the compiled method (or must be immediately following it) original_pc: 0x0000000000000000 unextended_sp: 0x00007f22971f0690 name: nmethod
[stress.process.out] #
[stress.process.out] # JRE version: Java(TM) SE Runtime Environment (25.0+26) (fastdebug build 25-ea+26-LTS-3324)
[stress.process.out] # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 25-ea+26-LTS-3324, compiled mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
[stress.process.out] # Problematic frame:
[stress.process.out] # V  [libjvm.so+0xd8ad99]  frame::verify_deopt_original_pc(nmethod*, long*)+0xa9
[stress.process.out] #
[stress.process.out] # Core dump will be written. Default location: Core dumps may be processed with "/opt/core.sh %p" (or dumping to /opt/mach5/mesos/work_dir/slaves/d2398cde-9325-49c3-b030-8961a4f0a253-S383720/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/055a71ea-6d83-4807-967f-2f9d68766275/runs/c3f550b3-9d8e-4be1-9b4e-b5256ebda3fc/testoutput/test-support/jtreg_closed_test_hotspot_jtreg_applications_runthese_RunThese30M_java/scratch/0/core.28961)
[stress.process.out] #
[stress.process.out] # JFR recording file will be written. Location: /opt/mach5/mesos/work_dir/slaves/d2398cde-9325-49c3-b030-8961a4f0a253-S383720/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/055a71ea-6d83-4807-967f-2f9d68766275/runs/c3f550b3-9d8e-4be1-9b4e-b5256ebda3fc/testoutput/test-support/jtreg_closed_test_hotspot_jtreg_applications_runthese_RunThese30M_java/scratch/0/hs_err_pid28961.jfr
[stress.process.out] #
[stress.process.out] Unsupported internal testing APIs have been used.
[stress.process.out] 
[stress.process.out] # An error report file with more information is saved as:
[stress.process.out] # /opt/mach5/mesos/work_dir/slaves/d2398cde-9325-49c3-b030-8961a4f0a253-S383720/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/055a71ea-6d83-4807-967f-2f9d68766275/runs/c3f550b3-9d8e-4be1-9b4e-b5256ebda3fc/testoutput/test-support/jtreg_closed_test_hotspot_jtreg_applications_runthese_RunThese30M_java/scratch/0/hs_err_pid28961.log
[stress.process.out] [thread 32074 also had an error]
[stress.process.out] [thread 31585 also had an error]
[stress.process.out] [thread 32141 also had an error]
[stress.process.out] [thread 32189 also had an error]
[stress.process.out] #
[stress.process.out] # If you would like to submit a bug report, please visit:
[stress.process.out] #   https://bugreport.java.com/bugreport/crash.jsp
[stress.process.out] #
[2025-06-05T15:19:58.205976017Z] Gathering output for process 32283
[2025-06-05T15:19:58.424917669Z] Waiting for completion for process 32283
[2025-06-05T15:19:58.425021517Z] Waiting for completion finished for process 32283
[2025-06-05T15:20:11.727607373Z] Gathering output for process 32349
[2025-06-05T15:20:11.739110122Z] Waiting for completion for process 32349
[2025-06-05T15:20:11.739180355Z] Waiting for completion finished for process 32349
[2025-06-05T15:20:11.740999250Z] Gathering output for process 32352
[2025-06-05T15:20:17.925972852Z] Waiting for completion for process 32352
[2025-06-05T15:20:17.926083262Z] Waiting for completion finished for process 32352


Stress process failed. See stress.process.err/stress.process.out files for details.

Here's the crashing thread's stack:

---------------  T H R E A D  ---------------

Current thread (0x00007f21f83f7c10):  JavaThread "Thread-1220" daemon [_thread_in_Java, id=32196, stack(0x00007f22970f2000,0x00007f22971f2000) (1024K)]

Stack: [0x00007f22970f2000,0x00007f22971f2000],  sp=0x00007f22971ef6e8,  free space=1013k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xd8ad99]  frame::verify_deopt_original_pc(nmethod*, long*)+0xa9  (frame_x86.cpp:461)
V  [libjvm.so+0x1762bea]  os::get_sender_for_C_frame(frame*)+0x6a  (frame_x86.inline.hpp:114)
V  [libjvm.so+0x1767942]  os::get_native_stack(unsigned char**, int, int)+0x222  (os_posix.cpp:251)
V  [libjvm.so+0x6517b5]  AllocateHeap(unsigned long, MemTag, AllocFailStrategy::AllocFailEnum)+0x75  (allocation.cpp:50)
V  [libjvm.so+0x18fd8f9]  SharedRuntime::handle_unsafe_access(JavaThread*, unsigned char*)+0x39  (allocation.hpp:127)
V  [libjvm.so+0x17632d4]  PosixSignals::pd_hotspot_signal_handler(int, siginfo*, ucontext*, JavaThread*)+0x344  (os_linux_x86.cpp:284)
V  [libjvm.so+0x192aba1]  JVM_handle_linux_signal+0x1d1  (signals_posix.cpp:642)
C  [libc.so.6+0x36400]

[error occurred during error reporting (printing native stack (with source info)), id 0xe0000000, Internal Error (/opt/mach5/mesos/work_dir/slaves/d2398cde-9325-49c3-b030-8961a4f0a253-S121083/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/05e968ad-8bba-478a-b05e-31e05d88d04b/runs/bd4451c0-744f-425d-8c59-e0b08306739d/workspace/open/src/hotspot/cpu/x86/frame_x86.cpp:461)]

Retrying call stack printing without source information...
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xd8ad99]  frame::verify_deopt_original_pc(nmethod*, long*)+0xa9  (frame_x86.cpp:461)
V  [libjvm.so+0x1762bea]  os::get_sender_for_C_frame(frame*)+0x6a
V  [libjvm.so+0x1767942]  os::get_native_stack(unsigned char**, int, int)+0x222
V  [libjvm.so+0x6517b5]  AllocateHeap(unsigned long, MemTag, AllocFailStrategy::AllocFailEnum)+0x75
V  [libjvm.so+0x18fd8f9]  SharedRuntime::handle_unsafe_access(JavaThread*, unsigned char*)+0x39
V  [libjvm.so+0x17632d4]  PosixSignals::pd_hotspot_signal_handler(int, siginfo*, ucontext*, JavaThread*)+0x344
V  [libjvm.so+0x192aba1]  JVM_handle_linux_signal+0x1d1
C  [libc.so.6+0x36400]

[error occurred during error reporting (retry printing native stack (no source info)), id 0xe0000000, Internal Error (/opt/mach5/mesos/work_dir/slaves/d2398cde-9325-49c3-b030-8961a4f0a253-S121083/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/05e968ad-8bba-478a-b05e-31e05d88d04b/runs/bd4451c0-744f-425d-8c59-e0b08306739d/workspace/open/src/hotspot/cpu/x86/frame_x86.cpp:461)]

This failure mode appears to be a close match for the following
issue that was closed as a duplicate:

JDK-8351028 RunThese30M: assert(nm->insts_contains_inclusive(original_pc)) failed: original PC must be in the main code section of the compiled method (or must be immediately following it)

However, this failure did not happen with Graal so I'm opening a new issue
as requested in JDK-8351028:

https://bugs.openjdk.org/browse/JDK-8351028?focusedId=14777915&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14777915

Tobias Hartmann added a comment - 2025-05-08 03:04 - Restricted to Confidential - edited
Thanks for checking [~never]. The attached hs_err file is actually from a failure that we don't see anymore, I think. The latest failures all have JFR on the stack trace. I attached a corresponding file (hs_err_pid17384.log).

https://mach5.us.oracle.com/mdash/testHistory?search=status%3Afailed%20AND%20reasons.details%3A*compiled*method*or*must*be*immediately*following*

Update: Ah, I see that https://github.com/openjdk/jdk/commit/90f0f1b88badbf1f72d7b9434621457aa47cde30 was pushed very recently. So yes, if that's really the root cause, let's close as duplicate and open a new bug if we observe a non-Graal issue again.
Comments
I think my first attempt at a fix, to make the stack walk smarter, was misguided. A better fix is to not call new or malloc from a signal handler at all. Currently we use handshakes with an async closure to throw the actual unsafe access exception, but the async closure and the associated async handshake op are allocated with new. We could try extending handshakes to allow pre-allocated async closures/ops, perhaps attached to the Thread, that get recycled instead of deleted, but I don't think we have time left in 26 for that. I think we should defer this to 27.
26-11-2025

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk/pull/27985 Date: 2025-10-24 23:09:20 +0000
24-10-2025

On a linux x64 fastdebug build, with inlining and an NMT_TrackingStackDepth of 4, os::get_native_stack does try to walk past JVM_handle_linux_signal to its caller. That caller is a special frame set up by the kernel to return to the saved context via a rt_sigreturn syscall rather than a regular function return, so it is unsafe to walk past this frame. We have various checks like os::is_first_C_frame() to stop the stack walk, but they can be fooled by random values on the stack, since the "caller" of the signal handler is not a regular frame.
18-10-2025

We can trigger this assertion sometimes on linux x86_64 when using async-profiler with the jaxp and langtools jtreg tests:

# Internal Error (/priv/jenkins/client-home/workspace/openjdk-jdk-dev-linux_x86_64-dbg/jdk/src/hotspot/cpu/x86/frame_x86.cpp:461), pid=41016, tid=41020
# assert(nm->insts_contains_inclusive(original_pc)) failed: original PC must be in the main code section of the compiled method (or must be immediately following it) original_pc: 0x00007f21b3bc5880 unextended_sp: 0x00007f21b3bc5850 name: native nmethod
#
# JRE version: OpenJDK Runtime Environment (26.0) (fastdebug build 26-internal-adhoc.sapmachine.jdk)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-adhoc.sapmachine.jdk, mixed mode, sharing, tiered, compressed oops, compact obj headers, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xdda6b6]  frame::verify_deopt_original_pc(nmethod*, long*)+0xa6
#

---------------  T H R E A D  ---------------

Current thread (0x00007f21ac7ddc40):  JavaThread "main" [_thread_in_Java, id=41020, stack(0x00007f21b3ac9000,0x00007f21b3bc9000) (1024K)]

Stack: [0x00007f21b3ac9000,0x00007f21b3bc9000],  sp=0x00007f21b3bc4b28,  free space=1006k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xdda6b6]  frame::verify_deopt_original_pc(nmethod*, long*)+0xa6  (frame_x86.cpp:461)
V  [libjvm.so+0x10e7be4]  JavaThread::pd_get_top_frame(frame*, void*, bool) [clone .part.0]+0xd4  (frame_x86.inline.hpp:114)
V  [libjvm.so+0xdcc262]  AsyncGetCallTrace+0x202  (forte.cpp:656)
C  [libasyncProfiler.so+0x2157a]  Profiler::getJavaTraceAsync(void*, ASGCT_CallFrame*, int, StackContext*)+0x3fa  (profiler.cpp:422)
C  [libasyncProfiler.so+0x3f218]  Profiler::recordSample(void*, unsigned long long, EventType, Event*)+0x318  (profiler.cpp:659)
C  [libasyncProfiler.so+0x41092]  WallClock::signalHandler(int, siginfo_t*, void*)+0x132  (wallClock.cpp:134)
C  [libc.so.6+0x42520]
V  [libjvm.so+0x103e0c7]  InstanceKlass::uncached_lookup_method(Symbol const*, Symbol const*, Klass::OverpassLookupMode, Klass::PrivateLookupMode) const+0x77  (instanceKlass.cpp:2128)
C  [ld-linux-x86-64.so.2+0x14d0c]

Registers:
RAX=0x00007f21b6615000, RBX=0x00007f219bdbd608, RCX=0x00007f21b5cf08e8, RDX=0x00007f21b5cf0990
RSP=0x00007f21b3bc4b28, RBP=0x00007f21b3bc4b80, RSI=0x00000000000001cd, RDI=0x00007f21b5cf0618
R8 =0x00007f21b3bc5880, R9 =0x00007f21b3bc5850, R10=0x00007f21b2dc9000, R11=0x00007f21ae6df6b0
R12=0x00007f21b3bc5850, R13=0x00007f21ac7ddc40, R14=0x00007f21b3bc4ba0, R15=0x00007f21b62f7b64
RIP=0x00007f21b4a8f6b6, EFLAGS=0x0000000000010216, CSGSFS=0x002b000000000033, ERR=0x0000000000000006
TRAPNO=0x000000000000000e
12-08-2025

[~dlong] yes, you are right. We should not be executing `thread->set_pending_unsafe_access_error()` in a signal-handling context, as it does a lot of things that are not signal safe.
10-06-2025

[~dholmes], yes, that's normally how we do it, but SharedRuntime::handle_unsafe_access() is not following this pattern. Instead, it calls thread->set_pending_unsafe_access_error() from the signal handler context and returns next_pc as the "stub", effectively skipping over the instruction that caused the signal. There is no unsafe_access_handler stub that we redirect to upon returning from the signal handler.
09-06-2025

[~dlong] doesn't the signal handler simply update the return pc based on the selected stub (in this case the unsafe_access_handler stub), such that we are no longer executing in the signal handling context?

if (stub != nullptr) {
  // save all thread context in case we need to restore it
  if (thread != nullptr) thread->set_saved_exception_pc(pc);
  os::Posix::ucontext_set_pc(uc, stub);
  return true;
}
08-06-2025

Using this filter https://mach5.us.oracle.com/mdash/testHistory?search=status%3Afailed%20AND%20reasons.details%3A*compiled*method*or*must*be*immediately*following*%20AND%20!products.JDK.vmOptions%3A*UseGraalJIT* I found an earlier crash from April 10.
07-06-2025

ILW = assert in debug build, seen twice so far, turn off NMT = MMM = P3
07-06-2025

I suspect the immediate problem is that the NMT backtrace gets confused when it hits a signal handler frame. Should the backtrace stop at that point, or skip the signal handler frame somehow? But maybe the larger issue is the fact that handle_unsafe_access() is called from a signal handler but tries to do things like allocate memory and run handshakes, which seems unsafe (no pun intended) to me.
06-06-2025