JDK-8265984 : Concurrent GC: Some tests fail "assert(is_frame_safe(f)) failed: Frame must be safe"
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 17
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: x86_64
  • Submitted: 2021-04-26
  • Updated: 2021-06-15
  • Resolved: 2021-05-03
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 17
17 b21Fixed
Related Reports
Duplicate :  
Relates :  
Relates :  
Description
The following test failed in the JDK17 CI:

applications/runthese/RunThese30M.java

Here's a snippet from the log file:

[stress.process.out] Verifying the constraints on the serial fields
[stress.process.out] ConstructorTest0003 passed
[2021-04-26T12:11:07.400644200Z] Gathering output for process 50396
# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc:  SuppressErrorAt=\\stackWatermark.cpp:179
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (t:\\workspace\\open\\src\\hotspot\\share\\runtime\\stackWatermark.cpp:179), pid=28660, tid=47916
#  assert(is_frame_safe(f)) failed: Frame must be safe
#
# JRE version: Java(TM) SE Runtime Environment (17.0+20) (fastdebug build 17-ea+20-LTS-1686)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 17-ea+20-LTS-1686, mixed mode, sharing, tiered, compressed class ptrs, z gc, windows-amd64)
# Core dump will be written. Default location: T:\\testoutput\\test-support\\jtreg_closed_test_hotspot_jtreg_applications_runthese_RunThese30M_java\\scratch\\0\\hs_err_pid28660.mdmp
#
# An error report file with more information is saved as:
# T:\\testoutput\\test-support\\jtreg_closed_test_hotspot_jtreg_applications_runthese_RunThese30M_java\\scratch\\0\\hs_err_pid28660.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#
----------System.err:(1200/94003)*----------


Here's the crashing thread's stack:

---------------  T H R E A D  ---------------

Current thread (0x0000014c4836bc70):  JavaThread "logback-appender-logstash-logstash-prod.s0.javaplatfo1iad.oraclevcn.com:10074-4" daemon [_thread_in_Java, id=47916, stack(0x00000032adc00000,0x00000032add00000)]

Stack: [0x00000032adc00000,0x00000032add00000]
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [jvm.dll+0xacc441]  os::platform_print_native_stack+0xf1  (os_windows_x86.cpp:235)
V  [jvm.dll+0xcef0d7]  VMError::report+0xfc7  (vmError.cpp:739)
V  [jvm.dll+0xcf0a5e]  VMError::report_and_die+0x7fe  (vmError.cpp:1549)
V  [jvm.dll+0xcf1174]  VMError::report_and_die+0x64  (vmError.cpp:1330)
V  [jvm.dll+0x4ce787]  report_vm_error+0xb7  (debug.cpp:282)
V  [jvm.dll+0xbb4b18]  StackWatermark::assert_is_frame_safe+0xd8  (stackWatermark.cpp:179)
V  [jvm.dll+0xbb5b06]  StackWatermarkSet::after_unwind+0x1a6  (stackWatermarkSet.cpp:93)
V  [jvm.dll+0xb71193]  OptoRuntime::rethrow_C+0x23  (runtime.cpp:1425)
C  0x0000014a2e2b7255

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
v  ~RuntimeStub::_rethrow_Java
J 2043 c1 sun.nio.ch.NioSocketImpl.read([BII)I java.base@17-ea (96 bytes) @ 0x0000014a2eb253f4 [0x0000014a2eb24c80+0x0000000000000774]
J 2042 c1 sun.nio.ch.NioSocketImpl$1.read([BII)I java.base@17-ea (11 bytes) @ 0x0000014a2eb2467c [0x0000014a2eb245e0+0x000000000000009c]
J 4291 c1 java.net.Socket$SocketInputStream.read()I java.base@17-ea (28 bytes) @ 0x0000014a2eda1174 [0x0000014a2eda0f20+0x0000000000000254]
j  net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler$ReaderRunnable.run()V+14
j  java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object;+4 java.base@17-ea
j  java.util.concurrent.FutureTask.run()V+39 java.base@17-ea
j  java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V+28 java.base@17-ea
j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+92 java.base@17-ea
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 java.base@17-ea
j  java.lang.Thread.run()V+11 java.base@17-ea
v  ~StubRoutines::call_stub

I'm starting this bug off in hotspot/gc since the assert failure
is in StackWatermark::assert_is_frame_safe() and this is ZGC.
Comments
The fix for this bug is in jdk-17+21-1828.
07-05-2021

Changeset: 194bceca Author: Per Liden <pliden@openjdk.org> Date: 2021-05-03 13:34:49 +0000 URL: https://git.openjdk.java.net/jdk/commit/194bceca3a4d13d4528b86359ee9d5eead3ce7ac
03-05-2021

Here's hs_err_pid snippet from the jdk-17+21-1799-tier8 sighting: # Internal Error (/opt/mach5/mesos/work_dir/slaves/3c846bae-ce30-4a97-93ee-9fef4497ccb6-S79219/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/d4772d2b-167d-4359-95f0-eed8c685d2ba/runs/ecd72654-4829-47b1-b8cc-267a829ee24b/workspace/open/src/hotspot/share/runtime/stackWatermark.cpp:179), pid=17559, tid=17602 # assert(is_frame_safe(f)) failed: Frame must be safe # # JRE version: Java(TM) SE Runtime Environment (17.0+21) (fastdebug build 17-ea+21-LTS-1799) # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 17-ea+21-LTS-1799, mixed mode, sharing, tiered, compressed class ptrs, z gc, linux-amd64) # Problematic frame: # V [libjvm.so+0x177379f] StackWatermark::assert_is_frame_safe(frame const&)+0x6f --------------- T H R E A D --------------- Current thread (0x00007f5f00186240): JavaThread "logback-appender-logstash-logstash-prod.s0.javaplatfo1iad.oraclevcn.com:10074-4" daemon [_thread_in_Java, id=17602, stack(0x00007f6165af2000,0x00007f6165bf3000)] Stack: [0x00007f6165af2000,0x00007f6165bf3000], sp=0x00007f6165bf0240, free space=1016k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x177379f] StackWatermark::assert_is_frame_safe(frame const&)+0x6f V [libjvm.so+0x17754fb] StackWatermark::after_unwind()+0x1cb V [libjvm.so+0x1774e28] StackWatermarkSet::after_unwind(JavaThread*)+0x38 V [libjvm.so+0x16e7d12] OptoRuntime::rethrow_C(oopDesc*, JavaThread*, unsigned char*)+0x22 v ~RuntimeStub::_rethrow_Java J 1928 c1 sun.nio.ch.NioSocketImpl$1.read([BII)I java.base@17-ea (11 bytes) @ 0x00007f618e14457c [0x00007f618e1444e0+0x000000000000009c] J 4032 c1 java.net.Socket$SocketInputStream.read()I java.base@17-ea (28 bytes) @ 0x00007f618de81d4c [0x00007f618de81b00+0x000000000000024c] j net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler$ReaderRunnable.run()V+14 j java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object;+4 java.base@17-ea j java.util.concurrent.FutureTask.run()V+39 java.base@17-ea j java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V+28 java.base@17-ea j java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+92 java.base@17-ea j java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 java.base@17-ea j java.lang.Thread.run()V+11 java.base@17-ea v ~StubRoutines::call_stub V [libjvm.so+0xe91bb5] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x595 V [libjvm.so+0xe92415] JavaCalls::call_virtual(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x4c5 V [libjvm.so+0xe92897] JavaCalls::call_virtual(JavaValue*, Handle, Klass*, Symbol*, Symbol*, Thread*)+0x77 V [libjvm.so+0x100a0bb] thread_entry(JavaThread*, Thread*)+0x12b V [libjvm.so+0x18b61a1] JavaThread::thread_main_inner()+0x271 V [libjvm.so+0x18bdad0] Thread::call_run()+0x100 V [libjvm.so+0x15949ee] thread_native_entry(Thread*)+0x10e Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) v ~RuntimeStub::_rethrow_Java J 1928 c1 sun.nio.ch.NioSocketImpl$1.read([BII)I java.base@17-ea (11 bytes) @ 0x00007f618e14457c [0x00007f618e1444e0+0x000000000000009c] J 4032 c1 java.net.Socket$SocketInputStream.read()I java.base@17-ea (28 bytes) @ 0x00007f618de81d4c [0x00007f618de81b00+0x000000000000024c] j net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler$ReaderRunnable.run()V+14 j java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object;+4 java.base@17-ea j java.util.concurrent.FutureTask.run()V+39 java.base@17-ea j java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V+28 java.base@17-ea j java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+92 java.base@17-ea j java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 java.base@17-ea j java.lang.Thread.run()V+11 java.base@17-ea v ~StubRoutines::call_stub
02-05-2021

[~pliden] I did some experiments with Shenandoah GC, I added following code to ShenandoahStackwatermark::start_processing_impl() if (_jt->has_last_Java_frame()) { StackFrameStream sfs(_jt, false, false); frame* cur = sfs.current(); assert(cur->cb() == NULL || !cur->cb()->is_exception_stub(), "Caught it!!!"); } The assertion actually fired. Following is stack dump I got: Stack: [0x00007f86be173000,0x00007f86be274000], sp=0x00007f86be26c370, free space=996k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x1868124] ShenandoahStackWatermark::start_processing_impl(void*)+0x324 V [libjvm.so+0x18cc86c] StackWatermark::on_safepoint()+0x6c V [libjvm.so+0x170782a] SafepointMechanism::process_if_requested_slow(JavaThread*)+0x3a V [libjvm.so+0x16f9bf8] OptoRuntime::handle_exception_C_helper(JavaThread*, nmethod*&)+0xd78 V [libjvm.so+0x16fa369] OptoRuntime::handle_exception_C(JavaThread*)+0x59 Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) v ~ExceptionBlob J 6360 c2 com.sun.tools.javac.comp.Resolve.rawInstantiate(Lcom/sun/tools/javac/comp/Env;Lcom/sun/tools/javac/code/Type;Lcom/sun/tools/javac/code/Symbol;Lcom/sun/tools/javac/comp/Attr$ResultInfo;Lcom/sun/tools/javac/util/List;Lcom/sun/tools/javac/util/List;ZZLcom/sun/tools/javac/util/Warner;)Lcom/sun/tools/javac/code/Type; jdk.compiler@17-internal (508 bytes) @ 0x00007f86a96459b0 [0x00007f86a9644840+0x0000000000001170] So this particular thread has ExceptionBlob on top of stack, while the nmethod that threw this exception already popped, so that it is invisible, after the thread got out of safepoint. Disassemble last nmethod (com.sun.tools.javac.comp.Resolve.rawInstantiate) and at the offset 0x00007f86a9644840+0x0000000000001170 = 0x7f86a96459b0 0x00007f86a9645991: mov %r14,%rsi 0x00007f86a9645994: mov 0x38(%rsp),%rdx 0x00007f86a9645999: mov 0x70(%rsp),%rcx 0x00007f86a964599e: mov 0x48(%rsp),%r8 0x00007f86a96459a3: mov 0xe8(%rsp),%rdi 0x00007f86a96459ab: callq 0x00007f86a98192e0 0x00007f86a96459b0: mov 0x70(%rsp),%r10 0x00007f86a96459b5: mov 0x24(%r10),%r11d so, it looks like the exception is thrown from the call to 0x00007f86a98192e0 is public void argumentsAcceptable(com.sun.tools.javac.comp.Env, com.sun.tools.javac.comp.DeferredAttr$DeferredAttrContext, com.sun.tools.javac.util.List, com.sun.tools.javac.util.List, com.sun.tools.javac.util.Warner) [signature (Lcom.sun.tools.javac.comp.Env<Lcom.sun.tools.javac.comp.AttrContext;>;Lcom.sun.tools.javac.comp.DeferredAttr$DeferredAttrContext;Lcom.sun.tools.javac.util.List<Lcom.sun.tools.javac.code.Type;>;Lcom.sun.tools.javac.util.List<Lcom.sun.tools.javac.code.Type;>;Lcom.sun.tools.javac.util.Warner;)V] which is not on stack.
30-04-2021

[~zgu] Hmm, the C2 frame should not have been popped at this point, so something doesn't look right here. I assigned the bug to me and I'll send out the above patch, which will fix the problem for now by just reverting to the old behavior. However, I'll also create a new bug so that we can track this and look into the underlying issue at some point later.
30-04-2021

[~pliden] The C2 runtime call may come from a C2 frame that is already popped, so that it is not visible. From what I saw, stack watermark missed caller frame, as in is_frame_safe(), it has f.sp() == _iterator->caller(). I could not figure out root cause, would you mind taking over? Thanks.
29-04-2021

A workaround for now might be to just do something like: --- a/src/hotspot/share/opto/runtime.cpp +++ b/src/hotspot/share/opto/runtime.cpp @@ -1377,7 +1377,10 @@ address OptoRuntime::handle_exception_C(JavaThread* current) { // deoptimized frame if (nm != NULL) { - RegisterMap map(current, false /* update_map */, false /* process_frames */); + // Enable WXWrite: the function called directly by compiled code. + MACOS_AARCH64_ONLY(ThreadWXEnable wx(WXWrite, current)); + + RegisterMap map(current, false /* update_map */, true /* process_frames */); frame caller = current->last_frame().sender(&map); #ifdef ASSERT assert(caller.is_compiled_frame(), "must be");
29-04-2021

One strange thing here is that a C1 compiled method calls into C2's OptoRuntime::rethrow_C.
29-04-2021

[~zgu] I agree that change could be the one triggering this, but process_frames should be false here since we're not touching oops. It seems to me that this just happened to have uncovered a different/pre-existing bug, where we're not processing frames as we should but because we used to do it here we got away with it. Letting process_frames be false only for macos/aarch64 would not be the correct solution here, since macos/aarch64 would still have the bug.
29-04-2021

Shenandoah nightly test sees the similar problem quite often lately on Linux x86_64. The suspicious change seems came from JDK-8265702: --- a/src/hotspot/share/opto/runtime.cpp +++ b/src/hotspot/share/opto/runtime.cpp @@ -1377,7 +1377,7 @@ address OptoRuntime::handle_exception_C(JavaThread* current) { // deoptimized frame if (nm != NULL) { - RegisterMap map(current, false); + RegisterMap map(current, false /* update_map */, false /* process_frames */); frame caller = current->last_frame().sender(&map); #ifdef ASSERT assert(caller.is_compiled_frame(), "must be"); I wonder if this change should only apply to macOS/aarch64.
29-04-2021

Here the crashing stack for the jdk-17+20-1712-tier3 sighting: --------------- T H R E A D --------------- Current thread (0x00007f65cc219e00): JavaThread "logback-appender-logstash-logstash-prod.s0.javaplatfo1iad.oraclevcn.com:10074-4" daemon [_thread_in_Java, id=743859, stack(0x00007f66284ef000,0x00007f66285f0000)] Stack: [0x00007f66284ef000,0x00007f66285f0000], sp=0x00007f66285ed260, free space=1016k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [libjvm.so+0x1772ddf] StackWatermark::assert_is_frame_safe(frame const&)+0x6f V [libjvm.so+0x1774b3b] StackWatermark::after_unwind()+0x1cb V [libjvm.so+0x1774468] StackWatermarkSet::after_unwind(JavaThread*)+0x38 V [libjvm.so+0x16e7442] OptoRuntime::rethrow_C(oopDesc*, JavaThread*, unsigned char*)+0x22 v ~RuntimeStub::_rethrow_Java J 3905 c1 java.net.Socket$SocketInputStream.read()I java.base@17-ea (28 bytes) @ 0x00007f68498231cc [0x00007f6849822f80+0x000000000000024c] j net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler$ReaderRunnable.run()V+14 j java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object;+4 java.base@17-ea j java.util.concurrent.FutureTask.run()V+39 java.base@17-ea j java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V+28 java.base@17-ea j java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+92 java.base@17-ea j java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 java.base@17-ea j java.lang.Thread.run()V+11 java.base@17-ea v ~StubRoutines::call_stub V [libjvm.so+0xe91805] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x595 V [libjvm.so+0xe92065] JavaCalls::call_virtual(JavaValue*, Klass*, Symbol*, Symbol*, JavaCallArguments*, Thread*)+0x4c5 V [libjvm.so+0xe924e7] JavaCalls::call_virtual(JavaValue*, Handle, Klass*, Symbol*, Symbol*, Thread*)+0x77 V [libjvm.so+0x1009a5b] thread_entry(JavaThread*, Thread*)+0x12b V [libjvm.so+0x18b5d41] JavaThread::thread_main_inner()+0x271 V [libjvm.so+0x18bd780] Thread::call_run()+0x100 V [libjvm.so+0x159416e] thread_native_entry(Thread*)+0x10e Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) v ~RuntimeStub::_rethrow_Java J 3905 c1 java.net.Socket$SocketInputStream.read()I java.base@17-ea (28 bytes) @ 0x00007f68498231cc [0x00007f6849822f80+0x000000000000024c] j net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler$ReaderRunnable.run()V+14 j java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object;+4 java.base@17-ea j java.util.concurrent.FutureTask.run()V+39 java.base@17-ea j java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V+28 java.base@17-ea j java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+92 java.base@17-ea j java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 java.base@17-ea j java.lang.Thread.run()V+11 java.base@17-ea v ~StubRoutines::call_stub
27-04-2021