JDK-8194652 : VMError::print_native_stack() is missing an os::is_first_C_frame() check
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 9,10,11
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: os_x,solaris_11
  • CPU: x86_64
  • Submitted: 2018-01-04
  • Updated: 2018-02-16
  • Resolved: 2018-01-08
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 11
11 b01Fixed
Related Reports
Relates :  
Description
While working on some stack walking code, I happened
to notice that the VMError::print_native_stack() function
is missing an os::is_first_C_frame() call before one of the
os::get_sender_for_C_frame() calls.

Because of this omission, an hs_err_pid file will sometimes
show a stack like this:

---------------  T H R E A D  ---------------

Current thread (0x00007f83f3801800):  JavaThread "main" [_thread_in_vm, id=4099, stack(0x000000010aea3000,0x000000010afa3000)]

Stack: [0x000000010aea3000,0x000000010afa3000],  sp=0x000000010afa2950,  free space=1022k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.dylib+0x116ec55]  crash_with_segfault()+0x15
V  [libjvm.dylib+0x116c96d]  VMError::controlled_crash(int)+0x56d
V  [libjvm.dylib+0x116ec37]  VMError::test_error_handler()+0x17
V  [libjvm.dylib+0xad41ba]  JNI_CreateJavaVM_inner(JavaVM_**, void**, void*)+0x39a
V  [libjvm.dylib+0xad3e0c]  JNI_CreateJavaVM+0x2c
C  [java+0x6ac8]  InitializeJVM+0x138
C  [java+0x5c3c]  JavaMain+0xac
C  [libsystem_pthread.dylib+0x405a]  _pthread_body+0x83
C  [libsystem_pthread.dylib+0x3fd7]  _pthread_body+0x0
C  [libsystem_pthread.dylib+0x13ed]  thread_start+0xd

[error occurred during error reporting (printing native stack), id 0xb]


siginfo: ...

when it should show a stack like this:

---------------  T H R E A D  ---------------

Current thread (0x00007f8ca2005000):  JavaThread "main" [_thread_in_vm, id=4099, stack(0x00000001059d0000,0x0000000105ad0000)]

Stack: [0x00000001059d0000,0x0000000105ad0000],  sp=0x0000000105acf950,  free space=1022k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.dylib+0x116ec55]  crash_with_segfault()+0x15
V  [libjvm.dylib+0x116c96d]  VMError::controlled_crash(int)+0x56d
V  [libjvm.dylib+0x116ec37]  VMError::test_error_handler()+0x17
V  [libjvm.dylib+0xad41ba]  JNI_CreateJavaVM_inner(JavaVM_**, void**, void*)+0x39a
V  [libjvm.dylib+0xad3e0c]  JNI_CreateJavaVM+0x2c
C  [java+0x6ac8]  InitializeJVM+0x138
C  [java+0x5c3c]  JavaMain+0xac
C  [libsystem_pthread.dylib+0x405a]  _pthread_body+0x83
C  [libsystem_pthread.dylib+0x3fd7]  _pthread_body+0x0
C  [libsystem_pthread.dylib+0x13ed]  thread_start+0xd


siginfo: ....


The "[error occurred during error reporting (printing native stack), id 0xb]"
line shouldn't be there and if VMError::print_native_stack() is called
outside of the error handler context, then you would see that the
"error occurred during error reporting" is an assertion failure:

frame #16: 0x000000010af69c0e libjvm.dylib`frame::init(this=0x000000012d876f20, sp=0x000000012d879f88, fp=0x0000000000000000, pc=0x0000000000000000) + 142 at frame_x86.inline.hpp:49
   46  	  _unextended_sp = sp;
   47  	  _fp = fp;
   48  	  _pc = pc;
-> 49  	  assert(pc != NULL, "no pc?");
Comments
Ran the new test without the fix through Mach5 and examined the resulting hs_err_pid files. The bug does not reproduce on: Linux-X64-debug Solaris SPARCv9-debug Win-X64-debug The bug does reproduce on OSX-debug in Mach5. I've also reproduced this bug on OSX and Solaris X64-debug in my lab. As luck would have it, OSX and Solaris X64 are the two dev environments that I happen to have running in my lab at the moment.
05-01-2018

The fix is pretty simple: $ hg diff src/hotspot/share/utilities/vmError.cpp diff -r abf1d797e380 src/hotspot/share/utilities/vmError.cpp --- a/src/hotspot/share/utilities/vmError.cpp Wed Jan 03 17:29:20 2018 +0000 +++ b/src/hotspot/share/utilities/vmError.cpp Thu Jan 04 17:26:13 2018 -0500 @@ -243,11 +243,12 @@ RegisterMap map((JavaThread*)t, false); // No update fr = fr.sender(&map); } else { + // is_first_C_frame() does only simple checks for frame pointer, + // it will pass if java compiled code has a pointer in EBP. + if (os::is_first_C_frame(&fr)) break; fr = os::get_sender_for_C_frame(&fr); } } else { - // is_first_C_frame() does only simple checks for frame pointer, - // it will pass if java compiled code has a pointer in EBP. if (os::is_first_C_frame(&fr)) break; fr = os::get_sender_for_C_frame(&fr); } I'm also adding a new test to detect the problem: test/hotspot/jtreg/runtime/ErrorHandling/BadNativeStackInErrorHandlingTest.java
04-01-2018

It took some spelunking, but I tracked the origin of the code back to: $ hg log -r 23204 changeset: 23204:e5c4480df562 user: kvn date: Wed Mar 05 16:21:22 2014 -0800 summary: 8035983: Fix "Native frames:" in crash report (hs_err file) $ hg diff -r 23203 -r 23204 hotspot/src/share/vm/utilities/vmError.cpp diff -r 76c2a23e2537 -r e5c4480df562 hotspot/src/share/vm/utilities/vmError.cpp --- a/hotspot/src/share/vm/utilities/vmError.cpp Wed Mar 05 15:14:25 2014 +0100 +++ b/hotspot/src/share/vm/utilities/vmError.cpp Wed Mar 05 16:21:22 2014 -0800 @@ -592,13 +592,24 @@ st->cr(); // Compiled code may use EBP register on x86 so it looks like // non-walkable C frame. Use frame.sender() for java frames. - if (_thread && _thread->is_Java_thread() && fr.is_java_frame()) { - RegisterMap map((JavaThread*)_thread, false); // No update - fr = fr.sender(&map); - continue; + if (_thread && _thread->is_Java_thread()) { + // Catch very first native frame by using stack address. + // For JavaThread stack_base and stack_size should be set. + if (!_thread->on_local_stack((address)(fr.sender_sp() + 1))) { + break; + } + if (fr.is_java_frame()) { + RegisterMap map((JavaThread*)_thread, false); // No update + fr = fr.sender(&map); + } else { + fr = os::get_sender_for_C_frame(&fr); + } + } else { + // is_first_C_frame() does only simple checks for frame pointer, + // it will pass if java compiled code has a pointer in EBP. + if (os::is_first_C_frame(&fr)) break; + fr = os::get_sender_for_C_frame(&fr); } - if (os::is_first_C_frame(&fr)) break; - fr = os::get_sender_for_C_frame(&fr); } if (count > StackPrintLimit) { The fix for JDK-8035983 added an earlier os::get_sender_for_C_frame() call, but did not duplicate the "if (os::is_first_C_frame(&fr)) break;" check.
04-01-2018