JDK-8247691 : [aarch64] Incorrect handling of VM exceptions in C1 deopt stub/traps
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 11,15
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • CPU: aarch64
  • Submitted: 2020-06-16
  • Updated: 2024-02-06
  • Resolved: 2020-09-26
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 11 JDK 16 JDK 8
11.0.11-oracleFixed 16 b18Fixed 8u411Fixed
Related Reports
Relates :  
Description
Crash in test/hotspot/jtreg/runtime/Thread/StopAtExit.java

---------------  T H R E A D  ---------------

Current thread (0x0000fffe2802fa00):  JavaThread "Thread-2" [_thread_in_Java, id=28188, stack(0x0000fffe647e0000,0x0000fffe649e0000)]

Stack: [0x0000fffe647e0000,0x0000fffe649e0000],  sp=0x0000fffe649dd280,  free space=2036k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xa17358]  ScopeDesc::objects()+0x0
V  [libjvm.so+0x4c1350]  Deoptimization::fetch_unroll_info(JavaThread*, int)+0x20
v  ~DeoptimizationBlob
v  ~RuntimeStub::load_mirror_patching Runtime1 stub
C  0x0000fffe649de860

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
v  ~DeoptimizationBlob
v  ~RuntimeStub::load_mirror_patching Runtime1 stub
C  0x0000fffe649de4b0
C  0x0000000000000000
V  [libjvm.so+0x49c140]  Thread::in_retryable_allocation() const+0x0
C  0xd65f03c052800000

[error occurred during error reporting (printing Java stack), id 0xb, SIGSEGV (0xb) at pc=0x0000fffea05aea40]


siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000028

Comments
Fix Request (Revoked: Need different solution for 11u. See below.) Should get backported for parity with 11.0.11-oracle. Doesn't apply cleanly. Review thread: http://mail.openjdk.java.net/pipermail/jdk-updates-dev/2021-January/004691.html
01-03-2021

I have checked the jdk16 code. The problem doesn't occur because revoke_biases_of_monitors is no longer used in Deoptimization::deoptimize (both, BiasedLocking and Deoptimization were significantly changed). But this change can't get directly applied to 11u: Calling deoptimize_frame while _thread_in_Java is not allowed. I guess Runtime1::patch_code could get rewritten to use JRT_ENTRY_NO_ASYNC for the backport.
22-01-2021

Hi Andrew, no, I'm saying that we call deoptimize_frame while the thread is in state _thread_in_Java *** WITH *** this patch: JavaThread "main" [_thread_in_Java, ... I guess this is also problematic in jdk16 where the original patch was integrated, especially when BiasedLocking gets enabled.
22-01-2021

Hi Martin, I don't really understand this comment. Are you saying that in JDK 11, without this patch, we call deoptimize_frame while the thread is in state _thread_in_Java?
22-01-2021

Reproduction with jdk11u-dev: - Apply the patch http://cr.openjdk.java.net/~mdoerr/8247691_aarch64_11u/webrev.00/jdk11u-dev.changeset - Configure fastdebug build - make images Assertion showed up while "Creating images".
22-01-2021

I have observed it once during make images. I'll make more experiments with 11u. Nevertheless, it looks weird to call deoptimize_frame while the thread is in state _thread_in_Java. I guess this is the only place where that is done. If async exceptions are the problem, shouldn't JRT_ENTRY_NO_ASYNC be used?
21-01-2021

While testing in 11u, I got: # Internal Error (/usr/work/openjdk/nb/linuxaarch64/nightly/jdk11u-dev/src/hotspot/share/runtime/thread.cpp:995), pid=27513, tid=27522 # fatal error: LEAF method calling lock? V [libjvm.so+0x8b64e0] report_fatal(char const*, int, char const*, ...)+0x108 V [libjvm.so+0x1474fd4] Thread::check_for_valid_safepoint_state(bool)+0xbc V [libjvm.so+0x1589900] VMThread::execute(VM_Operation*)+0xc0 V [libjvm.so+0x465908] BiasedLocking::revoke(GrowableArray<Handle>*)+0xa0 V [libjvm.so+0x8dd70c] Deoptimization::revoke_biases_of_monitors(JavaThread*, frame, RegisterMap*)+0x34c V [libjvm.so+0x8de0a8] Deoptimization::deoptimize(JavaThread*, frame, RegisterMap*, Deoptimization::DeoptReason)+0x168 V [libjvm.so+0x8de538] Deoptimization::deoptimize_frame_internal(JavaThread*, long*, Deoptimization::DeoptReason)+0x118 V [libjvm.so+0x8de6fc] Deoptimization::deoptimize_frame(JavaThread*, long*)+0x6c V [libjvm.so+0x5ca7c8] Runtime1::patch_code(JavaThread*, Runtime1::StubID)+0xe0 V [libjvm.so+0x5ca9b8] Runtime1::move_mirror_patching(JavaThread*)+0x38 v ~RuntimeStub::load_mirror_patching Runtime1 stub C 0x0000ffffa02a9e9e Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) v ~RuntimeStub::load_mirror_patching Runtime1 stub J 412 c1 java.io.BufferedReader.readLine(Z)Ljava/lang/String; java.base (304 bytes) @ 0x0000ffffa0fe96e8 [0x0000ffffa0fe9440+0x00000000000002a8] Seems like original change is broken with BiasedLocking which was not noticed because it’s off in JDK15 and later. Can anybody verify this change with BiasedLocking in head revision, please?
21-01-2021

Can you provide a reproducer?
21-01-2021

Changeset: 7817963c Author: Patric Hedlin <phedlin@openjdk.org> Date: 2020-09-26 18:24:11 +0000 URL: https://git.openjdk.java.net/jdk/commit/7817963c
26-09-2020

Let me try to rephrase that a bit. The patching stub-code we are looking at (which has nothing to do with patching really since we do not patch on Aarch64), should be simplified NOT to enable VM transition in the 'target' code, i.e. we should not allow nor require that 'patch_code' is a JRT_ENTRY (and yes, we should rename these routines NOT to use the word, patch/ing). We might want to check/assert that no such events have occurred while invoking 'target' though. So, while your patch does correct the case with the broken link register, we should not have to worry about any exceptions after the 'target' call. Does this make sense to you?
23-09-2020

I think the above fix is not the correct one since we should not need to make/enable a VM transition when calling 'target'. Instead (as in the lines just below your change) we could just make the stub-code explicitly check that there are no pending exceptions. If a VM transition was intended the code does need to handle pending exceptions. Such is the contract. However, in this case we will always deoptimize so any transition needed will be dealt with at/after that point (by the interpreter). (Removed sample code since I'm forgetting the need for explicit st/ld of fp&lr.)
23-09-2020

The root cause seems to be the implementation of patch_code (in src/hotspot/share/c1/c1_Runtime1.cpp), which is modelled as a JRT_ENTRY (but should not need to be a JRT_ENTRY). Possibly not invoked accordingly. JRT_ENTRY(void, Runtime1::patch_code(JavaThread* thread, Runtime1::StubID stub_id)) (This /could be/is/ a potential source to numerous issues.) Update: Invocation stub overwrites the link register used in exception handling code.
23-09-2020

It looks to me that this issue is not related to pending exceptions. Here, the aarch64 LR register has been destroyed by this call: 540 // do the call 541 __ lea(rscratch1, RuntimeAddress(target)); 542 __ blr(rscratch1); 543 __ bind(retaddr); According to the aarch64 reference manual, LR register contains "target" after the blr instruction. That means LR register does not contains the throwing pc now at the line of my change. But what we need to put in register r3 is the throwing pc which is available on the stack (located at : rfp + wordSize) So my change loads from this address and put the throwing pc into register r3. Hope that explains my proposed fix.
23-09-2020

Hi, we encountered the same bug in our local test and has a fix for it. We can reproduced this with jdk16, jdk11 and 8u-aarch64 port. Please check if the following fix works for you. Thanks. diff --git a/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp b/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp index a356cc2cbd7..93422d33853 100644 --- a/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp +++ b/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp @@ -579,7 +579,8 @@ OopMapSet* Runtime1::generate_patching(StubAssembler* sasm, address target) { __ verify_not_null_oop(r0); // load throwing pc: this is the return address of the stub - __ mov(r3, lr); + // Note that lr register has been destroyed by the call. + __ ldr(r3, Address(rfp, wordSize)); #ifdef ASSERT // check that fields in JavaThread for exception oop and issuing pc are empty
22-09-2020

The part (of the stack) where the deopt is on-top the patching stub looks all wrong. v ~DeoptimizationBlob v ~RuntimeStub::load_mirror_patching Runtime1 stub
21-09-2020