Bug ID: JDK-8173699 Crash during deoptimization with "assert(result == __null || result->is

Type: Bug
Component: hotspot
Sub-Component: compiler
Affected Version: 9,10

Priority: P2
Status: Closed
Resolution: Fixed

Submitted: 2017-01-31
Updated: 2017-08-07
Resolved: 2017-02-03

JDK 10: Fixed in 10
JDK 9: Fixed in 9 b159

#  Internal Error (/oracle/8173373/hotspot/src/share/vm/runtime/deoptimization.cpp:231), pid=10386, tid=10405
#  assert(result == __null || result->is_oop()) failed: must be oop

Current CompileTask:
JVMCI:  55566 15288    b  4       java.lang.invoke.MethodHandles$Lookup::checkAccess (226 bytes)

Stack: [0x00007f7ad007b000,0x00007f7ad017c000],  sp=0x00007f7ad0174470,  free space=997k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x15b4a5f]  VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x18f
V  [libjvm.so+0x15b585a]  VMError::report_and_die(Thread*, char const*, int, char const*, char const*, __va_list_tag*)+0x4a
V  [libjvm.so+0xa446da]  report_vm_error(char const*, int, char const*, char const*, ...)+0xea
V  [libjvm.so+0xa6eea0]  Deoptimization::fetch_unroll_info_helper(JavaThread*, int)+0x20a0
V  [libjvm.so+0xa6f47e]  Deoptimization::fetch_unroll_info(JavaThread*, int)+0x8e
v  ~DeoptimizationBlob
J 15177 jvmci java.lang.invoke.MemberName$Factory.resolve(BLjava/lang/invoke/MemberName;Ljava/lang/Class;)Ljava/lang/invoke/MemberName; java.base@9-internal (143 bytes) @ 0x00007f7addbdf05c [0x00007f7addbdf000+0x000000000000005c] (null)

Verified by manual testing.
07-08-2017
Thanks Tom and Dean! Here's my summary:
The Graal compiled method java.lang.invoke.MemberName$Factory::resolve() calls into the method handle runtime via MethodHandleNatives::resolve() which throws a NoSuchMethodError because method resolution failed (see methodHandles.cpp, line 1234). We then call into JVMCIRuntime::exception_handler_for_pc() -> SharedRuntime::compute_compiled_exc_handler() to determine the appropriate exception handler. Because the ExceptionHandlerTable has no entry for this pc, we deoptimize and return to the DeoptimizationBlob at offset _unpack_with_exception_in_tls which calls Deoptimization::fetch_unroll_info().
Since the callee returns a MemberName object, the ScopeDesc is marked as return_oop() and the re-allocation code expects the return register (eax) to contain the oop of the returned object. We fail when trying to save the oop, because eax contains not an oop but the address of SharedRuntime::deopt_blob()->unpack_with_exception_in_tls() which was returned from JVMCIRuntime::exception_handler_for_pc() right before and is therefore still in eax.
As Tom suggested in the bug comments, we should ignore the return_oop() when dispatching an exception and only try to retrieve the oop when performing re-allocation during a normal deoptimization (if exec_mode == Unpack_deopt).
http://cr.openjdk.java.net/~thartmann/8173699/webrev.00/
This problem only affects JVMCI compiled code. C1 does not set return_oop() because it does not eliminate allocations (see IRScopeDebugInfo::record_debug_info()) and therefore does not need to re-allocate objects on deoptimization. C2 computes the exception handler via OptoRuntime::handle_exception_C() and uses DeoptimizationBlob::_unpack_with_exception as handler in case the nmethod was deoptimized. When calling Deoptimization::fetch_unroll_info() from the DeoptimizationBlob, eax still contains the exception oop and therefore the code works "by accident" because the exception oop is treated as return oop.
I agree that this code should be refactored and filed JDK-8173823 to fix this with JDK 10.
02-02-2017
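The failure mode summarized in the comment above can be illustrated with a small standalone model. This is a sketch, not HotSpot code: `ExecMode`, `RegisterContent`, `Scenario`, `save_oop_result_before`, `save_oop_result_after`, and `would_assert` are hypothetical names that mirror the terms used in the comment. The model assumes the exception entry of the DeoptimizationBlob runs with an exec mode other than Unpack_deopt, and it leaves out the popframe check for brevity.

```cpp
#include <cassert>

// Hypothetical model of the exec modes named in the comments; not HotSpot's definitions.
enum class ExecMode { Unpack_deopt, Unpack_exception };

// What the return register holds when fetch_unroll_info() is entered (model only).
enum class RegisterContent { ReturnValueOop, ExceptionOop, CodeAddress };

struct Scenario {
    bool return_oop;      // stands in for ScopeDesc::return_oop()
    ExecMode exec_mode;   // how the DeoptimizationBlob was entered
    RegisterContent reg;  // content of the return register
};

// Before the fix: the return_oop flag alone decides whether the register is read.
bool save_oop_result_before(const Scenario& s) {
    return s.return_oop;
}

// After the fix: the register is only read during a normal deoptimization.
bool save_oop_result_after(const Scenario& s) {
    return s.return_oop && s.exec_mode == ExecMode::Unpack_deopt;
}

// Stand-in for the "must be oop" assert: true if the register would be
// read as an oop while it actually holds a code address.
bool would_assert(bool save_oop_result, const Scenario& s) {
    return save_oop_result && s.reg == RegisterContent::CodeAddress;
}
```

In this model the JVMCI case (return_oop set, exception dispatch, code address in the register) trips the check only with the old predicate, while the C2 case (exception oop in the register) passes with the old predicate for the wrong reason and is simply skipped with the new one.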
I just got done investigating the same issue with 8173795 before I realized it's a duplicate. I agree with Tom's evaluation.
02-02-2017
I think the fact that we're deopting through unpack_with_exception_in_tls is the problem. The regular unpack_with_exception has the exception oop in the return register so this code would work ok for C2, though it's looking at the return register for the wrong reasons. I suspect the code should really be this:
- bool save_oop_result = chunk->at(0)->scope()->return_oop() && !thread->popframe_forcing_deopt_reexecution();
+ bool save_oop_result = chunk->at(0)->scope()->return_oop() && !thread->popframe_forcing_deopt_reexecution() && exec_mode == Unpack_deopt;
so that we don't examine the return_oop when dispatching an exception.
I'm actually unclear why we need return_oop at all. The deopt code knows whether it's returning somewhere in the interpreter where there is a value on the top of stack. Why can't it figure out that it's returning to a location that expects atos and only preserve the value in that case? I guess currently that logic lives way at the end of the deopt in vframeArrayElement::unpack_on_stack in vframeArray.cpp. If you look at the end of that method you can see that for all the exec_modes except Unpack_deopt we reset the deopt entry for the top frame to expect vtos, so preserving the value based on return_oop is clearly either useless or wrong in those cases. I think a refactoring of that logic so it could be used earlier could completely replace the return_oop flag.
01-02-2017
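The refactoring idea at the end of the comment above, deriving the decision from what the interpreter expects on top of stack instead of from a separate flag, might look roughly like this. It is a sketch only: `TosState`, `ExecMode`, `top_frame_entry_state`, and `must_preserve_return_value` are hypothetical names, only the two top-of-stack states mentioned in the comment are modeled, and the real logic in vframeArrayElement::unpack_on_stack handles more cases than shown here.

```cpp
#include <cassert>

// Only the two top-of-stack states mentioned in the comment are modeled.
enum class TosState { vtos, atos };

// Hypothetical model of the deoptimization exec modes; not HotSpot's definitions.
enum class ExecMode { Unpack_deopt, Unpack_exception, Unpack_uncommon_trap, Unpack_reexecute };

// State the top interpreter frame's deopt entry expects.
// callee_returns_oop: whether the call being returned from produces an object reference.
TosState top_frame_entry_state(ExecMode mode, bool callee_returns_oop) {
    // Per the comment: every mode except Unpack_deopt resets the entry to vtos.
    if (mode != ExecMode::Unpack_deopt) {
        return TosState::vtos;
    }
    return callee_returns_oop ? TosState::atos : TosState::vtos;
}

// The return register only needs preserving when the entry expects an object on top of stack.
bool must_preserve_return_value(ExecMode mode, bool callee_returns_oop) {
    return top_frame_entry_state(mode, callee_returns_oop) == TosState::atos;
}
```

With a helper of this shape available early in the deoptimization, the decision to save the return value would follow directly from the expected entry state, which is the property the comment argues the return_oop flag only approximates.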
Compiled method (JVMCI) 76896 15142 ! 4 java.lang.invoke.MemberName$Factory::resolve (143 bytes)
 total in heap [0x00007fffdda2ac10,0x00007fffdda2be30] = 4640
 relocation [0x00007fffdda2ad98,0x00007fffdda2adf0] = 88
 main code [0x00007fffdda2ae00,0x00007fffdda2b3a0] = 1440
 stub code [0x00007fffdda2b3a0,0x00007fffdda2b3d0] = 48
 oops [0x00007fffdda2b3d0,0x00007fffdda2b3d8] = 8
 metadata [0x00007fffdda2b3d8,0x00007fffdda2b470] = 152
 scopes data [0x00007fffdda2b470,0x00007fffdda2b7a8] = 824
 scopes pcs [0x00007fffdda2b7a8,0x00007fffdda2be28] = 1664
 dependencies [0x00007fffdda2be28,0x00007fffdda2be30] = 8
Register map:
rax [0x00007fff8bffb7f8] = 0x00007fffd274c1a8
rax [0x00007fff8bffb7fc] = <misaligned>
(gdb) call findpc(0x00007fffd274c1a8)
"Executing findpc"
0x00007fffd274c1a8 is at code_begin+456 in
[CodeBlob (0x00007fffd274bf10)]
Framesize: 356
DeoptimizationBlob
Which is "DeoptimizationBlob::_unpack_with_exception_in_tls" and obviously not an oop.
01-02-2017
Thanks Doug and Tom! I also don't think that JDK-8171087 is related. Please let me know if you find anything, I'll have a look as well.
01-02-2017
I don't think the ARM64 failure is related to this, at least at first glance. That one just looks like a generic bad oop issue. This one is most likely incorrect setting of the return_oop flag on the scope. It could also be a real corrupted oop issue but I'll try to rule out the return_oop issue first. I'm going to try inspecting the code Graal generates for MemberName$Factory.resolve to see if there's anything obviously wrong there.
31-01-2017
I don't see how my test can be causing this. After discussion with [~never], we think it's related to JDK-8169938. Tom will add more notes after a bit of investigation.
31-01-2017
ILW = Crash during deoptimization, only with one JVMCI test and -Xcomp, no workaround = HLH = P2
31-01-2017

Blocks :	JDK-8173795 - AOT support in raw_exception_handler_for_return_address is broken
Blocks :	JDK-8173794 - [REDO] [AOT] Missing GC scan of _metaspace_got array containing Klass*
Relates :	JDK-8169938 - [AOT] SIGSEGV at ~BufferBlob::vtable chunks
Relates :	JDK-8173823 - Handling of oop returning call sites in Deoptimization::fetch_unroll_info_helper() must be refactored
Relates :	JDK-8172733 - [JVMCI] add ResolvedJavaMethod.hasNeverInlineDirective