JDK-8146416 : java.lang.OutOfMemoryError triggers: assert(current_bci == 0) failed: bci isn't zero for do_not_unlock_if_synchronized
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 8u101,9
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2016-01-04
  • Updated: 2016-09-13
  • Resolved: 2016-06-07
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
8-poolUnresolved 9 b126Fixed
Related Reports
Relates :  
# A fatal error has been detected by the Java Runtime Environment:
#  Internal Error (/opt/jprt/T/P1/172249.cthaling/s/hotspot/src/share/vm/interpreter/interpreterRuntime.cpp:435), pid=74047, tid=31491
#  assert(current_bci == 0) failed: bci isn't zero for do_not_unlock_if_synchronized
# JRE version: Java(TM) SE Runtime Environment (9.0) (build 9-internal+0-2015-12-31-172249.cthaling.aot)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (9-internal+0-2015-12-31-172249.cthaling.aot, mixed mode, tiered, jvmci, jvmci compiler, compressed oops, g1 gc, bsd-amd64)
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp

---------------  T H R E A D  ---------------

Current thread (0x00007fcecb84c000):  JavaThread "JVMCI CompilerThread2" daemon [_thread_in_vm, id=31491, stack(0x0000000229ad0000,0x0000000229bd0000)]

Current CompileTask:
JVMCI:  62831 8365   !   4       com.oracle.graal.compiler.gen.NodeLIRBuilder::doBlock (678 bytes)

Stack: [0x0000000229ad0000,0x0000000229bd0000],  sp=0x0000000229bcdff0,  free space=1015k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.dylib+0xabea1e]  VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x41c
V  [libjvm.dylib+0xabf18e]  VMError::report_and_die(Thread*, char const*, int, char const*, char const*, __va_list_tag*)+0x4a
V  [libjvm.dylib+0x417e03]  report_vm_error(char const*, int, char const*, char const*, ...)+0xcd
V  [libjvm.dylib+0x5ac2b5]  InterpreterRuntime::exception_handler_for_exception(JavaThread*, oopDesc*)+0x43f
j  com.oracle.graal.phases.common.CanonicalizerPhase$Instance.processWorkSet(Lcom/oracle/graal/nodes/StructuredGraph;)V+52
j  com.oracle.graal.phases.common.CanonicalizerPhase$Instance.run(Lcom/oracle/graal/nodes/StructuredGraph;)V+100
J 5800 C1 com.oracle.graal.phases.Phase.run(Lcom/oracle/graal/nodes/StructuredGraph;Ljava/lang/Object;)V (6 bytes) @ 0x0000000210a5c474 [0x0000000210a5c220+0x0000000000000254]
J 3296 C1 com.oracle.graal.phases.BasePhase.apply(Lcom/oracle/graal/nodes/StructuredGraph;Ljava/lang/Object;Z)V (514 bytes) @ 0x000000021068ee04 [0x000000021068de40+0x0000000000000fc4]
j  com.oracle.graal.phases.Phase.apply(Lcom/oracle/graal/nodes/StructuredGraph;Z)V+4
j  com.oracle.graal.phases.common.CanonicalizerPhase.applyIncremental(Lcom/oracle/graal/nodes/StructuredGraph;Lcom/oracle/graal/phases/tiers/PhaseContext;Ljava/lang/Iterable;Lcom/oracle/graal/graph/Graph$Mark;Z)V+16
j  com.oracle.graal.phases.common.IncrementalCanonicalizerPhase.run(Lcom/oracle/graal/nodes/StructuredGraph;Lcom/oracle/graal/phases/tiers/PhaseContext;)V+135
RULE "gc/lock/jniref/jnireflock01" Crash Internal Error ...interpreterRuntime.cpp...assert(current_bci == ...) failed: bci isn't zero for do_not_unlock_if_synchronized

Do you need more than the previous comment? Deferral request: Since it is not a regression, I see no reason that this is critical for July. I agree we should fix it, I have filed a backport to the October release

Yes, but it is not a regression, correct? I agree we should fix it, but I don't see a reason to make a last minute push to the July CPU

I opened a backport on the October release. I don't see anything suggesting this to be critical for July, though, suggest deferral

problem is seen in c2 too.

exception raised by deopt runtime(realloc failure) was not handled properly. this was the cause of undesired result! fix is ready, testing going on.

[~vlivanov], could you have a look?

Now with [~roland] gone we need someone else to pick this one up.

I didn't look closely at the interaction between jvmti and the handling of OOME in deopt if I remember correctly. I was just worried jvmti would mess things up and it didn't feel that heavyweight to disable EA in that case. So maybe (most likely) it's possible to remove the jvmti restriction but someone needs to take a close look at it.

Ok. Maybe something else is happening in the JVMCI case. I'll look into it more closely. Is this assert the only reason to disable EA in the presence of popframe? That seems like a heavyweight change and also sort of abitrary. The assert doesn't appear to be checking anything that's required for correct functioning. It's just checking that things look reasonable, so if it doesn't hold in the pop_frame case then I'd either remove the assert or add the pop_frame check to the assert. Would that be reasonable.

The code that was added for 6898462 is to better handle OOME during deoptimization when reallocation of eliminated allocations fail. rather than crash like we used to do, during deoptimization the current thread's frames_to_pop_failed_realloc is set to the number of interpreter frames resulting for deoptimization and an OOME is thrown. When we return to the interpreter the frames_to_pop_failed_realloc top frames are popped right away, ignoring any exception handler. If we fail to realloc an object, we can't relock its monitor if it has to be locked at the deoptimization point. So when we return to the interpreter and the frames are popped, if we let the interpreter unlock monitors, it would encounter a non reallocated object and crash. So rather than do that, given all frames are going to be popped anyway, the deoptimization code also unlocks all monitors that would be unlocked when the frames are popped (ignoring the failed reallocation). The problem is that we don't want the interpreter to do any unlocking when the frames are popped. That's where do_not_unlock_if_synchronized comes in. InterpreterMacroAssembler::remove_activation must not try to unlock if this is a synchronized method. In that scenario, InterpreterRuntime::exception_handler_for_exception() sets do_not_unlock_if_synchronized, returns right away to the interpreter that branches to remove_activation_entry that clears do_not_unlock_if_synchronized. So we should never get to InterpreterRuntime::exception_handler_for_exception with do_not_unlock_if_synchronized and hit the current_bci == 0 assert. FWIW, the change includes a test with a synchronized method that passes. The only thing that could mess things up AFAICT, is jvmti popframe.

But doesn't disabling it in the presence of pop frame just reduce the likelihood of seeing it in tests? It can still occur in real code. The failing test itself isn't using JVMTI so I don't think this will help. It seems like the assert that the bci == 0 is just no longer valid. I'm still concerned about this comment. // If the method is synchronized we already unlocked the monitor // during deoptimization so the interpreter needs to skip it when // the frame is popped. Why is this true? Is deopt explicitly unlocking all the held monitors somewhere?

As part of 6898462, C2 disables allocation elimination when JVMTI popframe is enabled (see below). Graal would need something similar I suppose. I wonder if we need to check for JvmtiExport::can_force_early_return() too. Anyway, could that be the problem with graal? diff --git a/src/share/vm/opto/macro.cpp b/src/share/vm/opto/macro.cpp --- a/src/share/vm/opto/macro.cpp +++ b/src/share/vm/opto/macro.cpp @@ -971,7 +971,11 @@ } bool PhaseMacroExpand::eliminate_allocate_node(AllocateNode *alloc) { - if (!EliminateAllocations || !alloc->_is_non_escaping) { + // Don't do scalar replacement if the frame can be popped by JVMTI: + // if reallocation fails during deoptimization we'll pop all + // interpreter frames for this compiled frame and that won't play + // nice with JVMTI popframe. + if (!EliminateAllocations || JvmtiExport::can_pop_frame() || !alloc->_is_non_escaping) { return false; } Node* klass = alloc->in(AllocateNode::KlassNode);

[~roland], what do you think?

7541: if (thread->frames_to_pop_failed_realloc() > 0) { ... 7541: thread->set_do_not_unlock_if_synchronized(true); (so this isn't popframe for jvmti I guess) changeset: 7541:4d5b47a5184d user: roland date: Tue Nov 25 17:33:59 2014 +0100 summary: 6898462: The escape analysis with G1 cause crash assertion src/share/vm/runtime/vframeArray.cpp:94 So either the assumption that BCI = 0 for all cases of do_not_unlock_if_synchronized should be relaxed for this case or this case should set BCI to zero (?) probably the former.

From the hs_err log it looks like there was a deopt and oom in this thread right before the failure. I've pulled out the relevant events from http://spbef16.ru.oracle.com:9502/runs/01240/1240178.RBT/1240178.RBT-49/results/ResultDir/unload009/hs_err_pid74047.log but they can't be ordered properly because they came from different sections and the time stamps aren't precise enough. Event: 62.785 Thread 0x00007fcecb84c000 Exception <a 'java/lang/OutOfMemoryError'> (0x0000000500005600) thrown at [/opt/jprt/T/P1/172249.cthaling/s/hotspot/src/share/vm/gc/shared/collectedHeap.inline.hpp, line 159] Event: 62.785 Thread 0x00007fcecb84c000 Exception <a 'java/lang/OutOfMemoryError'> (0x00000005000056a0) thrown at [/opt/jprt/T/P1/172249.cthaling/s/hotspot/src/share/vm/runtime/deoptimization.cpp, line 826] Event: 62.785 Thread 0x00007fcecb84c000 DEOPT PACKING pc=0x0000000217a018d4 sp=0x0000000229bce4d0 Event: 62.785 Thread 0x00007fcecb84c000 DEOPT UNPACKING pc=0x000000020fe00719 sp=0x0000000229bcd988 mode 1

This looks like a bug. But is this the bug that you're hitting with jvmci?

We don't do generate_counter_incr but isn't it also set here? if (thread->frames_to_pop_failed_realloc() > 0) { // Allocation of scalar replaced object used in this frame // failed. Unconditionally pop the frame. thread->dec_frames_to_pop_failed_realloc(); thread->set_vm_result(h_exception()); // If the method is synchronized we already unlocked the monitor // during deoptimization so the interpreter needs to skip it when // the frame is popped. thread->set_do_not_unlock_if_synchronized(true); return Interpreter::remove_activation_entry(); } there's no guarantee the bci is 0 in this case is there?

[~never], does this make sense to you?

The do_not_unlock_if_synchronized flag should only be set for generate_counter_incr and stack overflow checking and the bci is zero there. Does graal call java code for generate_counter_incr? You may need something like this: // use UnlockFlagSaver to clear and restore the _do_not_unlock_if_synchronized // flag, in case this method triggers classloading which will call into Java. UnlockFlagSaver fs(thread);