JDK-8156659 : assert(CodeCache::find_blob_unsafe(_pc) == _cb) failed: inconsistent
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 9
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • CPU: sparc
  • Submitted: 2016-05-10
  • Updated: 2017-07-26
  • Resolved: 2016-08-17
JDK 9
9 b135 Fixed
Description
#  Internal Error (/scratch/opt/jprt/T/P1/172429.tkrodrig/s/hotspot/src/cpu/sparc/vm/frame_sparc.cpp:484), pid=8959, tid=22
#  assert(CodeCache::find_blob_unsafe(_pc) == _cb) failed: inconsistent
#
# JRE version: Java(TM) SE Runtime Environment (9.0) (fastdebug build 9-internal+0-2016-05-06-172429.tkrodrig.hs-comp)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 9-internal+0-2016-05-06-172429.tkrodrig.hs-comp, compiled mode, tiered, compressed oops, g1 gc, solaris-sparc)


Current thread (0x00000001018df000):  JavaThread "Thread-0" [_thread_in_Java, id=22, stack(0xffffffff5a800000,0xffffffff5a900000)]

Stack trace decoding failed:

Stack: [0xffffffff5a800000,0xffffffff5a900000],  sp=0xffffffff5a810480,  free space=65k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x19118b8]
# Host info: SunOS sc11152430 5.11 11.0 sun4v sparc sun4v
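The "free space=65k" figure on the Stack: line can be checked by hand. A minimal sketch, assuming the stack grows downward and the first address in the bracket is the low end; free_stack_bytes and the constant names are hypothetical helpers, not HotSpot code:

```cpp
#include <cstdint>

// Hypothetical helper (not HotSpot code): bytes left between the current sp
// and the low end of a downward-growing stack.
constexpr std::uint64_t free_stack_bytes(std::uint64_t stack_low, std::uint64_t sp) {
    return sp - stack_low;
}

// Values copied from the "Stack:" line of the report.
constexpr std::uint64_t kStackLow  = 0xffffffff5a800000ULL;
constexpr std::uint64_t kStackHigh = 0xffffffff5a900000ULL;
constexpr std::uint64_t kSp        = 0xffffffff5a810480ULL;

// The whole stack is 0x100000 bytes, i.e. 1 MiB.
static_assert(kStackHigh - kStackLow == 1024 * 1024, "stack is 1 MiB");
// sp - low = 0x10480 = 66688 bytes.
static_assert(free_stack_bytes(kStackLow, kSp) == 66688, "0x10480 bytes free");
// 66688 / 1024 = 65 (integer division), matching "free space=65k".
static_assert(free_stack_bytes(kStackLow, kSp) / 1024 == 65, "reported as 65k");
```

Roughly 65k free out of a 1 MiB stack means the thread was close to exhausting its stack when the assert fired.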
Comments
verified by nightly testing
26-07-2017

http://cr.openjdk.java.net/~neliasso/8156659/webrev.07
17-08-2016

[~dlong] In my reproducer the younger frame isn't interpreted. Even with a compiled frame, the wrong register is used for fp: *fr = frame(fr->sender_sp(), fr->sp()); fr->sender_sp() will read the fp from i7 while it is still in o7, since the faulting stack bang is before the SAVE instruction.
15-08-2016
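
The i7/o7 point in the comment above can be illustrated with a toy model of overlapping register windows. This is only a sketch under simplifying assumptions: ToyWindows, TrapSnapshot, simulate_trap_before_save and the values 0x1000 and 0x2000 are all hypothetical, the window index counts upward (real SPARC decrements CWP on SAVE and wraps around), and none of it is HotSpot code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy model of overlapping register windows; not HotSpot code.
// Each window owns 8 "in" and 8 "local" slots; its "out" slots are the
// next window's "in" slots.
class ToyWindows {
public:
    static constexpr int kMaxWindows = 8;

    std::uint64_t& in(int n)  { return slot(cwp_, n); }
    std::uint64_t& out(int n) { return slot(cwp_ + 1, n); }

    // SAVE: move to the next window, so the old outs become the new ins.
    void save() {
        assert(cwp_ + 1 < kMaxWindows);
        ++cwp_;
    }

private:
    std::uint64_t& slot(int window, int n) {
        assert(n >= 0 && n < 8 && window >= 0 && window <= kMaxWindows);
        return regs_[window * 16 + n];
    }

    std::array<std::uint64_t, 16 * (kMaxWindows + 1)> regs_{};
    int cwp_ = 0;  // current window index
};

struct TrapSnapshot {
    std::uint64_t i7_before_save;
    std::uint64_t o7_before_save;
    std::uint64_t i7_after_save;
};

// Models a trap in a callee prologue before SAVE has executed.
TrapSnapshot simulate_trap_before_save() {
    ToyWindows rf;
    rf.in(7)  = 0x1000;  // value belonging to the caller's own caller (hypothetical)
    rf.out(7) = 0x2000;  // value the call just placed in o7 (hypothetical)

    // Control is now in the callee, but SAVE has not run: same window.
    TrapSnapshot t{};
    t.i7_before_save = rf.in(7);   // stale: belongs to an older frame
    t.o7_before_save = rf.out(7);  // the value we actually want
    rf.save();
    t.i7_after_save = rf.in(7);    // only now does i7 hold that value
    return t;
}
```

In this model, reading i7 before SAVE yields the older frame's value, while the wanted value is still in o7; after SAVE the same value is visible as i7.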

The bad code was introduced as a fix for JDK-8150821 "Crash with assert(!((nmethod*)_cb)->is_deopt_pc(_pc)) failed: invariant broken". It tries to construct a complete frame, but we are trapping at a stack bang before SAVE, so it will read the fp from the wrong register in the context. Reverting JDK-8150821 fixes the problem. The problem JDK-8150821 was trying to solve has been fixed in https://bugs.openjdk.java.net/browse/JDK-8029441 "assert(!((nmethod*)_cb)->is_deopt_pc(_pc)) failed: invariant broken".
15-08-2016

It turns out the original pc is getting stored correctly. The problem is that *fr = frame(fr->sender_sp(), fr->sp()); is calling this constructor: frame(intptr_t* sp, intptr_t* younger_sp, bool younger_frame_adjusted_stack = false); which by default says the frame is not interpreted. In this case the younger frame [1] is interpreted, so we miss setting the needed _sp_adjustment_by_callee. With this set, we can find the original pc correctly.
13-08-2016
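
The default-argument pitfall described in the comment above can be sketched in isolation. ToyFrame and demo_adjustment are hypothetical, and the convention that younger_sp[0] holds the caller's original sp is an assumption made for this toy only; the real HotSpot constructor computes the adjustment differently.

```cpp
#include <cstddef>
#include <cstdint>

// Toy stand-in for a frame object; not HotSpot code.
struct ToyFrame {
    const std::intptr_t* sp;
    std::ptrdiff_t sp_adjustment_by_callee = 0;

    // Toy assumption: younger_sp[0] holds the caller's original,
    // unadjusted sp as an address.
    ToyFrame(const std::intptr_t* sp_, const std::intptr_t* younger_sp,
             bool younger_frame_adjusted_stack = false)
        : sp(sp_) {
        if (younger_frame_adjusted_stack) {
            auto saved = reinterpret_cast<const std::intptr_t*>(younger_sp[0]);
            sp_adjustment_by_callee = saved - sp;
        }
    }

    const std::intptr_t* unextended_sp() const {
        return sp + sp_adjustment_by_callee;
    }
};

// Builds the same frame with and without the third argument.
std::ptrdiff_t demo_adjustment(bool interpreted) {
    static std::intptr_t stack[16] = {};
    std::intptr_t younger_window[1] = {
        reinterpret_cast<std::intptr_t>(&stack[8])  // original sp (hypothetical)
    };
    ToyFrame f = interpreted
        ? ToyFrame(&stack[4], younger_window, true)
        : ToyFrame(&stack[4], younger_window);  // two-argument call: flag defaults to false
    return f.sp_adjustment_by_callee;
}
```

With the two-argument call the adjustment silently stays 0, which is the shape of the problem described: nothing at the call site signals that the younger frame's stack adjustment was skipped.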

What I see happening is the top two compiled frames get deoptimized. The top frame is converted into an interpreted frame. Then we continue, and we get a stack overflow when the interpreted frame calls into the compiled method again:
[0] recurse() (compiled, no frame pushed yet)
[1] recurse() (interpreted)
[2] recurse() (compiled, deoptimized)
What goes wrong is that the saved original pc for [2] is corrupt. This is stored by frame::deoptimize(). When DeoptimizeALot is on, we try to deoptimize threads other than the current thread. Because sparc has register windows, we need to make sure those register windows are flushed first, otherwise frame.pc() will be invalid. So I suspect that the first problem is that register windows weren't flushed, possibly because of a bug in the NeedsDeoptSuspend/is_deopt_suspend logic.
The second problem is that this code:
*fr = os::fetch_frame_from_ucontext(thread, uc);
*fr = frame(fr->sender_sp(), fr->sp());
effectively pops a frame, so instead of getting frame [1] we get frame [2].
12-08-2016

If we are in an nmethod, but the frame isn't complete, then I think our only choice is to use O7 as the pc, like we do for ic miss and zombie traps:
pc = (address)uc->uc_mcontext.gregs[REG_O7];
However, it looks like we are ignoring the PC from the context anyway and popping a frame by hand:
*fr = os::fetch_frame_from_ucontext(thread, uc);
*fr = frame(fr->sender_sp(), fr->sp());
so now I'm not sure what's going wrong.
11-08-2016

Nils, are we trapping in bang_stack_shadow_pages() after generate_fixed_frame() has set up the frame?
11-08-2016

Restores the wrong fp from the stack, causing a corrupt frame. The fp is restored from i7's location even though SAVE hasn't been executed yet (SAVE happens after the stack bang that triggers the SOE).
11-08-2016

Should this be closed as a duplicate of 8153352, or are we still waiting for something? (And JDK-8029441 sounds like a different issue, having to do with async backtraces during profiling, while this one has to do with stack overflow.)
11-08-2016

It seems that the problem reported by this bug has appeared after the fix for JDK-8153352 was pushed. So I think we have to wait for the review process for this issue to be finalized and for Nils to push the fix.
11-08-2016

No problem! Joseph originally filed JDK-8161080 for this issue (I closed it and moved the rule to JDK-8029441). Yes, please edit your comment above to avoid confusion.
13-07-2016

[~thartmann] - I concur. Thanks for adding (a shorter version of) my sighting to JDK-8029441. Should I edit my entry above to avoid any confusion in this bug?
12-07-2016

On review: http://cr.openjdk.java.net/~neliasso/8156659/webrev.02/
17-06-2016

Reliable reproducer: bin/java -cp . -server -Xcomp -XX:+TieredCompilation -XX:+DeoptimizeALot nsk.stress.stack.stack002.stack002
17-06-2016

ILW = assert; few times a week; none = MMH = P3
15-06-2016

assert(CodeCache::find_blob_unsafe(_pc) == _cb) failed: inconsistent
pc: 0xffffffff44311bc1 cb: 0xffffffff5d931e90
pc is in the stack. That can't be right.
31-05-2016

Have a fix for https://bugs.openjdk.java.net/browse/JDK-8153352 that happens in the same test. Now confirmed as same issue.
24-05-2016

"assert(!((nmethod*)_cb)->is_deopt_pc(_pc)) failed: invariant broken" was fixed by https://bugs.openjdk.java.net/browse/JDK-8150821
20-05-2016

b114-119: I get "# assert(pd != 0L) failed: PcDesc must not be NULL" and "# assert(CodeCache::find_blob_unsafe(_pc) == _cb) failed: inconsistent".
b107-113: I get "assert(!((nmethod*)_cb)->is_deopt_pc(_pc)) failed: invariant broken".
Product builds run fine.
20-05-2016

Also got this failure when reproducing:
# Internal Error (/tmp/neliasso/jdk9-hs/hotspot/src/share/vm/runtime/sharedRuntime.cpp:3119), pid=3549, tid=112
# assert(pd != 0L) failed: PcDesc must not be NULL
19-05-2016

Removed link to related bug. Found nothing in common.
11-05-2016