JDK-6599425 : OopMapCache::lookup() can cause later crash or assert() failure
  • Type: Bug
  • Status: Closed
  • Resolution: Fixed
  • Component: hotspot
  • Sub-Component: runtime
  • Priority: P4
  • Affected Version: 7
  • OS: generic
  • CPU: generic
  • Submit Date: 2007-08-30
  • Updated Date: 2012-10-08
  • Resolved Date: 2008-04-18
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availabitlity Release.

To download the current JDK release, click here.
JDK 6 JDK 7
6u6Fixed 7Fixed
Related Reports
Relates :  
Description
During my investigation of the following bug:

    6497639 4/3 Profiling Swing application caused JVM crash

I tripped across this problem with the OopMapCache.

The failure modes that I observed:

- SIGSEGV in markOopDesc::prototype_for_object()
- Internal Error (src/share/vm/oops/markOop.inline.hpp:80)
  Error: assert(prototype_header == prototype() ||
         prototype_header->has_bias_pattern(),"corrupt prototype header")

Currently this failure is only reproducible using the testing
set up that I have for 6497639.


Bug prerequisites:

- Client VM with sharing enabled
  Update: reproduced this failure on 2008.01.11 without sharing;
  it took more playing with Java2Demo, but it crahed
- Full GC in progress; we are in GC phase 3
- target methodOop is from a shared class
- target methodOop is being forwarded
- target methodOop is found on the stack of a JavaThread
- the instanceKlass' OopMapCache has already been forwarded and
  adjusted so the entry that refers to our method contains the
  forwardee methodOop

We're in the middle of a Full GC. The VMThread is walking JavaThread
stacks and runs into a methodOop that refers to an object that is being
forwarded. The VMThread creates a methodHandle via
methodOopDesc::mask_for(). The methodHandle is eventually passed to
OopMapCache::lookup(). Because this methodHandle is owned by the
VMThread, it has not yet been adjusted to the new location. The
OopMapCache's methodOop has already been adjusted to the new location.
The lookup() code does not find a match in the OopMapCache so the
VMThread's methodOop is added to the OopMapCache.  Unfortunately, the
methodOop's mark word still contains the forwarding info so things blow
up later.

Here is a snippet of a stack from a subsequent blow up:

  ---- called from signal handler with signal 11 (SIGSEGV) ------
  [8] markOopDesc::prototype_for_object(0xd4c047d0, 0xd05c0000, 0x1, 0xfee4a800, 0xd929f5d8, 0x15c), at 0xfec01234
  [9] MarkSweep::mark_object(0xd4c047d0, 0x20, 0x1, 0xfee45000, 0x0, 0x0), at 0xfec015c8
  [10] MarkSweep::_mark_and_push(0xa3610, 0xa3610, 0x21, 0x80, 0xd4c047d0, 0xfee4b000), at 0xfe98c8d0
  [11] OopMapCache::oop_iterate(0x3fd2b0, 0xfee4b0f0, 0xfee4b1b0, 0xfe98d240, 0x3, 0x2), at 0xfe986ae8
  [12] instanceKlassKlass::oop_follow_contents(0xd8c00890, 0x3fd2b0, 0xfee4b000, 0xd8c84740, 0xd845c018, 0x3), at 0xfe98d05c
  [13] MarkSweep::FollowRootClosure::do_oop(0xd8c00888, 0x2950940, 0xfee4b0d0, 0xfc, 0x3f, 0x40), at 0xfe98c700
  [14] SystemDictionary::shared_oops_do(0xfee4b0f8, 0x18000, 0xfee4b194, 0xfe98c660, 0xfee38000, 0x18210), at 0xfeca3bf0
  [15] Universe::oops_do(0xfee4b0f8, 0x0, 0x33188, 0x0, 0xfee38000, 0xfee50800), at 0xfecbe820
  [16] SharedHeap::process_strong_roots(0x33090, 0x1, 0x2, 0xfee4b0f8, 0xfee4b0f8, 0x0), at 0xfec66ad4
  [17] GenCollectedHeap::gen_process_strong_roots(0x33090, 0x1, 0x0, 0x1, 0x0, 0xfee4b0f8), at 0xfeaafc70
  [18] GenMarkSweep::mark_sweep_phase1(0x1, 0xfee4b000, 0xfed73000, 0x2, 0x30e28, 0x0), at 0xfeab228c

The forwarding info in the mark word is used in the
markOopDesc::prototype_for_object() call. The failure modes
I've seen are SIGSEGVs and assertion failures.

Comments
SUGGESTED FIX The fix for this bug (6599425) has been forward ported to Dolphin/JDK7-B22, 160_10-baseline, and Dolphin/JDK7 main/baseline. See 6599425-webrev-cr1-160_10.tgz and 6599425-webrev-cr1-170_main.tgz.
2008-01-18

SUGGESTED FIX Please see the attached 6599425-webrev-cr1-all.tgz and 6599425-webrev-cr1-delta.tgz files for the revised version of the proposed fix after Code Review Round 0. This proposed fix is relative to 1.6.0_04.
2008-01-17

SUGGESTED FIX <note deleted>
2007-12-20

SUGGESTED FIX Please see the attached 6599425-webrev-cr0.tgz file for the proposed fix. This proposed fix is relative to 1.6.0_04.
2007-12-19

EVALUATION Here is a snippet of stack from an intentional crash in OopMapCache::lookup() along with notes about what went wrong. ---- called from signal handler with signal 11 (SIGSEGV) ------ [9] OopMapCache::lookup(this = 0x224e80, method = CLASS, bci = 0, entry_for = 0xfbbfed64), line 646 in "oopMapCache.cpp" The above SIGSEGV was intentionally caused by: RC_TRACE(0x08000000, ("%s: method[%d]: %s(%s), bci=%u", _debug_name, (probe + i) % _size, entry->_method->name()->as_C_string(), entry->_method->signature()->as_C_string(), bci)); if (UseNewCode9) { int value = **((int **)(void **)&entry->_method); if (value != 0x1 && value != 0x3) { char *cp = NULL; *cp = NULL; } } when a methodOop is added to the OopMapCache where the mark word is not 0x1 or 0x3. What I'm trying to catch is the addition of an oop that is in the middle of being forwarded. I've copied interesting snippets from the doit.log.34 file: This debug output occurs during a Full GC right after "java.lang.Thread.sleep((J))V" has been redefined. The output shows the new version of the methodOop being added to the cache. The previous version was flushed from the cache earlier. YYY: ik2: lookup: &m=0xfbbfeb94, *m=0xd48c5ec8, **m=0x3, bci=0 YYY: ik2: cmp to method[0]: &m=0x224f88, *m=0x0, bci=0 YYY: ik2: method[0]: NOT matched YYY: ik2: cmp to method[1]: &m=0x224fac, *m=0x0, bci=0 YYY: ik2: method[1]: NOT matched YYY: ik2: cmp to method[2]: &m=0x224fd0, *m=0x0, bci=0 YYY: ik2: method[2]: NOT matched YYY: ik2: adding method[0]: &m=0x224f88, *m=0xd48c5ec8, **m=0x3, bci=0 RedefineClasses-0x8000000: ik2: method[0]: sleep((J)V), bci=0 <snip> // our redefined methodOop gets forwarded during the Full GC: XXX: forward_to: this2=0xd48c5ec8, *this2=0x3, &fwd_p=0xfbbff1f8, *fwd_p=0xd4807bb0, **fwd_p=0x1 <snip> // our redefined methodOop gets visited by the // RecursiveAdjustSharedObjectClosure (RASOC) helper which // results in our redefined methodOop getting adjusted: XXX: RASOC::do_oop: &method2=0xfbbfed30, *method2=0xd48c5ec8, **method2=0xd4807bb3 XXX: RASOC::do_oop end: &method2=0xfbbfed30, *method2=0xd48c5ec8, **method2=0xd4807bb3 XXX: _adjust_pointer: &method2=0xfbbfecd8, *method2=0xd48c5ec8, **method2=0xd4807bb3 <snip> // OopMapCache::lookup() is called with a methodHandle wrapping // our forwarded methodOop: XXX: _adjust_pointer: &method2=0xfbbfec98, *method2=0xd48c5ec8, **method2=0xd4807bb3 YYY: ik2: lookup: &m=0xfbbfebac, *m=0xd48c5ec8, **m=0xd4807bb3, bci=0 YYY: ik2: cmp to method[0]: &m=0x224f88, *m=0xd4807bb0, bci=0 YYY: ik2: method[0]: NOT matched YYY: ik2: cmp to method[1]: &m=0x224fac, *m=0x0, bci=0 YYY: ik2: method[1]: NOT matched YYY: ik2: cmp to method[2]: &m=0x224fd0, *m=0x0, bci=0 YYY: ik2: method[2]: NOT matched YYY: ik2: adding method[1]: &m=0x224fac, *m=0xd48c5ec8, **m=0xd4807bb3, bci=0 RedefineClasses-0x8000000: ik2: method[1]: sleep((J)V), bci=0 # # An unexpected error has been detected by Java Runtime Environment: # # SIGSEGV (0xb) at pc=0xfeac79b4, pid=23423, tid=3 Notice that the method in cache entry[0] has already completed forwarding which is why we don't get a match with this lookup. So the lookup() code adds the partially forwarded methodOop that we have in hand to the cache. However, since the cache entries have already completed forwarding, the entry we just added will not be adjusted. The existence of the forwardee address in the mark word will cause us to blow up later. Verifying a few things in the debugger. Here is the "methodHandle method" parameter: (dbx) print method method = { method._handle = 0x385a86c } (dbx) x 0x385a86c/X 0x0385a86c: 0xd48c5ec8 (dbx) x 0xd48c5ec8/X 0xd48c5ec8: 0xd4807bb3 And here is cache entry[0]: (dbx) print _array[0] _array[0] = { _allocation = -235802127 _array[0]._method = 0xd4807bb0 _bci = 0 _mask_size = 2 _expression_stack_size = 0 _bit_mask = (0, 0) _resource_allocate_bit_mask = true } The debugger agrees with the debug output. Let's go find where the methodHandle parameter came from. [10] instanceKlass::mask_for(this = 0xd8814e38, method = CLASS, bci = 0, entry_for = 0xfbbfed64), line 670 in "instanceKlass.cpp" [11] methodOopDesc::mask_for(this = 0xd48c5ec8, bci = 0, mask = 0xfbbfed64), line 161 in "methodOop.cpp" methodOopDesc::mask_for() allocates a local methodHandle: Thread* myThread = Thread::current(); methodHandle h_this(myThread, this); (dbx) print h_this h_this = { h_this._handle = 0x385a86c } (dbx) x 0x385a86c/X 0x0385a86c: 0xd48c5ec8 (dbx) x 0xd48c5ec8/X 0xd48c5ec8: 0xd4807bb3 Our current thread is the VMThread so that is where the handle is allocated, but we are processing the oops in a JavaThread. [12] frame::oops_interpreted_do(this = 0xfbbfef64, f = 0xff08d738, map = 0xfbbfef80, query_oop_map_cache = true), line 967 in "frame.cpp" [13] frame::oops_do_internal(this = 0xfbbfef64, f = 0xff08d738, map = 0xfbbfef80, use_interpreter_oop_map_cache = true), line 1211 in "frame.cpp" [14] frame::oops_do(this = 0xfbbfef64, f = 0xff08d738, map = 0xfbbfef80), line 399 in "frame.hpp" [15] JavaThread::oops_do(this = 0x134400, f = 0xff08d738), line 2311 in "thread.cpp" Just for grins, which JavaThread are we processing? (dbx) print this this = 0x134400 Looks like the JavaThread is t@23 which is in a JVM_Sleep()/java.lang.Thread.sleep() call. [16] Threads::oops_do(f = 0xff08d738), line 3452 in "thread.cpp" Here is Threads::oops_do(): ALL_JAVA_THREADS(p) { p->oops_do(f); } VMThread::vm_thread()->oops_do(f); So we haven't done the VMThread's oops yet which is why the VMThread's methodHandle is only forwarded, but not yet adjusted. [17] SharedHeap::process_strong_roots(this = 0x50c50, collecting_perm_gen = true, so = SO_AllClasses, roots = 0xff08d738, perm_blk = 0xff08d738), line 115 in "sharedHeap.cpp" [18] GenCollectedHeap::gen_process_strong_roots(this = 0x50c50, level = 1, younger_gens_as_roots = false, collecting_perm_gen = true, so = SO_AllClasses, older_gens = 0xff08d738, not_older_gens = 0xff08d738), line 635 in "genCollectedHeap.cpp" [19] GenMarkSweep::mark_sweep_phase3(level = 1), line 339 in "genMarkSweep.cpp" [20] GenMarkSweep::invoke_at_safepoint(level = 1, rp = 0xab730, clear_all_softrefs = false), line 75 in "genMarkSweep.cpp" [21] OneContigSpaceCardGeneration::collect(this = 0x4c958, full = true, clear_all_soft_refs = false, size = 0, is_tlab = false), line 406 in "generation.cpp" [22] TenuredGeneration::collect(this = 0x4c958, full = true, clear_all_soft_refs = false, size = 0, is_tlab = false), line 304 in "tenuredGeneration.cpp" [23] GenCollectedHeap::do_collection(this = 0x50c50, full = true, clear_all_soft_refs = false, size = 0, is_tlab = false, max_level = 1), line 529 in "genCollectedHeap.cpp" [24] GenCollectedHeap::do_full_collection(this = 0x50c50, clear_all_soft_refs = false, max_level = 1), line 838 in "genCollectedHeap.cpp" [25] VM_GenCollectFull::doit(this = 0xfab7f368), line 147 in "vmGCOperations.cpp" [26] VM_Operation::evaluate(this = 0xfab7f368), line 49 in "vm_operations.cpp" [27] VMThread::evaluate_operation(this = 0xb4000, op = 0xfab7f368), line 336 in "vmThread.cpp" [28] VMThread::loop(this = 0xb4000), line 445 in "vmThread.cpp" [29] VMThread::run(this = 0xb4000), line 251 in "vmThread.cpp" The complete thread dump is attached as threads.log.34.
2007-08-30