United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
JDK-6599425 : OopMapCache::lookup() can cause later crash or assert() failure

Details
Type:
Bug
Submit Date:
2007-08-30
Status:
Closed
Updated Date:
2012-10-08
Project Name:
JDK
Resolved Date:
2008-04-18
Component:
hotspot
OS:
generic
Sub-Component:
runtime
CPU:
generic
Priority:
P4
Resolution:
Fixed
Affected Versions:
7
Fixed Versions:
hs12 (b02)

Related Reports
Backport:
Backport:
Backport:
Backport:
Backport:
Relates:

Sub Tasks

Description
During my investigation of the following bug:

    6497639 4/3 Profiling Swing application caused JVM crash

I tripped across this problem with the OopMapCache.

The failure modes that I observed:

- SIGSEGV in markOopDesc::prototype_for_object()
- Internal Error (src/share/vm/oops/markOop.inline.hpp:80)
  Error: assert(prototype_header == prototype() ||
         prototype_header->has_bias_pattern(),"corrupt prototype header")

Currently this failure is only reproducible using the testing
set up that I have for 6497639.


Bug prerequisites:

- Client VM with sharing enabled
  Update: reproduced this failure on 2008.01.11 without sharing;
  it took more playing with Java2Demo, but it crahed
- Full GC in progress; we are in GC phase 3
- target methodOop is from a shared class
- target methodOop is being forwarded
- target methodOop is found on the stack of a JavaThread
- the instanceKlass' OopMapCache has already been forwarded and
  adjusted so the entry that refers to our method contains the
  forwardee methodOop

We're in the middle of a Full GC. The VMThread is walking JavaThread
stacks and runs into a methodOop that refers to an object that is being
forwarded. The VMThread creates a methodHandle via
methodOopDesc::mask_for(). The methodHandle is eventually passed to
OopMapCache::lookup(). Because this methodHandle is owned by the
VMThread, it has not yet been adjusted to the new location. The
OopMapCache's methodOop has already been adjusted to the new location.
The lookup() code does not find a match in the OopMapCache so the
VMThread's methodOop is added to the OopMapCache.  Unfortunately, the
methodOop's mark word still contains the forwarding info so things blow
up later.

Here is a snippet of a stack from a subsequent blow up:

  ---- called from signal handler with signal 11 (SIGSEGV) ------
  [8] markOopDesc::prototype_for_object(0xd4c047d0, 0xd05c0000, 0x1, 0xfee4a800, 0xd929f5d8, 0x15c), at 0xfec01234
  [9] MarkSweep::mark_object(0xd4c047d0, 0x20, 0x1, 0xfee45000, 0x0, 0x0), at 0xfec015c8
  [10] MarkSweep::_mark_and_push(0xa3610, 0xa3610, 0x21, 0x80, 0xd4c047d0, 0xfee4b000), at 0xfe98c8d0
  [11] OopMapCache::oop_iterate(0x3fd2b0, 0xfee4b0f0, 0xfee4b1b0, 0xfe98d240, 0x3, 0x2), at 0xfe986ae8
  [12] instanceKlassKlass::oop_follow_contents(0xd8c00890, 0x3fd2b0, 0xfee4b000, 0xd8c84740, 0xd845c018, 0x3), at 0xfe98d05c
  [13] MarkSweep::FollowRootClosure::do_oop(0xd8c00888, 0x2950940, 0xfee4b0d0, 0xfc, 0x3f, 0x40), at 0xfe98c700
  [14] SystemDictionary::shared_oops_do(0xfee4b0f8, 0x18000, 0xfee4b194, 0xfe98c660, 0xfee38000, 0x18210), at 0xfeca3bf0
  [15] Universe::oops_do(0xfee4b0f8, 0x0, 0x33188, 0x0, 0xfee38000, 0xfee50800), at 0xfecbe820
  [16] SharedHeap::process_strong_roots(0x33090, 0x1, 0x2, 0xfee4b0f8, 0xfee4b0f8, 0x0), at 0xfec66ad4
  [17] GenCollectedHeap::gen_process_strong_roots(0x33090, 0x1, 0x0, 0x1, 0x0, 0xfee4b0f8), at 0xfeaafc70
  [18] GenMarkSweep::mark_sweep_phase1(0x1, 0xfee4b000, 0xfed73000, 0x2, 0x30e28, 0x0), at 0xfeab228c

The forwarding info in the mark word is used in the
markOopDesc::prototype_for_object() call. The failure modes
I've seen are SIGSEGVs and assertion failures.

                                    

Comments
EVALUATION

Here is a snippet of stack from an intentional crash in
OopMapCache::lookup() along with notes about what went wrong.

  ---- called from signal handler with signal 11 (SIGSEGV) ------
  [9] OopMapCache::lookup(this = 0x224e80, method = CLASS, bci = 0, entry_for = 0xfbbfed64), line 646 in "oopMapCache.cpp"

The above SIGSEGV was intentionally caused by:

          RC_TRACE(0x08000000, ("%s: method[%d]: %s(%s), bci=%u", _debug_name,
            (probe + i) % _size, entry->_method->name()->as_C_string(),
            entry->_method->signature()->as_C_string(), bci));
    if (UseNewCode9) {
    int value = **((int **)(void **)&entry->_method);
    if (value != 0x1 && value != 0x3) {
    char *cp = NULL;
    *cp = NULL;
    }
    }

when a methodOop is added to the OopMapCache where the mark word is not
0x1 or 0x3. What I'm trying to catch is the addition of an oop that is
in the middle of being forwarded. I've copied interesting snippets
from the doit.log.34 file:

This debug output occurs during a Full GC right after
"java.lang.Thread.sleep((J))V" has been redefined. The output shows the
new version of the methodOop being added to the cache. The previous
version was flushed from the cache earlier.

    YYY: ik2: lookup: &m=0xfbbfeb94, *m=0xd48c5ec8, **m=0x3, bci=0
    YYY: ik2: cmp to method[0]: &m=0x224f88, *m=0x0, bci=0
    YYY: ik2: method[0]: NOT matched
    YYY: ik2: cmp to method[1]: &m=0x224fac, *m=0x0, bci=0
    YYY: ik2: method[1]: NOT matched
    YYY: ik2: cmp to method[2]: &m=0x224fd0, *m=0x0, bci=0
    YYY: ik2: method[2]: NOT matched
    YYY: ik2: adding method[0]: &m=0x224f88, *m=0xd48c5ec8, **m=0x3, bci=0
    RedefineClasses-0x8000000: ik2: method[0]: sleep((J)V), bci=0

    <snip>

    // our redefined methodOop gets forwarded during the Full GC:

    XXX: forward_to: this2=0xd48c5ec8, *this2=0x3, &fwd_p=0xfbbff1f8, *fwd_p=0xd4807bb0, **fwd_p=0x1

    <snip>

    // our redefined methodOop gets visited by the
    // RecursiveAdjustSharedObjectClosure (RASOC) helper which
    // results in our redefined methodOop getting adjusted:

    XXX: RASOC::do_oop: &method2=0xfbbfed30, *method2=0xd48c5ec8, **method2=0xd4807bb3
    XXX: RASOC::do_oop end: &method2=0xfbbfed30, *method2=0xd48c5ec8, **method2=0xd4807bb3
    XXX: _adjust_pointer: &method2=0xfbbfecd8, *method2=0xd48c5ec8, **method2=0xd4807bb3

    <snip>

    // OopMapCache::lookup() is called with a methodHandle wrapping
    // our forwarded methodOop:

    XXX: _adjust_pointer: &method2=0xfbbfec98, *method2=0xd48c5ec8, **method2=0xd4807bb3
    YYY: ik2: lookup: &m=0xfbbfebac, *m=0xd48c5ec8, **m=0xd4807bb3, bci=0
    YYY: ik2: cmp to method[0]: &m=0x224f88, *m=0xd4807bb0, bci=0
    YYY: ik2: method[0]: NOT matched
    YYY: ik2: cmp to method[1]: &m=0x224fac, *m=0x0, bci=0
    YYY: ik2: method[1]: NOT matched
    YYY: ik2: cmp to method[2]: &m=0x224fd0, *m=0x0, bci=0
    YYY: ik2: method[2]: NOT matched
    YYY: ik2: adding method[1]: &m=0x224fac, *m=0xd48c5ec8, **m=0xd4807bb3, bci=0
    RedefineClasses-0x8000000: ik2: method[1]: sleep((J)V), bci=0
    #
    # An unexpected error has been detected by Java Runtime Environment:
    #
    #  SIGSEGV (0xb) at pc=0xfeac79b4, pid=23423, tid=3

Notice that the method in cache entry[0] has already completed
forwarding which is why we don't get a match with this lookup. So the
lookup() code adds the partially forwarded methodOop that we have in
hand to the cache. However, since the cache entries have already
completed forwarding, the entry we just added will not be adjusted. The
existence of the forwardee address in the mark word will cause us to
blow up later.

Verifying a few things in the debugger. Here is the
"methodHandle method" parameter:

    (dbx) print method
    method = {
        method._handle                = 0x385a86c
    }
    (dbx) x 0x385a86c/X
    0x0385a86c:      0xd48c5ec8
    (dbx) x 0xd48c5ec8/X
    0xd48c5ec8:      0xd4807bb3

And here is cache entry[0]:

(dbx) print _array[0]
_array[0] = {
    _allocation                    = -235802127
    _array[0]._method              = 0xd4807bb0
    _bci                           = 0
    _mask_size                     = 2
    _expression_stack_size         = 0
    _bit_mask                      = (0, 0)
    _resource_allocate_bit_mask    = true
}

The debugger agrees with the debug output. Let's go find
where the methodHandle parameter came from.

  [10] instanceKlass::mask_for(this = 0xd8814e38, method = CLASS, bci = 0, entry_for = 0xfbbfed64), line 670 in "instanceKlass.cpp"
  [11] methodOopDesc::mask_for(this = 0xd48c5ec8, bci = 0, mask = 0xfbbfed64), line 161 in "methodOop.cpp"

methodOopDesc::mask_for() allocates a local methodHandle:

    Thread* myThread    = Thread::current();
    methodHandle h_this(myThread, this);

    (dbx) print h_this
    h_this = {
        h_this._handle                = 0x385a86c
    }
    (dbx) x 0x385a86c/X
    0x0385a86c:      0xd48c5ec8
    (dbx) x 0xd48c5ec8/X
    0xd48c5ec8:      0xd4807bb3

Our current thread is the VMThread so that is where the handle is
allocated, but we are processing the oops in a JavaThread.

  [12] frame::oops_interpreted_do(this = 0xfbbfef64, f = 0xff08d738, map = 0xfbbfef80, query_oop_map_cache = true), line 967 in "frame.cpp"
  [13] frame::oops_do_internal(this = 0xfbbfef64, f = 0xff08d738, map = 0xfbbfef80, use_interpreter_oop_map_cache = true), line 1211 in "frame.cpp"
  [14] frame::oops_do(this = 0xfbbfef64, f = 0xff08d738, map = 0xfbbfef80), line 399 in "frame.hpp"
  [15] JavaThread::oops_do(this = 0x134400, f = 0xff08d738), line 2311 in "thread.cpp"

Just for grins, which JavaThread are we processing?

    (dbx) print this
    this = 0x134400

Looks like the JavaThread is t@23 which is in a
JVM_Sleep()/java.lang.Thread.sleep() call.

  [16] Threads::oops_do(f = 0xff08d738), line 3452 in "thread.cpp"

Here is Threads::oops_do():

    ALL_JAVA_THREADS(p) {
      p->oops_do(f);
    }
    VMThread::vm_thread()->oops_do(f);

So we haven't done the VMThread's oops yet which is why the VMThread's
methodHandle is only forwarded, but not yet adjusted.

  [17] SharedHeap::process_strong_roots(this = 0x50c50, collecting_perm_gen = true, so = SO_AllClasses, roots = 0xff08d738, perm_blk = 0xff08d738), line 115 in "sharedHeap.cpp"
  [18] GenCollectedHeap::gen_process_strong_roots(this = 0x50c50, level = 1, younger_gens_as_roots = false, collecting_perm_gen = true, so = SO_AllClasses, older_gens = 0xff08d738, not_older_gens = 0xff08d738), line 635 in "genCollectedHeap.cpp"
  [19] GenMarkSweep::mark_sweep_phase3(level = 1), line 339 in "genMarkSweep.cpp"
  [20] GenMarkSweep::invoke_at_safepoint(level = 1, rp = 0xab730, clear_all_softrefs = false), line 75 in "genMarkSweep.cpp"
  [21] OneContigSpaceCardGeneration::collect(this = 0x4c958, full = true, clear_all_soft_refs = false, size = 0, is_tlab = false), line 406 in "generation.cpp"
  [22] TenuredGeneration::collect(this = 0x4c958, full = true, clear_all_soft_refs = false, size = 0, is_tlab = false), line 304 in "tenuredGeneration.cpp"
  [23] GenCollectedHeap::do_collection(this = 0x50c50, full = true, clear_all_soft_refs = false, size = 0, is_tlab = false, max_level = 1), line 529 in "genCollectedHeap.cpp"
  [24] GenCollectedHeap::do_full_collection(this = 0x50c50, clear_all_soft_refs = false, max_level = 1), line 838 in "genCollectedHeap.cpp"
  [25] VM_GenCollectFull::doit(this = 0xfab7f368), line 147 in "vmGCOperations.cpp"
  [26] VM_Operation::evaluate(this = 0xfab7f368), line 49 in "vm_operations.cpp"  [27] VMThread::evaluate_operation(this = 0xb4000, op = 0xfab7f368), line 336 in "vmThread.cpp"
  [28] VMThread::loop(this = 0xb4000), line 445 in "vmThread.cpp"
  [29] VMThread::run(this = 0xb4000), line 251 in "vmThread.cpp"

The complete thread dump is attached as threads.log.34.
                                     
2007-08-30
SUGGESTED FIX

Please see the attached 6599425-webrev-cr0.tgz file for
the proposed fix. This proposed fix is relative to 1.6.0_04.
                                     
2007-12-19
SUGGESTED FIX

<note deleted>
                                     
2007-12-20
SUGGESTED FIX

Please see the attached 6599425-webrev-cr1-all.tgz and
6599425-webrev-cr1-delta.tgz files for the revised version
of the proposed fix after Code Review Round 0. This proposed 
fix is relative to 1.6.0_04.
                                     
2008-01-17
SUGGESTED FIX

The fix for this bug (6599425) has been forward ported to
Dolphin/JDK7-B22, 160_10-baseline, and Dolphin/JDK7 main/baseline.
See 6599425-webrev-cr1-160_10.tgz and 6599425-webrev-cr1-170_main.tgz.
                                     
2008-01-18



Hardware and Software, Engineered to Work Together