Bug ID: JDK-7024234 jvmti tests fail assert(!_oops_are_stale) failed: oops are stale on Win-AMD64

Details
Type: Bug
Status: Closed
Resolution: Fixed
Priority: P2
Project Name: JDK
Component: hotspot
Sub-Component: jvmti
OS: windows
CPU: x86
Affected Versions: hs21
Fixed Versions: hs21 (b05)
Submit Date: 2011-03-03
Updated Date: 2011-04-24
Resolved Date: 2011-04-24

Description
The following VM/NSK tests failed in the 2011.03.02 nightly:

    nsk/jvmti/RedefineClasses/redefclass028
    nsk/jvmti/RedefineClasses/redefclass029
    nsk/jvmti/RedefineClasses/redefclass030
    nsk/jvmti/scenarios/capability/CM02/cm02t001
    nsk/jvmti/scenarios/events/EM02/em02t003
    nsk/jvmti/scenarios/events/EM07/em07t002
    nsk/jvmti/scenarios/hotswap/HS101/hs101t001
    nsk/jvmti/scenarios/hotswap/HS101/hs101t002
    nsk/jvmti/scenarios/hotswap/HS101/hs101t003
    nsk/jvmti/scenarios/hotswap/HS101/hs101t004
    nsk/jvmti/scenarios/hotswap/HS101/hs101t006
    nsk/jvmti/scenarios/hotswap/HS102/hs102t001
    nsk/jvmti/scenarios/hotswap/HS102/hs102t002
    nsk/jvmti/scenarios/multienv/MA10/ma10t006

The assertion failure looks like:

#  Internal Error (c:\temp\jprt\p3\b\161826.zg131198\source\src\share\vm\code/nmethod.hpp:449), pid=3592, tid=4472
#  assert(!_oops_are_stale) failed: oops are stale
#
# JRE version: 7.0-b131
# Java VM: Java HotSpot(TM) 64-Bit Server VM (21.0-b03-internal-201103021618.zg131198.hotspot-fastdebug compiled mode windows-amd64 )


The hs_err_pid file does not contain a stack trace.

                                    

Comments
EVALUATION

This failure mode stems from our (Keith's and my) misunderstanding
of what it means to lock an nmethod. From a more careful reading of
the code, locking an nmethod locks the nmethod itself (of course) and
the compiled code associated with the nmethod. As Tom stated, it does
not prevent said nmethod from becoming a zombie where the methodOop
and other interesting parts are history...

Zombification of an nmethod is prevented when there are activations
of the nmethod on some thread's stack. We also need to prevent
zombification of the nmethod when the nmethod is referenced by an
event that is enqueued on the ServiceThread. I guess that can be
considered a pseudo-activation.
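
The pseudo-activation idea can be sketched with a toy model. This is only an illustration under stated assumptions, not the HotSpot code: ToyNMethod, ToyLocker and try_make_zombie are hypothetical names, and the only identifier borrowed from the real sources is is_locked_by_vm. The sketch assumes the sweeper-style transition consults both real stack activations and the lock count before zombifying.

```cpp
#include <cassert>

// Toy model only: these names are hypothetical, not HotSpot's real classes.
enum class ToyState { Alive, NotEntrant, Zombie };

struct ToyNMethod {
    ToyState state = ToyState::NotEntrant;
    int lock_count = 0;         // references held by the VM or a queued event
    int stack_activations = 0;  // frames currently executing this code

    bool is_locked_by_vm() const { return lock_count > 0; }

    // Sweeper-style transition: refuse while anything still uses the nmethod.
    // Assumption of this sketch: a held lock counts like an activation.
    bool try_make_zombie() {
        if (stack_activations > 0 || is_locked_by_vm()) {
            return false;
        }
        state = ToyState::Zombie;
        return true;
    }
};

// RAII guard standing in for the reference held by an enqueued event.
class ToyLocker {
public:
    explicit ToyLocker(ToyNMethod& nm) : nm_(nm) { ++nm_.lock_count; }
    ~ToyLocker() { --nm_.lock_count; }
    ToyLocker(const ToyLocker&) = delete;
    ToyLocker& operator=(const ToyLocker&) = delete;
private:
    ToyNMethod& nm_;
};
```

While a ToyLocker is alive, try_make_zombie() returns false and the state is unchanged; once the guard goes out of scope the transition is allowed again.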
                                     
2011-03-03
EVALUATION

Tom's theory is spot on:

On 3/3/2011 11:05 AM, Tom Rodriguez wrote:
> The failure occurs in the Service Thread so I think this is
> caused by the changes Keith made to unload notification.  The
> nmethodLocker stuff keeps the nmethod from being freed but it
> can't stop the nmethod from becoming zombie.  Once it's zombie
> the oops are no longer scanned and can become invalid.  The
> method load event reads the oops to report the inlining
> information so I think the nmethod is being made zombie,
> probably because of deopt, before the compiled method load has
> even been posted.  Anyway, that's my theory.
>
> tom
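
A toy model of the race Tom describes, for illustration only. ModelNMethod, post_compiled_method_load and the placeholder inlining string are hypothetical and do not come from the HotSpot sources. The model assumes, as in the theory above, that the lock is honored when freeing but is not consulted when the nmethod is made a zombie.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an nmethod; not the real HotSpot class.
struct ModelNMethod {
    int lock_count = 0;           // held by the queued load event
    bool zombie = false;
    bool oops_are_stale = false;
    std::string inlining_info = "placeholder inlining record";

    // As in the theory: zombification ignores the lock entirely.
    void make_zombie() {
        zombie = true;
        oops_are_stale = true;    // oops are no longer scanned after this
    }

    // Freeing does honor the lock, so the memory itself stays valid.
    bool can_free() const { return zombie && lock_count == 0; }

    // Stand-in for the assert(!_oops_are_stale) check on oop access.
    const std::string& read_oops() const {
        if (oops_are_stale) {
            throw std::logic_error("oops are stale");
        }
        return inlining_info;
    }
};

// Models the event post: it must read the oops to report inlining info.
bool post_compiled_method_load(const ModelNMethod& nm) {
    try {
        nm.read_oops();
        return true;
    } catch (const std::logic_error&) {
        return false;
    }
}
```

With lock_count held at 1, a call to make_zombie() before the post leaves can_free() false, yet post_compiled_method_load() still fails, which is the shape of the failure reported here.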

I've verified the zombie nature of the nmethod at the point that
the ServiceThread tries to post the CompiledMethodLoad event.
Fortunately, the zombie nmethod did not try to eat my brains... :-)

I tried making the failure more reproducible by adding a sleep just
before the event posting point in the ServiceThread, but that seemed
to make the failure harder to reproduce. Go figure.
                                     
2011-03-03
SUGGESTED FIX

See attached 7024234_7024970-webrev-cr[012].tgz files for
code review round 0, round 1 and round 2. The bits from
code review round 2 were sent out to the OpenJDK aliases
as a heads up that these changes are coming in OpenJDK7/HSX-21.
                                     
2011-03-14
SUGGESTED FIX

One minor tweak after the changes shown in 7024234_7024970-webrev-cr2.tgz.
Here are the context diffs:

$ diff -c src/share/vm/code/nmethod.hpp.cr2 src/share/vm/code/nmethod.hpp
*** src/share/vm/code/nmethod.hpp.cr2   Mon Mar 14 09:00:33 2011
--- src/share/vm/code/nmethod.hpp       Tue Mar 15 06:27:13 2011
***************
*** 527,533 ****
   public:
    // When true is returned, it is unsafe to remove this nmethod even if
    // it is a zombie, since the VM or the ServiceThread might still be
!   // using it. Should only be called from a safepoint.
    bool is_locked_by_vm() const                    { return _lock_count >0; }
  
    // See comment at definition of _last_seen_on_stack
--- 527,533 ----
   public:
    // When true is returned, it is unsafe to remove this nmethod even if
    // it is a zombie, since the VM or the ServiceThread might still be
!   // using it.
    bool is_locked_by_vm() const                    { return _lock_count >0; }
  
    // See comment at definition of _last_seen_on_stack
                                     
2011-03-15
EVALUATION

http://hg.openjdk.java.net/jdk7/hotspot-rt/hotspot/rev/216d916d5c12
                                     
2011-03-16
EVALUATION

http://hg.openjdk.java.net/jdk7/hotspot/hotspot/rev/216d916d5c12
                                     
2011-03-17


