JDK-6729714 : intermittent failures of assert(orig_key != 0L,"jni weak reference cleared!!")
  • Type: Bug
  • Component: hotspot
  • Sub-Component: jvmti
  • Affected Version: hs12,7
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • OS: generic,solaris
  • CPU: generic
  • Submitted: 2008-07-24
  • Updated: 2012-02-01
  • Resolved: 2011-01-11
Related Reports
Duplicate :  
Relates :  
Relates :  
Relates :  
Relates :  
Description
The test is
     nsk/jdi/stress/serial/heapwalking002 

See for example nightly result:

/java/sqe/results/vm/gtee/JDK7/NIGHTLY/VM/2008-07-21/GC_Baseline-XXParallel/vm/solaris-amd64/server/mixed/vm-solaris-amd64_server_mixed_nsk.quick-jdi.testlist2008-07-21-22-51-58/ResultDir/heapwalking001/heapwalking001.log

Also, test 
      nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001 
has been seen to fail with the same error.


The error is 
[2008-07-20T12:42:19.37] debugee.stderr> Debuggee nsk.share.jdi.AbstractJDIDebuggee : sending the command: ready
[2008-07-20T13:08:20.46] debugee.stdout> # To suppress the following error report, specify this argument
[2008-07-20T13:08:20.46] debugee.stdout> # after -XX: or in .hotspotrc:  SuppressErrorAt=/jvmtiTagMap.cpp:280
[2008-07-20T13:08:20.46] debugee.stdout> #
[2008-07-20T13:08:20.46] debugee.stdout> # A fatal error has been detected by the Java Runtime Environment:
[2008-07-20T13:08:20.46] debugee.stdout> #
[2008-07-20T13:08:20.46] debugee.stdout> #  Internal Error (/tmp/jprt-jprtadm/P1/B/172814.iv159533/source/src/share/vm/prims/jvmtiTagMap.cpp:280), pid=15473, tid=16
[2008-07-20T13:08:20.46] debugee.stdout> #  Error: assert(orig_key != 0L,"jni weak reference cleared!!")
[2008-07-20T13:08:20.46] debugee.stdout> #
[2008-07-20T13:08:20.46] debugee.stdout> # Java VM: OpenJDK 64-Bit Server VM (14.0-b01-2008-07-17-172814.iv159533.numa-solaris-config-fastdebug mixed mode solaris-sparc )
[2008-07-20T13:08:20.46] debugee.stdout> # An error report file with more information is saved as:
[2008-07-20T13:08:20.46] debugee.stdout> # /export/local/6653.JDK7.NIGHTLY.VM+solaris-sparcv9_server_mixed_nsk.quick-jdi.testlist/results/ResultDir/VMOutOfMemoryException001/hs_err_pid15473.log
[2008-07-20T13:08:20.46] debugee.stdout> #
[2008-07-20T13:08:20.46] debugee.stdout> # If you would like to submit a bug report, please visit:
[2008-07-20T13:10:50.49] debugee.stdout> #   http://java.sun.com/webapps/bugreport/crash.jsp
[2008-07-20T13:10:50.49] debugee.stdout> #
[2008-07-20T13:10:50.49] debugee.stdout> VM option '-PrintVMOptions'
[2008-07-20T13:10:50.49] debugee.stdout> VM option '+UseParallelGC'
[2008-07-20T13:10:50.49] debugee.stdout> VM option '+UseParallelOldGC'

At least so far, the failure has only been seen wihen the debuggee is run with -XX:+UseParallelOldGC.
The failure is intermittent.

Update: This original sighting occurred in the 2008.07.21 nightly
and used HSX-14-B01 bits. That is after the fix for 6539517 was
available so there is still some issue with the Parallel collectors
at this point in time.
Test 
	nsk/jdi/ObjectReference/referringObjects/referringObjects003
failed with the same assertion during PIT for hs14b08.

Link to the log:
http://sqeweb.sfbay.sun.com/nfs/comp/vm/misc/results/vm/gtee/HSX/PIT/VM/14/b08/jdk7b41/fastdebug/vm/windows-i586/client/comp/windows-i586_client_comp_nsk.jdi.testlist/ResultDir/referringObjects003/referringObjects003.log

Since test was run on windows with flag '-client' SerialGC should be used, so looks like this failure isn't specific for -XX:+UseParallelOldGC.

Update: This sighting occurred in the HSX-14-B08 PIT so the
bug introduced by 6711316 is present in the GC framework
collectors. It is very likely that this sighting is fixed by
the work on 6862295.
Another sighting in nightly testing for another test:

New nsk.quick-jdi failures (from 2008.12.17)
*   nsk/jdi/stress/serial/mixed001
        This test failed the following assertion on Win32 Server VM
        (machine cheesepuff):

            Internal Error (src/share/vm/prims/jvmtiTagMap.cpp:280)
            Error: assert(orig_key != 0,"jni weak reference cleared!!")

        This failure mode is covered by the following bug:

            6729714 4/4 nsk/jdi/stress/serial/heapwalking002 fails

        I will copy this entry to 6729714.

Update: This sighting occurred in the 2008.12.17 nightly so the
bug introduced by 6711316 is present in the GC framework
collectors. It is very likely that this sighting is fixed by
the work on 6862295.
nsk/jdi/stress/serial/heapwalking002 continues to fail in
Parallel GC, but with different failure modes:

http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2009-07-20/GC_Baseline-XXParallel/vm/solaris-sparcv9/server/mixed/solaris-sparcv9_server_mixed_nsk.quick-jdi.testlist/ResultDir/heapwalking002/heapwalking002.log


[2009-07-21T14:19:44.93] debugee.stderr> # ERROR: Unexpected exception in debuggee: nsk.share.TestBug: Test execution error: WeakReference was collected
[2009-07-21T14:19:44.93] # ERROR: Unexpected exception during test execution(debugger: nsk.jdi.ReferenceType.instances.instances003.instances003@8f0c85e): java.lang.NullPointerException
[2009-07-21T14:19:44.93] java.lang.NullPointerException
[2009-07-21T14:19:44.93] 	at nsk.share.jdi.TestDebuggerType2.isDebuggeeReady(TestDebuggerType2.java:218)
[2009-07-21T14:19:44.93] 	at nsk.jdi.ReferenceType.instances.instances003.instances003.testClass(instances003.java:82)
[2009-07-21T14:19:44.93] 	at nsk.jdi.ReferenceType.instances.instances003.instances003.doTest(instances003.java:105)
[2009-07-21T14:19:44.93] 	at nsk.share.jdi.SerialExecutionDebugger.executeTests(SerialExecutionDebugger.java:271)
[2009-07-21T14:19:44.93] 	at nsk.share.jdi.SerialExecutionDebugger.doTest(SerialExecutionDebugger.java:210)
[2009-07-21T14:19:45.07] 	at nsk.share.jdi.TestDebuggerType2.runIt(TestDebuggerType2.java:192)
[2009-07-21T14:19:45.07] 	at nsk.share.jdi.SerialExecutionDebugger.main(SerialExecutionDebugger.java:42)
[2009-07-21T14:19:45.07] # ERROR: Unexpected exception: java.lang.NullPointerException
[2009-07-21T14:19:45.07] java.lang.NullPointerException
[2009-07-21T14:19:45.07] 	at nsk.share.jdi.TestDebuggerType2.isDebuggeeReady(TestDebuggerType2.java:218)
[2009-07-21T14:19:45.07] 	at nsk.share.jdi.SerialExecutionDebugger.executeTests(SerialExecutionDebugger.java:292)
[2009-07-21T14:19:45.07] 	at nsk.share.jdi.SerialExecutionDebugger.doTest(SerialExecutionDebugger.java:210)
[2009-07-21T14:19:45.07] 	at nsk.share.jdi.TestDebuggerType2.runIt(TestDebuggerType2.java:192)
[2009-07-21T14:19:45.07] 	at nsk.share.jdi.SerialExecutionDebugger.main(SerialExecutionDebugger.java:42)
[2009-07-21T14:19:45.07] # ERROR: TEST FAILED: debuggee's process finished with status: 97
[2009-07-21T14:19:45.07] TEST FAILED


http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2009-06-06/GC_Baseline-XXParallel/vm/solaris-sparcv9/server/mixed/solaris-sparcv9_server_mixed_nsk.quick-jdi.testlist/ResultDir/heapwalking002/heapwalking002.log

[2009-06-07T14:02:29.35] # ERROR: Wrong value was returned  by VirtualMachine.instanceCounts(java.lang.String): 1183, expected: >= 1222
[2009-06-07T14:02:29.35] # ERROR: Unexpected size of referenceType.instances(java.lang.String): 1183, expected: >= 1222
[2009-06-07T14:02:29.35] # ERROR: Unexpected exception during test execution(debugger: nsk.jdi.ReferenceType.instances.instances003.instances003@528acf6e): java.lang.IndexOutOfBoundsException: Index: 11, Size: 11
[2009-06-07T14:02:29.35] java.lang.IndexOutOfBoundsException: Index: 11, Size: 11
[2009-06-07T14:02:29.35] 	at java.util.ArrayList.rangeCheck(ArrayList.java:570)
[2009-06-07T14:02:29.35] 	at java.util.ArrayList.get(ArrayList.java:348)
[2009-06-07T14:02:29.35] 	at nsk.jdi.ReferenceType.instances.instances003.instances003.testClass(instances003.java:77)
[2009-06-07T14:02:29.37] 	at nsk.jdi.ReferenceType.instances.instances003.instances003.doTest(instances003.java:105)
[2009-06-07T14:02:29.37] 	at nsk.share.jdi.SerialExecutionDebugger.executeTests(SerialExecutionDebugger.java:271)
[2009-06-07T14:02:29.37] 	at nsk.share.jdi.SerialExecutionDebugger.doTest(SerialExecutionDebugger.java:210)
[2009-06-07T14:02:29.37] 	at nsk.share.jdi.TestDebuggerType2.runIt(TestDebuggerType2.java:192)
[2009-06-07T14:02:29.37] 	at nsk.share.jdi.SerialExecutionDebugger.main(SerialExecutionDebugger.java:42)
[2009-06-07T14:02:29.37] debugee.stderr> Debuggee: received the command: quit
[2009-06-07T14:02:29.37] debugee.stderr> Debuggee: exiting
[2009-06-07T14:02:29.37] debugee.stderr> Debuggee nsk.share.jdi.HeapwalkingDebuggee@15af33d6 finished successfully
[2009-06-07T14:02:29.37] debugee.stderr> Debuggee nsk.share.jdi.SerialExecutionDebuggee : sending the command: ready
[2009-06-07T14:02:29.37] debugee.stderr> Debuggee: received the command: COMMAND_CLEAR_DEBUGGEE
[2009-06-07T14:02:29.37] debugee.stderr> Debuggee nsk.share.jdi.SerialExecutionDebuggee : sending the command: ready
[2009-06-07T14:02:29.37] WARNING: stop test execution because of timeout (max execution time for single test: 119101, time left: 0)
[2009-06-07T14:02:29.37] debugee.stderr> Debuggee: received the command: quit
[2009-06-07T14:02:29.37] debugee.stderr> Debuggee: exiting
[2009-06-07T14:02:29.77] Debuggee's process finished with status: 95
[2009-06-07T14:02:29.77] TEST FAILED


Update: This entry is now being tracked by 6867612.
nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001 continues
to fail in Parallel GC, but with a different failure mode:

http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2009-07-30/GC_Baseline-XXParallel/vm/solaris-i586/client/mixed/solaris-i586_client_mixed_nsk.quick-jdi.testlist/ResultDir/VMOutOfMemoryException001/VMOutOfMemoryException001.log

[2009-07-31T07:16:46.56] binder> Launching debugee
[2009-07-31T07:16:46.56] binder> Waiting for VM initialized
[2009-07-31T07:16:50.75] Initial VMStartEvent received: VMStartEvent in thread main
[2009-07-31T07:16:50.75] debugee.stderr> Debuggee nsk.share.jdi.AbstractJDIDebuggee : sending the command: ready
[2009-07-31T07:16:50.75] Got expected exception: com.sun.jdi.VMOutOfMemoryException
[2009-07-31T07:16:50.75] # ERROR: TEST FAILED: debuggee's process finished with status: 1
[2009-07-31T07:16:50.75] TEST FAILED


The exit status of 1 indicates an ungraceful termination due
to an out of memory condition. My search showed failures
like this on Solaris X86, Solaris AMD64, and Solaris SPARC


Update: This entry is now being tracked by 6867617.
Nightly search of Parallel GC did not find any failures for this test:

    nsk/jdi/ObjectReference/referringObjects/referringObjects003

Looks like nightly data goes back to 2008.08.01 so this failure
didn't show up outside of HSX-14-B08 PIT.

Nightly search of Parallel GC found no valid failures for this test:

    nsk/jdi/stress/serial/mixed001

but but it did find one general timeout.

nsk/jdi/stress/serial/mixed002 continues to fail in
Parallel GC, but with different failure modes:

http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2009-03-09/GC_Baseline-XXParallel/vm/solaris-amd64/server/mixed/solaris-amd64_server_mixed_nsk.quick-jdi.testlist/ResultDir/mixed002/mixed002.log

[2009-03-10T09:41:42.43] debugee.stderr> Debuggee nsk.share.jdi.HeapwalkingDebuggee : sending the command: ready
[2009-03-10T09:41:42.43] # ERROR: Wrong value was returned  by VirtualMachine.instanceCounts(nsk.share.jdi.TestClass1): 7, expected: 50
[2009-03-10T09:41:42.43] # ERROR: Unexpected size of referenceType.instances(nsk.share.jdi.TestClass1): 7, expected: 50
[2009-03-10T09:41:42.43] # ERROR: Unexpected exception during test execution(debugger: nsk.jdi.ReferenceType.instances.instances003.instances003@27b15692): java.lang.IndexOutOfBoundsException: Index: 7, Size: 7
[2009-03-10T09:41:42.43] java.lang.IndexOutOfBoundsException: Index: 7, Size: 7
[2009-03-10T09:41:42.43] 	at java.util.ArrayList.rangeCheck(ArrayList.java:571)
[2009-03-10T09:41:42.43] 	at java.util.ArrayList.get(ArrayList.java:349)
[2009-03-10T09:41:42.43] 	at nsk.jdi.ReferenceType.instances.instances003.instances003.testClass(instances003.java:77)
[2009-03-10T09:41:42.43] 	at nsk.jdi.ReferenceType.instances.instances003.instances003.doTest(instances003.java:105)
[2009-03-10T09:41:42.43] 	at nsk.share.jdi.SerialExecutionDebugger.executeTests(SerialExecutionDebugger.java:271)
[2009-03-10T09:41:42.43] 	at nsk.share.jdi.SerialExecutionDebugger.doTest(SerialExecutionDebugger.java:210)
[2009-03-10T09:41:42.43] 	at nsk.share.jdi.TestDebuggerType2.runIt(TestDebuggerType2.java:192)
[2009-03-10T09:41:42.43] 	at nsk.share.jdi.SerialExecutionDebugger.main(SerialExecutionDebugger.java:42)
[2009-03-10T09:41:42.43] debugee.stderr> Debuggee: received the command: quit
[2009-03-10T09:41:42.43] debugee.stderr> Debuggee: exiting


http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2009-02-11/GC_Baseline-XXParallel/vm/solaris-i586/server/mixed/solaris-i586_server_mixed_nsk.quick-jdi.testlist/ResultDir/mixed002/mixed002.log

[2009-02-12T10:43:51.84] debugee.stderr> Debuggee: received the command: deleteInstances:java.lang.String:50
[2009-02-12T10:43:51.84] debugee.stderr> # ERROR: Unexpected exception in debuggee: java.util.NoSuchElementException
[2009-02-12T10:43:51.84] debugee.stderr> java.util.NoSuchElementException
[2009-02-12T10:43:51.84] debugee.stderr> 	at java.util.ArrayList$Itr.next(ArrayList.java:757)
[2009-02-12T10:43:51.84] debugee.stderr> 	at nsk.share.ObjectInstancesManager.deleteAllReferrers(ObjectInstancesManager.java:102)
[2009-02-12T10:43:51.96] debugee.stderr> 	at nsk.share.jdi.HeapwalkingDebuggee.parseCommand(HeapwalkingDebuggee.java:115)
[2009-02-12T10:43:51.96] debugee.stderr> 	at nsk.share.jpda.AbstractDebuggeeTest.doTest(AbstractDebuggeeTest.java:284)
[2009-02-12T10:43:51.96] debugee.stderr> 	at nsk.share.jdi.SerialExecutionDebuggee.parseCommand(SerialExecutionDebuggee.java:92)
[2009-02-12T10:43:51.96] debugee.stderr> 	at nsk.share.jpda.AbstractDebuggeeTest.doTest(AbstractDebuggeeTest.java:284)
[2009-02-12T10:43:51.96] debugee.stderr> 	at nsk.share.jpda.AbstractDebuggeeTest.doTest(AbstractDebuggeeTest.java:269)
[2009-02-12T10:43:51.96] debugee.stderr> 	at nsk.share.jdi.SerialExecutionDebuggee.main(SerialExecutionDebuggee.java:23)
[2009-02-12T10:43:51.96] debugee.stderr>


Update: The mixed002 failure in this entry is now being tracked by 6867612.
Nightly search of Parallel GC did not find any valid failures
for this test after the G1 push:

    nsk/jdi/stress/serial/heapwalking001

The one failure was an infrastructure timeout.


Nightly search of Parallel GC did not find any valid failures
for this test:

    nsk/jdi/ReferenceType/instances/instances005


Nightly search of Parallel GC did not find any failures
for these tests:

    closed/compiler/6507107/HeapwalkingTest.java
    nsk/jdi/VirtualMachine/instanceCounts/instancecounts004
The following VM/NSK test is also 
failing due to this assertion:

    nsk/hprof/regression/HeapThrasher

http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2009-07-30/RT_Baseline/vm/linux-i586/client/mixed/linux-i586_client_mixed_nsk.hprof.testlist/analysis.html
http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2009-07-30/RT_Baseline/vm/solaris-i586/client/mixed/solaris-i586_client_mixed_nsk.hprof.testlist/analysis.html
http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2009-07-30/RT_Baseline/vm/solaris-sparc/client/mixed/solaris-sparc_client_mixed_nsk.hprof.testlist/analysis.html
http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2009-07-30/RT_Baseline/vm/solaris-sparcv9/server/mixed/solaris-sparcv9_server_mixed_nsk.hprof.testlist/analysis.html
http://sqeweb.sfbay.sun.com/nfs/tools/gtee/results/JDK7/NIGHTLY/VM/2009-07-30/RT_Baseline/vm/windows-i586/client/mixed/windows-i586_client_mixed_nsk.hprof.testlist/ResultDir/HeapThrasher/HeapThrasher.log

Comments
EVALUATION As of the fix for 6458402, the mechanism for storing object tags has been reworked and this inconsistency between the JvmtiTagMap and the JNI weak handles cannot occur. The indicated assert does not even exist anymore, so I'm closing this CR as a dup of 6458402.
11-01-2011

EVALUATION The proposed fix (and cleanup) from Keith McGuigan for 6458402 should also fix the failures reported here.
31-12-2010

EVALUATION The work on the following bug: 6862295 2/3 JDWP threadid changes during debugging session (leading to ignored breakpoints) has shown that the JVM/TI Object Tag mapping code is sensitive to incorrect full GC count values. Bug 6862295 was used to fix a full GC count issue in the GC framework collectors: -XX:+UseSerialGC, -XX:+UseConcMarkSweepGC or -Xconcgc, -Xincgc, and -XX:+UseParNewGC The bug fixed by 6862295 was introduced by the following bug fix: 6711316 4/3 Open source the Garbage-First garbage collector 6711316 was fixed in HSX-14-B06 which was integrated in JDK7-B39 and promoted on 2008.11.06. So the bug fixed by 6862295 has been around in the GC framework collectors since then. The following bug was used to fix a full GC count issue in the -XX:+UseParallelGC and -XX:+UseParallelOldGC: 6539517 3/3 CR 6186200 should be extended to perm gen allocation to prevent spurious OOM's from perm gen 6539517 was fixed in HSX-13-B01 which was integrated in JDK7-B27 and promoted on 2008.05.22. So the -XX:+UseParallelGC and -XX:+UseParallelOldGC have been issuing correct full GC counts since then. There is an open GC bug for tracking GC counter issues: 6470420 4/5 gc invocation count confusion
31-07-2009

EVALUATION The assertion suggests that a JNI weak ref has been cleared and the tag map is now out of sync. It is possible there is a code path with the new collector that is missing a JvmtiGCMarker.
25-07-2008