JDK-8278239 : vmTestbase/nsk/jvmti/RedefineClasses/StressRedefine failed with EXCEPTION_ACCESS_VIOLATION at 0x000000000000000d
  • Type: Bug
  • Component: hotspot
  • Sub-Component: jvmti
  • Affected Version: 18
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • OS: windows
  • CPU: x86_64
  • Submitted: 2021-12-03
  • Updated: 2022-01-03
  • Resolved: 2021-12-22
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 17 JDK 18 JDK 19
17.0.3-oracleFixed 18 masterFixed 19Fixed
Related Reports
Relates :  
Relates :  
Description
The following test failed in the JDK18 CI:

vmTestbase/nsk/jvmti/RedefineClasses/StressRedefine/TestDescription.java

Here's a snippet from the log file:

Got expected exception: java.lang.reflect.InvocationTargetException
Got expected exception: java.lang.reflect.InvocationTargetException
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x000000000000000d, pid=43520, tid=48028
#
# JRE version: Java(TM) SE Runtime Environment (18.0+26) (fastdebug build 18-ea+26-1792)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 18-ea+26-1792, mixed mode, sharing, tiered, compressed class ptrs, z gc, windows-amd64)
# Problematic frame:
# C  0x000000000000000d
#
# Core dump will be written. Default location: T:\\testoutput\\test-support\\jtreg_open_test_hotspot_jtreg_vmTestbase_nsk_jvmti_quick\\scratch\\0\\hs_err_pid43520.mdmp
#
# An error report file with more information is saved as:
# T:\\testoutput\\test-support\\jtreg_open_test_hotspot_jtreg_vmTestbase_nsk_jvmti_quick\\scratch\\0\\hs_err_pid43520.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#
s() is successfully done
>>>>>>>> Invoke RedefineClasses():
	new class byte count=2287
t:\\workspace\\open\\test\\hotspot\\jtreg\\vmTestbase\\nsk\\jvmti\\RedefineClasses\\StressRedefine\\stressRedefine.cpp: Failed to call RedefineClasses():
	the function returned error 60: JVMTI_ERROR_INVALID_CLASS_FORMAT
	For more info about this error see the JVMTI spec.
>>>>>>>> Invoke RedefineClasses():
	new class byte count=2287
<<<<<<<< RedefineClasses() is successfully done
>>>>>>>> Invoke RedefineClasses():
	new class byte count=2287
<<<<<<<< RedefineClasses() is successfully done
>>>>>>>> Invoke RedefineClasses():
	new class byte count=2287
<<<<<<<< RedefineClasses() is successfully done
>>>>>>>> Invoke RedefineClasses():
	new class byte count=2287
<<<<<<<< RedefineClasses() is successfully done
t:\\workspace\\open\\test\\hotspot\\jtreg\\vmTestbase\\nsk\\jvmti\\RedefineClasses\\StressRedefine\\stressRedefine.cpp: Failed to call RedefineClasses():
	the function returned error 60: JVMTI_ERROR_INVALID_CLASS_FORMAT
	For more info about this error see the JVMTI spec.
<<<<<<<< RedefineClasses() is successfully done
<<<<<<<< RedefineClasses() is successfully done
----------System.err:(11/3315)*----------


Here's the crashing thread's stack:

---------------  T H R E A D  ---------------

Current thread (0x00000275aeaa4e60):  WorkerThread "ZWorker#0" [stack: 0x00000093d3500000,0x00000093d3600000] [id=48028]

Stack: [0x00000093d3500000,0x00000093d3600000],  sp=0x00000093d35ff4a8,  free space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  0x000000000000000d


siginfo: EXCEPTION_ACCESS_VIOLATION (0xc0000005), data execution prevention violation at address 0x000000000000000d

That looks like a bad jump or call to 0x000000000000000d. Ouch!

There are a couple of other unresolved bugs for this same test:

JDK-8269633 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefine/TestDescription.java crashed with oopDesc::size_given_klass+0x6

JDK-8274154 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefine/TestDescription.java timed out
Comments
Changeset: 2be3e7ef Author: Coleen Phillimore <coleenp@openjdk.org> Date: 2021-12-22 17:19:00 +0000 URL: https://git.openjdk.java.net/jdk18/commit/2be3e7ef1cff1aae6faf1f4f0545d561af48d0ba
22-12-2021

A pull request was submitted for review. URL: https://git.openjdk.java.net/jdk18/pull/63 Date: 2021-12-21 20:51:04 +0000
21-12-2021

This isn't a bug in JDK 11 or below. Also the code is very different from the current code, but it's still not a bug on JDK 11. It's a bug on JDK 17 and should be backported there.
21-12-2021

A pull request was submitted for review. URL: https://git.openjdk.java.net/jdk/pull/6900 Date: 2021-12-20 22:59:39 +0000
20-12-2021

This is a bug that happens when RedefineClasses runs after mark_end but before ZGC has unlinked the nmethods. VM_RedefineClasses::doit checks all nmethods that are alive, but not those that are unloading. See: - MetadataOnStackMark -> CodeCache::metadata_do - flush_dependent_code -> mark_all_nmethods_for_evol_deoptimization - (probably more) at the end of doit() we then call ClassLoaderDataGraph::clean_deallocate_lists(false);. Since metadata in unloading nmethods were not visited, they will be deallocated by that call! After this, the ZGC nmethod unlinking code will walk these deallocated metadata and crash here: void ZUnload::unlink() { ... ZNMethod::unlink(_workers, unloading_occurred); I think we need to reevaluate what nmethods are visited by RedefineClasses.
17-12-2021

This can be a ZGC or Compiler bug. I moving this to GC team for evaluation. Feel free to move it back if you think it is not a ZGC issue.
17-12-2021

Just to add that I realised that we have the nmethod at 0x00000275be08a810 name as it is referenced in the hs_err: R8 =0x00000275be08b170 is at entry_point+1904 in (nmethod*)0x00000275be08a810 Compiled method (c2) 33467 2603 4 java.lang.invoke.LambdaForm$MH/0x0000000800d62000::invoke (86 bytes) ... dependencies [0x00000275be08b170,0x00000275be08b198] = 40 Where that dependencies data is: 0x275be08b170 : 0x1101040103010201 0x275be08b178 : 0xa01090106010501 0x275be08b180 : 0xe010d010c010b01 0x275be08b188 : 0x1007101507020f01 0x275be08b190 : 0x0 0x275be08b198 : 0xa000000001
16-12-2021

Kevin Walls helped by extracting more info with hs_err_pid43520.mdmp. Below are extra details from Kevin. The failing stack trace is: 000000000000000d() Unknown > jvm.dll!Dependencies::DepStream::argument(int i) Line 945 C++ jvm.dll!Dependencies::DepStream::argument(int i) Line 941 C++ jvm.dll!Dependencies::DepStream::context_type() Line 973 C++ jvm.dll!nmethod::flush_dependencies(bool delete_immediately) Line 1580 C++ [Inline Frame] jvm.dll!ZNMethodUnlinkClosure::_unlink(nmethod *) Line 293 C++ jvm.dll!ZNMethodUnlinkClosure::do_nmethod(nmethod * nm) Line 325 C++ jvm.dll!ZNMethodTableIteration::nmethods_do(NMethodClosure * cl) Line 69 C++ jvm.dll!ZNMethodUnlinkTask::work() Line 368 C++ jvm.dll!ZTask::Task::work(unsigned int worker_id) Line 36 C++ [Inline Frame] jvm.dll!WorkerTaskDispatcher::worker_run_task() Line 67 C++ jvm.dll!WorkerThread::run() Line 159 C++ jvm.dll!Thread::call_run() Line 363 C++ jvm.dll!thread_native_entry(Thread * thread) Line 544 C++ the method we crash in: Metadata* Dependencies::DepStream::argument(int i) { Metadata* result = recorded_metadata_at(argument_index(i)); if (result == NULL) { // Explicit context argument can be compressed int ctxkj = dep_context_arg(type()); // -1 if no explicit context arg if (ctxkj >= 0 && i == ctxkj && ctxkj+1 < argument_count()) { 941 result = ctxk_encoded_as_null(type(), argument(ctxkj+1)); } } 945 assert(result == NULL || result->is_klass() || result->is_method(), "must be"); <---- crash/wild jump in this line... return result; } This disassembly of that last line makes sense, it says we have a bad "result" pointer. 945: assert(result == NULL || result->is_klass() || result->is_method(), "must be"); 00007FFED31D8600 48 85 C0 test rax,rax 00007FFED31D8603 74 3E je Dependencies::DepStream::argument+123h (07FFED31D8643h) null check 00007FFED31D8605 48 8B 13 mov rdx,qword ptr [rbx] 00007FFED31D8608 48 8B CB mov rcx,rbx 00007FFED31D860B FF 52 08 call qword ptr [rdx+8] ...bad jump for result->is_klass() so likely a bad result value. 00007FFED31D860E 84 C0 test al,al Source of bad jump loaded from RBX 00000277C8736760 ..which does contain 0xd which we jump to. The nmethod being processed earlier in the stack is 0x00000275be08a810 I can dump the fields of this in the SA, but no obvious name. I thought that was in constmethoddata, but that field does not work in this nmethod... At the crash, rax and rbx both hold 00000277C8736760 ..which is the result of: result = ctxk_encoded_as_null(type(), argument(ctxkj+1)); ..and is readable data, but not a valid nmethod.
16-12-2021

Made it P3 as it can be a regression.
09-12-2021

The test intentionally redefines with the corrupted bytecodes, so RedefineClasses returns errors: JVMTI_ERROR_INVALID_CLASS_FORMAT and JVMTI_ERROR_FAILS_VERIFICATION However, these errors normally should not cause crashes. The crash happens on a ZGC thread, so at least, it seems that the ZGC is missing some assert. The hs_err dump has no stack trace, so it is not easy to understand the context of the crash. I wonder if we need a fix on ZGC side.
09-12-2021

There are several JVMTI errors in the log like below (I'm not sure if they are expected): >>>>>>>> Invoke RedefineClasses(): new class byte count=2287 t:\\workspace\\open\\test\\hotspot\\jtreg\\vmTestbase\\nsk\\jvmti\\RedefineClasses\\StressRedefine\\stressRedefine.cpp: Failed to call RedefineClasses(): the function returned error 60: JVMTI_ERROR_INVALID_CLASS_FORMAT For more info about this error see the JVMTI spec. >>>>>>>> Invoke RedefineClasses(): new class byte count=2287 t:\\workspace\\open\\test\\hotspot\\jtreg\\vmTestbase\\nsk\\jvmti\\RedefineClasses\\StressRedefine\\stressRedefine.cpp: Failed to call RedefineClasses(): the function returned error 62: JVMTI_ERROR_FAILS_VERIFICATION For more info about this error see the JVMTI spec. The hs_err dump does not have a stack trace. Current thread is: Current thread (0x00000275aeaa4e60): WorkerThread "ZWorker#0" [stack: 0x00000093d3500000,0x00000093d3600000] [id=48028] This can be a ZGC problem.
07-12-2021