JDK-8288936 : Wrong lock ordering writing G1HeapRegionTypeChange JFR event
  • Type: Bug
  • Component: hotspot
  • Sub-Component: jfr
  • Affected Version: 11,17,21,22
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2022-06-22
  • Updated: 2024-04-04
  • Resolved: 2023-08-09
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 21 JDK 22
21.0.4-oracleFixed 22 b10Fixed
Related Reports
Relates :  
Description
When writing a G1HeapRegionTypeChange JFR event the following assertion can be triggered:
assert(false) failed: Attempting to acquire lock JfrMsg_lock/nosafepoint-3 out of order with lock FreeList_lock/service-1 -- possible deadlock

The associated stack trace is on Windows looks like this:
V  [jvm.dll+0xe486d3]  os::platform_print_native_stack+0x63  (os_windows_x86.cpp:235)
V  [jvm.dll+0x1140ae8]  VMError::report+0x1238  (vmError.cpp:835)
V  [jvm.dll+0x114334e]  VMError::report_and_die+0x6ce  (vmError.cpp:1683)
V  [jvm.dll+0x11439b1]  VMError::report_and_die+0x81  (vmError.cpp:1468)
V  [jvm.dll+0x60a39d]  report_vm_error+0x9d  (debug.cpp:284)
V  [jvm.dll+0xdd7ba7]  Mutex::check_rank+0x427  (mutex.cpp:444)
V  [jvm.dll+0xdd81e4]  Mutex::try_lock_inner+0x44  (mutex.cpp:170)
V  [jvm.dll+0xdd8025]  Mutex::try_lock+0x15  (mutex.cpp:182)
V  [jvm.dll+0x9de60f]  JfrMonitorTryLock::JfrMonitorTryLock+0x3f  (jfrTryLock.hpp:59)
V  [jvm.dll+0x9ddfb6]  JfrPostBox::asynchronous_post+0x66  (jfrPostBox.cpp:107)
V  [jvm.dll+0x9dddb1]  JfrPostBox::post+0x61  (jfrPostBox.cpp:82)
V  [jvm.dll+0x9e8a1d]  JfrStorage::register_full+0xed  (jfrStorage.cpp:284)
V  [jvm.dll+0x9ed8fb]  JfrMemorySpace<JfrStorage,JfrMspaceRetrieval,JfrLinkedList<JfrBuffer,JfrCHeapObj>,JfrLinkedList<JfrBuffer,JfrCHeapObj>,0>::register_full+0x2b  (jfrMemorySpace.inline.hpp:301)
V  [jvm.dll+0x9ea228]  JfrMspaceRetrieval<JfrMemorySpace<JfrStorage,JfrMspaceRetrieval,JfrLinkedList<JfrBuffer,JfrCHeapObj>,JfrLinkedList<JfrBuffer,JfrCHeapObj>,0> >::acquire<StopOnNullCondition<JfrLinkedList<JfrBuffer,JfrCHeapObj> > >+0x108  (jfrMemorySpaceRetrieval.hpp:59)
V  [jvm.dll+0x9ebfe7]  JfrMspaceRetrieval<JfrMemorySpace<JfrStorage,JfrMspaceRetrieval,JfrLinkedList<JfrBuffer,JfrCHeapObj>,JfrLinkedList<JfrBuffer,JfrCHeapObj>,0> >::acquire+0x87  (jfrMemorySpaceRetrieval.hpp:43)
V  [jvm.dll+0x9ebeaa]  JfrMemorySpace<JfrStorage,JfrMspaceRetrieval,JfrLinkedList<JfrBuffer,JfrCHeapObj>,JfrLinkedList<JfrBuffer,JfrCHeapObj>,0>::acquire+0x3a  (jfrMemorySpace.inline.hpp:241)
V  [jvm.dll+0x9eae78]  mspace_acquire_live<JfrMemorySpace<JfrStorage,JfrMspaceRetrieval,JfrLinkedList<JfrBuffer,JfrCHeapObj>,JfrLinkedList<JfrBuffer,JfrCHeapObj>,0> >+0x38  (jfrMemorySpace.inline.hpp:422)
V  [jvm.dll+0x9eaf11]  mspace_acquire_live_with_retry<JfrMemorySpace<JfrStorage,JfrMspaceRetrieval,JfrLinkedList<JfrBuffer,JfrCHeapObj>,JfrLinkedList<JfrBuffer,JfrCHeapObj>,0> >+0x91  (jfrMemorySpace.inline.hpp:428)
V  [jvm.dll+0x9e9388]  acquire_promotion_buffer+0x78  (jfrStorage.cpp:178)
V  [jvm.dll+0x9e7788]  JfrStorage::flush_regular_buffer+0x148  (jfrStorage.cpp:244)
V  [jvm.dll+0x9e78d8]  JfrStorage::flush_regular+0x58  (jfrStorage.cpp:400)
V  [jvm.dll+0x9e8d49]  JfrStorage::flush+0xe9  (jfrStorage.cpp:388)
V  [jvm.dll+0x9ab9e2]  JfrFlush::JfrFlush+0x52  (jfrFlush.cpp:35)
V  [jvm.dll+0x2d2e83]  Adapter<JfrFlush>::flush+0x73  (jfrStorageAdapter.hpp:84)
V  [jvm.dll+0x2d224e]  StorageHost<Adapter<JfrFlush>,StackObj>::accommodate+0x2e  (jfrStorageHost.inline.hpp:66)
V  [jvm.dll+0x2d2c13]  WriterHost<EncoderHost<BigEndianEncoderImpl,BigEndianEncoderImpl>,EncoderHost<Varint128EncoderImpl,BigEndianEncoderImpl>,MemoryWriterHost<Adapter<JfrFlush>,StackObj,ExclusiveAccessAssert> >::ensure_size+0x53  (jfrWriterHost.inline.hpp:169)
V  [jvm.dll+0x2d18c2]  WriterHost<EncoderHost<BigEndianEncoderImpl,BigEndianEncoderImpl>,EncoderHost<Varint128EncoderImpl,BigEndianEncoderImpl>,MemoryWriterHost<Adapter<JfrFlush>,StackObj,ExclusiveAccessAssert> >::write<unsigned __int64>+0x92  (jfrWriterHost.inline.hpp:77)
V  [jvm.dll+0x2d1923]  WriterHost<EncoderHost<BigEndianEncoderImpl,BigEndianEncoderImpl>,EncoderHost<Varint128EncoderImpl,BigEndianEncoderImpl>,MemoryWriterHost<Adapter<JfrFlush>,StackObj,ExclusiveAccessAssert> >::write<unsigned __int64>+0x23  (jfrWriterHost.inline.hpp:182)
V  [jvm.dll+0x925733]  EventG1HeapRegionTypeChange::writeData<EventWriterHost<EncoderHost<BigEndianEncoderImpl,BigEndianEncoderImpl>,EncoderHost<Varint128EncoderImpl,BigEndianEncoderImpl>,MemoryWriterHost<Adapter<JfrFlush>,StackObj,ExclusiveAccessAssert> > >+0x33  (jfrEventClasses.hpp:5368)
V  [jvm.dll+0x925ff8]  JfrEvent<EventG1HeapRegionTypeChange>::write_sized_event+0x148  (jfrEvent.hpp:250)
V  [jvm.dll+0x925e57]  JfrEvent<EventG1HeapRegionTypeChange>::write_event+0xa7  (jfrEvent.hpp:219)
V  [jvm.dll+0x925968]  JfrEvent<EventG1HeapRegionTypeChange>::commit+0x68  (jfrEvent.hpp:93)
V  [jvm.dll+0x9256d4]  HeapRegionTracer::send_region_type_change+0xb4  (heapRegionTracer.cpp:43)
V  [jvm.dll+0x913c0b]  HeapRegion::report_region_type_change+0x5b  (heapRegion.cpp:273)
V  [jvm.dll+0x914248]  HeapRegion::set_survivor+0x18  (heapRegion.cpp:170)
V  [jvm.dll+0x80f62f]  G1CollectedHeap::new_gc_alloc_region+0xef  (g1CollectedHeap.cpp:3273)
V  [jvm.dll+0x7e8024]  G1GCAllocRegion::allocate_new_region+0x84  (g1AllocRegion.cpp:351)
V  [jvm.dll+0x7e6bf2]  G1AllocRegion::new_alloc_region_and_allocate+0x172  (g1AllocRegion.cpp:139)
V  [jvm.dll+0x7ebb0c]  G1AllocRegion::attempt_allocation_using_new_region+0x3c  (g1AllocRegion.inline.hpp:106)
V  [jvm.dll+0x7ebabe]  G1AllocRegion::attempt_allocation_locked+0x5e  (g1AllocRegion.inline.hpp:100)
V  [jvm.dll+0x7e9073]  G1Allocator::survivor_attempt_allocation+0xe3  (g1Allocator.cpp:255)
V  [jvm.dll+0x7e9bd7]  G1Allocator::par_allocate_during_gc+0x57  (g1Allocator.cpp:230)
V  [jvm.dll+0x7e9afc]  G1Allocator::par_allocate_during_gc+0x4c  (g1Allocator.cpp:216)
V  [jvm.dll+0x7ea32e]  G1PLABAllocator::allocate_direct_or_new_plab+0x22e  (g1Allocator.cpp:362)
V  [jvm.dll+0x86f0a2]  G1ParScanThreadState::allocate_copy_slow+0x72  (g1ParScanThreadState.cpp:403)
V  [jvm.dll+0x86f361]  G1ParScanThreadState::do_copy_to_survivor_space+0x151  (g1ParScanThreadState.cpp:468)
V  [jvm.dll+0x870ce7]  G1ParScanThreadState::do_oop_evac<enum narrowOop>+0x147  (g1ParScanThreadState.cpp:211)
V  [jvm.dll+0x86f815]  G1ParScanThreadState::dispatch_task+0x45  (g1ParScanThreadState.cpp:289)
V  [jvm.dll+0x86fbfa]  G1ParScanThreadState::trim_queue_to_threshold+0x8a  (g1ParScanThreadState.cpp:311)
V  [jvm.dll+0x86c7fe]  G1ParScanThreadState::trim_queue_partially+0x3e  (g1ParScanThreadState.inline.hpp:54)
V  [jvm.dll+0x8965b9]  G1ScanHRForRegionClosure::scan_memregion+0x149  (g1RemSet.cpp:797)
V  [jvm.dll+0x893120]  G1ScanHRForRegionClosure::do_claimed_block+0x1b0  (g1RemSet.cpp:817)
V  [jvm.dll+0x892e24]  G1ScanHRForRegionClosure::do_card_block+0x54  (g1RemSet.cpp:825)
V  [jvm.dll+0x8963a6]  G1ScanHRForRegionClosure::scan_heap_roots+0x216  (g1RemSet.cpp:857)
V  [jvm.dll+0x893e27]  G1ScanHRForRegionClosure::do_heap_region+0xe7  (g1RemSet.cpp:902)
V  [jvm.dll+0x894e5e]  G1RemSetScanState::iterate_dirty_regions_from+0xae  (g1RemSet.cpp:420)
V  [jvm.dll+0x87fce6]  G1RemSet::scan_heap_roots+0x76  (g1RemSet.cpp:923)
V  [jvm.dll+0x8ad25c]  G1EvacuateRegionsTask::scan_roots+0x6c  (g1YoungCollector.cpp:667)
V  [jvm.dll+0x8ad5c5]  G1EvacuateRegionsBaseTask::work+0x95  (g1YoungCollector.cpp:653)
V  [jvm.dll+0x118cf46]  WorkerTaskDispatcher::worker_run_task+0x76  (workerThread.cpp:72)
V  [jvm.dll+0x118d4a0]  WorkerThread::run+0x30  (workerThread.cpp:164)
V  [jvm.dll+0x10a374c]  Thread::call_run+0x15c  (thread.cpp:365)
V  [jvm.dll+0xe40db8]  thread_native_entry+0x108  (os_windows.cpp:545)
C  [ucrtbase.dll+0x21bb2]
C  [KERNEL32.DLL+0x17034]
C  [ntdll.dll+0x52651]

The problem can be reproduced with a debug version of the VM with the following command line:
java -XX:StartFlightRecording=settings=g1-heap-region-change.jfc,flush-interval=1h -Xmx1G -Xms128M  -XX:+UseG1GC examples.Allocator

with the following settings for the g1-heap-region-change.jfc file:
<?xml version="1.0" encoding="UTF-8"?>
<configuration version="2.0" label="GC History Simulation" description="Should include events anaolg to our GC history file." provider="Oracle">

  <event name="jdk.G1HeapRegionTypeChange">
    <setting name="enabled">true</setting>
  </event>

</configuration>

and this Allocator code:
package examples;

public class Allocator {

	public static void main(String[] args) {
		long maxMem = 1024 * 1024 * 768;
		long minMem = 1024 * 1024 * 64;
		int maxOverwrite = 24;
		int chunkSize = 1024 * 10;
		byte[][] roots = new byte[1 + (int) (maxMem / chunkSize)][];
		long used = 0;
		int pos = 0;

		while (true) {
			int overwriteCount = (int) (Math.random() * maxOverwrite);

			while (used < maxMem) {
				for (int i = 0; i <= overwriteCount; ++i) {
					roots[pos] = new byte[chunkSize];
				}
				
				pos += 1;
				used += chunkSize;
			}
			
			long targetUse = minMem + (long) ((maxMem - minMem) * Math.random());

			while (used > targetUse) {
				pos -= 1;
				roots[pos] = null;
				used -= chunkSize;
			}
		}
	}
}

Comments
Fix request [21u] I backport this for parity with 21.0.4-oracle. Low to medium risk. Fiddles with locking, but only for jfr. Also it is well hung, but no test. Clean backport. SAP nightly testing passed.
04-04-2024

A pull request was submitted for review. URL: https://git.openjdk.org/jdk21u-dev/pull/437 Date: 2024-04-02 20:56:00 +0000
02-04-2024

It looks like the lock ordering problem with the FreeList_lock relative to the JfrMsg_lock goes back to JDK11. Back then, they were both ranked "leaf".
10-08-2023

Which JDK versions it affects? The last change to JfrMsg_lock ordering I see is JDK-8266936.
10-08-2023

Changeset: 0eb0997a Author: Markus Grönlund <mgronlun@openjdk.org> Date: 2023-08-09 13:34:04 +0000 URL: https://git.openjdk.org/jdk/commit/0eb0997ae4f81314b764241e69dae5c698dbb6c6
09-08-2023

Also ZGC has run into lock ordering problems: Stack: [0x0000009347e00000,0x0000009347f00000] Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [jvm.dll+0xc548d1] os::win32::platform_print_native_stack+0x101 (os_windows_x86.cpp:236) V [jvm.dll+0xef257c] VMError::report+0x146c (vmError.cpp:991) V [jvm.dll+0xef4975] VMError::report_and_die+0x645 (vmError.cpp:1797) V [jvm.dll+0xef5084] VMError::report_and_die+0x64 (vmError.cpp:1562) V [jvm.dll+0x53eabb] report_vm_error+0x5b (debug.cpp:191) V [jvm.dll+0xbf031d] Mutex::check_rank+0x22d (mutex.cpp:447) V [jvm.dll+0xbf0f9f] Mutex::try_lock+0x2f (mutex.cpp:182) V [jvm.dll+0x82e5eb] JfrPostBox::post+0xab (jfrPostBox.cpp:81) V [jvm.dll+0x839296] JfrStorage::register_full+0x126 (jfrStorage.cpp:282) V [jvm.dll+0x835891] JfrMspaceRetrieval<JfrMemorySpace<JfrStorage,JfrMspaceRetrieval,JfrLinkedList<JfrBuffer,JfrCHeapObj>,JfrLinkedList<JfrBuffer,JfrCHeapObj>,0> >::acquire<StopOnNullCondition<JfrLinkedList<JfrBuffer,JfrCHeapObj> > >+0xc1 (jfrMemorySpaceRetrieval.hpp:59) V [jvm.dll+0x835e00] mspace_acquire_live_with_retry<JfrMemorySpace<JfrStorage,JfrMspaceRetrieval,JfrLinkedList<JfrBuffer,JfrCHeapObj>,JfrLinkedList<JfrBuffer,JfrCHeapObj>,0> >+0x90 (jfrMemorySpace.inline.hpp:434) V [jvm.dll+0x8380e4] JfrStorage::flush_regular_buffer+0xe4 (jfrStorage.cpp:244) V [jvm.dll+0x837e3e] JfrStorage::flush_regular+0x8e (jfrStorage.cpp:400) V [jvm.dll+0x837cf5] JfrStorage::flush+0x195 (jfrStorage.cpp:388) V [jvm.dll+0x800fe4] JfrFlush::JfrFlush+0x34 (jfrFlush.cpp:35) V [jvm.dll+0x22af99] Adapter<JfrFlush>::flush+0x49 (jfrStorageAdapter.hpp:84) V [jvm.dll+0x22a761] StorageHost<Adapter<JfrFlush>,StackObj>::accommodate+0x31 (jfrStorageHost.inline.hpp:66) V [jvm.dll+0x22aecb] WriterHost<EncoderHost<BigEndianEncoderImpl,BigEndianEncoderImpl>,EncoderHost<Varint128EncoderImpl,BigEndianEncoderImpl>,MemoryWriterHost<Adapter<JfrFlush>,StackObj,ExclusiveAccessAssert> >::ensure_size+0x3b (jfrWriterHost.inline.hpp:171) V [jvm.dll+0x22a6d6] WriterHost<EncoderHost<BigEndianEncoderImpl,BigEndianEncoderImpl>,EncoderHost<Varint128EncoderImpl,BigEndianEncoderImpl>,MemoryWriterHost<Adapter<JfrFlush>,StackObj,ExclusiveAccessAssert> >::write<__int64>+0x26 (jfrWriterHost.inline.hpp:183) V [jvm.dll+0xf739b6] JfrEvent<EventZPageAllocation>::write_sized_event+0x206 (jfrEvent.hpp:241) V [jvm.dll+0xf73356] JfrEvent<EventZPageAllocation>::write_event+0x166 (jfrEvent.hpp:217) V [jvm.dll+0xf7149f] XPageAllocator::alloc_page+0x60f (xPageAllocator.cpp:697) V [jvm.dll+0xf532d7] XHeap::alloc_page+0x27 (xHeap.cpp:174) V [jvm.dll+0xf6f98e] XObjectAllocator::alloc_page+0x4e (xObjectAllocator.cpp:72) V [jvm.dll+0xf6f82c] XObjectAllocator::alloc_object_in_shared_page+0x6c (xObjectAllocator.cpp:102) V [jvm.dll+0xf6f4b5] XObjectAllocator::alloc_object+0x65 (xObjectAllocator.cpp:170) V [jvm.dll+0xf6f6de] XObjectAllocator::alloc_object_for_relocation+0x2e (xObjectAllocator.cpp:181) V [jvm.dll+0xf7c0f1] relocate_object_inner+0xd1 (xRelocate.cpp:68) V [jvm.dll+0xf7bf95] XRelocate::relocate_object+0x55 (xRelocate.cpp:100) V [jvm.dll+0xf45456] XBarrier::relocate+0xc6 (xBarrier.cpp:111) V [jvm.dll+0xf4555c] XBarrier::relocate_or_mark+0x2c (xBarrier.cpp:115) V [jvm.dll+0xf6ced3] XNMethodOopClosure::do_oop+0x253 (xNMethod.cpp:247) V [jvm.dll+0xf6d55a] XNMethod::nmethod_oops_do_inner+0xaa (xNMethod.cpp:228) V [jvm.dll+0xf6d414] XNMethod::nmethod_oops_barrier+0x24 (xNMethod.cpp:259) V [jvm.dll+0xf4cc41] XBarrierSetNMethod::nmethod_entry_barrier+0xa1 (xBarrierSetNMethod.cpp:63) V [jvm.dll+0xf80568] XOnStackCodeBlobClosure::do_code_blob+0x38 (xStackWatermark.cpp:43) V [jvm.dll+0x62c2e8] frame::oops_code_blob_do+0x1b8 (frame.cpp:984) V [jvm.dll+0xf80817] XStackWatermark::process+0x67 (xStackWatermark.cpp:99) V [jvm.dll+0xd69435] StackWatermarkFramesIterator::process_one+0x165 (stackWatermark.cpp:104) V [jvm.dll+0xd69838] StackWatermark::start_processing_impl+0xd8 (stackWatermark.cpp:218) V [jvm.dll+0xd68c3c] StackWatermark::on_safepoint+0x5c (stackWatermark.cpp:313) V [jvm.dll+0xd17db6] SafepointMechanism::process+0x56 (safepointMechanism.cpp:159) V [jvm.dll+0x31f866] SafepointMechanism::process_if_requested_with_exit_check+0x36 (safepointMechanism.inline.hpp:89) V [jvm.dll+0x7de327] JavaThread::check_special_condition_for_native_trans+0x67 (javaThread.cpp:1234)
04-08-2023

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/15155 Date: 2023-08-04 12:17:27 +0000
04-08-2023