JDK-8248851 : CMS: Missing memory fences between free chunk check and klass read
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version:
    8u251,openjdk8u272,11.0.7-oracle,11.0.9,13.0.5
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: aarch64
  • Submitted: 2020-07-06
  • Updated: 2020-08-13
  • Resolved: 2020-07-21
Fix versions:
  • JDK 11: 11.0.10-oracle (Fixed)
  • JDK 13: 13.0.5 (Fixed)
  • Other: openjdk8u272 (Fixed)
Description
We have been seeing random JVM crashes in one of our production environments.
We were using an aarch64 jdk8u release build with -XX:+UseConcMarkSweepGC.
We have seen three different crash logs [1][2][3].

Debugging shows that the crashes are caused by missing memory fences on systems with a weak memory model, such as aarch64.
For the first crash log, we found that on aarch64 the klass load may be scheduled before the free chunk check in CompactibleFreeListSpace::block_size().
The reader may then see an invalid non-null klass, which leads to the crash.
The same issue exists in CompactibleFreeListSpace::block_is_obj(), which leads to crash logs [2] and [3].
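
For illustration only, the ordering problem can be sketched in standalone C++11. This is not HotSpot code: Block, kStaleBits, mark_allocated(), install_klass() and klass_if_object() are hypothetical names, and the writer-side protocol shown here is an assumption made to keep the example self-contained. The std::atomic_thread_fence(std::memory_order_acquire) call plays a role comparable to the OrderAccess::acquire() call in the proposed fix: it keeps the klass load from being satisfied before the free chunk check.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a CMS heap block; this is not HotSpot's layout.
constexpr std::uintptr_t kStaleBits = 0xdead;  // leftover free-list data

struct Block {
  // Nonzero while the block is still a free chunk.
  std::atomic<std::uintptr_t> free_mark{1};
  // While the block is free, this slot holds unrelated free-list bits.
  std::atomic<std::uintptr_t> klass_word{kStaleBits};
};

// Writer, step 1 (assumed protocol): scrub the klass slot, then publish
// "not free" with release semantics so the scrub becomes visible first.
void mark_allocated(Block& b) {
  b.klass_word.store(0, std::memory_order_relaxed);
  b.free_mark.store(0, std::memory_order_release);
}

// Writer, step 2: install the real klass once the object is initialized.
void install_klass(Block& b, std::uintptr_t klass) {
  assert(klass != 0 && klass != kStaleBits);
  b.klass_word.store(klass, std::memory_order_release);
}

// Reader: returns 0 for a free or not yet initialized block, else the klass.
std::uintptr_t klass_if_object(const Block& b) {
  if (b.free_mark.load(std::memory_order_relaxed) != 0) {
    return 0;  // still a free chunk
  }
  // Without this fence a weakly ordered CPU may satisfy the klass load
  // before the free check and hand back kStaleBits.
  std::atomic_thread_fence(std::memory_order_acquire);
  return b.klass_word.load(std::memory_order_relaxed);
}
```

In this sketch a reader that observes free_mark == 0 and then passes the acquire fence can only see a klass word of 0 or a value stored by install_klass(), never kStaleBits. Removing the fence leaves the two loads unordered with respect to each other, which matches the reordering described above.
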
Proposed fix for jdk8u:

diff -r 93cfec0cf417 src/share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp
--- a/src/share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp   Sat Jul 04 00:02:00 2020 +0200
+++ b/src/share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp   Mon Jul 06 21:36:06 2020 +0800
@@ -994,6 +994,11 @@
         return res;
       }
     } else {
+      // Bugfix for systems with weak memory model (AARCH64).
+      // Acquire to make sure that the klass read happens after the free
+      // chunk check.
+      OrderAccess::acquire();
+
       // must read from what 'p' points to in each loop.
       Klass* k = ((volatile oopDesc*)p)->klass_or_null();
       if (k != NULL) {
@@ -1049,6 +1054,11 @@
         return res;
       }
     } else {
+      // Bugfix for systems with weak memory model (AARCH64).
+      // Acquire to make sure that the klass read happens after the free
+      // chunk check.
+      OrderAccess::acquire();
+
       // must read from what 'p' points to in each loop.
       Klass* k = ((volatile oopDesc*)p)->klass_or_null();
       // We trust the size of any object that has a non-NULL
@@ -1111,6 +1121,12 @@
   // assert(CollectedHeap::use_parallel_gc_threads() || _bt.block_start(p) == p,
   //        "Should be a block boundary");
   if (FreeChunk::indicatesFreeChunk(p)) return false;
+
+  // Bugfix for systems with weak memory model (AARCH64).
+  // Acquire to make sure that the klass read happens after the free
+  // chunk check.
+  OrderAccess::acquire();
+
   Klass* k = oop(p)->klass_or_null();
   if (k != NULL) {
     // Ignore mark word because it may have been used to

[1].
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000ffffb2f320e8, pid=49265, tid=0x0000ffffb16a41e0
#
# JRE version: OpenJDK Runtime Environment (8.0_222-b10) (build 1.8.0_222)
# Java VM: OpenJDK 64-Bit Server VM (25.222-b10 mixed mode linux-aarch64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x4650e8]  CompactibleFreeListSpace::block_size(HeapWord const*) const+0x110
#
# Core dump written. Default location: /home/vsbo/vsbo_container/modules/vsbo/logs/core or core.49265
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Stack: [0x0000ffffb14a5000,0x0000ffffb16a5000],  sp=0x0000ffffb16a3450,  free space=2041k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x4650e8]  CompactibleFreeListSpace::block_size(HeapWord const*) const+0x110
V  [libjvm.so+0x2e6984]  BlockOffsetArrayNonContigSpace::block_start_unsafe(void const*) const+0xcc
V  [libjvm.so+0x959494]  CardTableModRefBS::process_stride(Space*, MemRegion, int, int, OopsInGenClosure*, CardTableRS*, signed char**, unsigned long, unsigned long)+0x214
V  [libjvm.so+0x95991c]  CardTableModRefBS::non_clean_card_iterate_parallel_work(Space*, MemRegion, OopsInGenClosure*, CardTableRS*, int)+0xbc
V  [libjvm.so+0x3bff44]  CardTableModRefBS::non_clean_card_iterate_possibly_parallel(Space*, MemRegion, OopsInGenClosure*, CardTableRS*)+0x54
V  [libjvm.so+0x3c01c0]  CardTableRS::younger_refs_in_space_iterate(Space*, OopsInGenClosure*)+0x70
V  [libjvm.so+0x4afb68]  ConcurrentMarkSweepGeneration::younger_refs_iterate(OopsInGenClosure*)+0x58
V  [libjvm.so+0x5df528]  GenCollectedHeap::gen_process_roots(int, bool, bool, GenCollectedHeap::ScanningOption, bool, OopsInGenClosure*, OopsInGenClosure*, CLDClosure*)+0xf8
V  [libjvm.so+0x95c54c]  ParNewGenTask::work(unsigned int)+0x144
V  [libjvm.so+0xba78b8]  GangWorker::loop()+0xe8
V  [libjvm.so+0x937414]  java_start(Thread*)+0x11c
C  [libpthread.so.0+0x78bc]  start_thread+0x19c

[2].
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000ffff7fd607a8, pid=28547, tid=0x0000ffff0c7e31e0
#
# JRE version: OpenJDK Runtime Environment (8.0_242-b08) (build 1.8.0_242)
# Java VM: OpenJDK 64-Bit Server VM (25.242-b08 mixed mode linux-aarch64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x4707a8]  FreeListSpace_DCTOC::walk_mem_region_with_cl_par(MemRegion, HeapWord*, HeapWord*, FilteringClosure*)+0x170
#
Stack: [0x0000ffff0c5e4000,0x0000ffff0c7e4000],  sp=0x0000ffff0c7e2250,  free space=2040k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x4707a8]  FreeListSpace_DCTOC::walk_mem_region_with_cl_par(MemRegion, HeapWord*, HeapWord*, FilteringClosure*)+0x170
V  [libjvm.so+0x470d64]  FreeListSpace_DCTOC::walk_mem_region_with_cl(MemRegion, HeapWord*, HeapWord*, FilteringClosure*)+0x6c
V  [libjvm.so+0xab5d5c]  Filtering_DCTOC::walk_mem_region(MemRegion, HeapWord*, HeapWord*)+0x2f4
V  [libjvm.so+0xab31fc]  DirtyCardToOopClosure::do_MemRegion(MemRegion)+0x104
V  [libjvm.so+0x3c9860]  ClearNoncleanCardWrapper::do_MemRegion(MemRegion)+0xe8
V  [libjvm.so+0x960fcc]  CardTableModRefBS::process_stride(Space*, MemRegion, int, int, OopsInGenClosure*, CardTableRS*, signed char**, unsigned long, unsigned long)+0x1bc
V  [libjvm.so+0x9614ac]  CardTableModRefBS::non_clean_card_iterate_parallel_work(Space*, MemRegion, OopsInGenClosure*, CardTableRS*, int)+0xbc
V  [libjvm.so+0x3c943c]  CardTableModRefBS::non_clean_card_iterate_possibly_parallel(Space*, MemRegion, OopsInGenClosure*, CardTableRS*)+0x54
V  [libjvm.so+0x3c96b8]  CardTableRS::younger_refs_in_space_iterate(Space*, OopsInGenClosure*)+0x70
V  [libjvm.so+0x4b8858]  ConcurrentMarkSweepGeneration::younger_refs_iterate(OopsInGenClosure*)+0x58
V  [libjvm.so+0x5e94d0]  GenCollectedHeap::gen_process_roots(int, bool, bool, GenCollectedHeap::ScanningOption, bool, OopsInGenClosure*, OopsInGenClosure*, CLDClosure*)+0xf8
V  [libjvm.so+0x96422c]  ParNewGenTask::work(unsigned int)+0x144
V  [libjvm.so+0xbc7af0]  GangWorker::loop()+0xe8
V  [libjvm.so+0x93e6ec]  java_start(Thread*)+0x11c
C  [libpthread.so.0+0x78bc]  start_thread+0x19c

[3].
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000ffffa07c82bc, pid=34768, tid=0x0000ffff2aff71e0
#
# JRE version: OpenJDK Runtime Environment (8.0_252-b09) (build 1.8.0_252)
# Java VM: OpenJDK 64-Bit Server VM (25.252-b09 mixed mode linux-aarch64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0xab92bc]  DirtyCardToOopClosure::get_actual_top(HeapWord*, HeapWord*)+0xcc
#
Stack: [0x0000ffff2adf8000,0x0000ffff2aff8000],  sp=0x0000ffff2aff63a0,  free space=2040k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xab92bc]  DirtyCardToOopClosure::get_actual_top(HeapWord*, HeapWord*)+0xcc
V  [libjvm.so+0xab97ac]  DirtyCardToOopClosure::do_MemRegion(MemRegion)+0xb4
V  [libjvm.so+0x3cba90]  ClearNoncleanCardWrapper::do_MemRegion(MemRegion)+0xe8
V  [libjvm.so+0x96615c]  CardTableModRefBS::process_stride(Space*, MemRegion, int, int, OopsInGenClosure*, CardTableRS*, signed char**, unsigned long, unsigned long)+0x1bc
V  [libjvm.so+0x96663c]  CardTableModRefBS::non_clean_card_iterate_parallel_work(Space*, MemRegion, OopsInGenClosure*, CardTableRS*, int)+0xbc
V  [libjvm.so+0x3cb66c]  CardTableModRefBS::non_clean_card_iterate_possibly_parallel(Space*, MemRegion, OopsInGenClosure*, CardTableRS*)+0x54
V  [libjvm.so+0x3cb8e8]  CardTableRS::younger_refs_in_space_iterate(Space*, OopsInGenClosure*)+0x70
V  [libjvm.so+0x4ba980]  ConcurrentMarkSweepGeneration::younger_refs_iterate(OopsInGenClosure*)+0x58
V  [libjvm.so+0x5ec2d0]  GenCollectedHeap::gen_process_roots(int, bool, bool, GenCollectedHeap::ScanningOption, bool, OopsInGenClosure*, OopsInGenClosure*, CLDClosure*)+0xf8
V  [libjvm.so+0x9693ac]  ParNewGenTask::work(unsigned int)+0x144
V  [libjvm.so+0xbcd698]  GangWorker::loop()+0xe8
V  [libjvm.so+0x9438c4]  java_start(Thread*)+0x11c
C  [libpthread.so.0+0x78bc]  start_thread+0x19c
Comments
Thanks [~yan] for approving this. I will push later.
30-07-2020

Hi [~fyang], please push to jdk13u-dev! Thank you.
30-07-2020

URL: https://hg.openjdk.java.net/jdk-updates/jdk11u/rev/ae52898b6f0d User: clanger Date: 2020-07-29 08:06:34 +0000
29-07-2020

URL: https://hg.openjdk.java.net/jdk-updates/jdk11u-dev/rev/ae52898b6f0d User: fyang Date: 2020-07-21 14:40:41 +0000
21-07-2020

Hi [~clanger], Thanks for approving this. Will push this to 11u-dev: http://cr.openjdk.java.net/~fyang/8248851-11u/webrev.01/
21-07-2020

Hi [~fyang], I've just approved this for 11u and also changed the version to 11-pool. When you push to 11u-dev with JDK-8248851 in the commit message, this bug should get resolved. Further pushes of the changeset should trigger backport items.
21-07-2020

Fix Request [13u] Reviewing thread: https://mail.openjdk.java.net/pipermail/jdk8u-dev/2020-July/012079.html http://cr.openjdk.java.net/~fyang/8248851-13u/webrev.01/
17-07-2020

Fix Request [11u] Reviewing thread: https://mail.openjdk.java.net/pipermail/jdk8u-dev/2020-July/012079.html http://cr.openjdk.java.net/~fyang/8248851-11u/webrev.01/
17-07-2020

Fix Request [8u] Reviewing thread: https://mail.openjdk.java.net/pipermail/jdk8u-dev/2020-July/012079.html Webrev: http://cr.openjdk.java.net/~fyang/8248851-8u/webrev.01/
17-07-2020