JDK-8254874 : ZGC: JNIHandleBlock verification failure in stack watermark processing
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 16
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2020-10-16
  • Updated: 2024-12-09
  • Resolved: 2020-10-19
JDK 11: 11.0.17-oracle (Fixed)
JDK 16: 16 b21 (Fixed)
Description
#  Internal Error (/home/stefank/git/alt2/open/src/hotspot/share/gc/z/zVerify.cpp:328), pid=2724460, tid=2724527
#  assert(!ZAddress::is_good(ZOop::to_address(o))) failed: Should not be good: 0x00001000010222a8

V  [libjvm.so+0x1a5181b]  ZVerifyBadOopClosure::do_oop(oop*)+0xeb
V  [libjvm.so+0xfd3418]  JNIHandleBlock::oops_do(OopClosure*)+0x58
V  [libjvm.so+0x184ef9c]  JavaThread::oops_do_no_frames(OopClosure*, CodeBlobClosure*)+0x3c
V  [libjvm.so+0x1a50a12]  ZVerify::verify_thread_head_bad(JavaThread*)+0x22
V  [libjvm.so+0x1a3ca38]  ZStackWatermark::start_processing_impl(void*)+0x28
V  [libjvm.so+0x171eece]  StackWatermark::start_processing()+0x5e
V  [libjvm.so+0x16a9851]  SafepointMechanism::process_if_requested_slow(JavaThread*)+0x31
V  [libjvm.so+0x1208446]  JvmtiRawMonitor::simple_wait(Thread*, long)+0x896
V  [libjvm.so+0x1208adf]  JvmtiRawMonitor::raw_wait(long, Thread*)+0x6f
V  [libjvm.so+0x11cd87a]  JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*, long)+0x3a
C  [libcm03t001.so+0x952d]  rawMonitorWait+0xd
C  [libcm03t001.so+0x9688]  nsk_jvmti_waitForSync+0x68
C  [libcm03t001.so+0xd9c7]  agentProc+0x2157
C  [libcm03t001.so+0x9491]  agentThreadWrapper+0x91
V  [libjvm.so+0x1200b94]  JvmtiAgentThread::call_start_function()+0x1d4

Running:
while makec ../build/fastdebug/ test-only TEST=vmTestbase/nsk/jvmti/scenarios/capability/CM03/cm03t001/TestDescription.java JTREG="JAVA_OPTIONS=-XX:+UseZGC -XX:+ClassUnloading -Xmx2g -XX:ZCollectionInterval=1 -XX:ZFragmentationLimit=0.01" JTREG_EXTRA_PROBLEM_LISTS=ProblemList-zgc.txt; do : ; done

With patch to make ZCollectionInterval measured in ms instead of seconds:
diff --git a/src/hotspot/share/gc/z/zDirector.cpp b/src/hotspot/share/gc/z/zDirector.cpp
index 345d202e063..8ebc24eaa74 100644
--- a/src/hotspot/share/gc/z/zDirector.cpp
+++ b/src/hotspot/share/gc/z/zDirector.cpp
@@ -56,9 +56,9 @@ bool ZDirector::rule_timer() const {
 
   // Perform GC if timer has expired.
   const double time_since_last_gc = ZStatCycle::time_since_last();
-  const double time_until_gc = ZCollectionInterval - time_since_last_gc;
+  const double time_until_gc = double(ZCollectionInterval) / 1000 - time_since_last_gc;
 
-  log_debug(gc, director)("Rule: Timer, Interval: %us, TimeUntilGC: %.3fs",
+  log_debug(gc, director)("Rule: Timer, Interval: %ums, TimeUntilGC: %.3fs",
                           ZCollectionInterval, time_until_gc);
 
   return time_until_gc <= 0;
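
The effect of the patch on the timer rule can be checked with a small standalone sketch. This is not HotSpot code: the function names and the `interval_in_ms` switch are hypothetical, and only the arithmetic visible in the hunk above is modelled. It suggests why the patch helps reproduce the crash: with -XX:ZCollectionInterval=1 the patched build treats the interval as 1 ms, so the rule fires almost immediately after each cycle.

```cpp
// Hypothetical stand-in for the arithmetic in ZDirector::rule_timer().
// interval:           value passed as -XX:ZCollectionInterval
// interval_in_ms:     true models the patched build, false the unpatched one
// time_since_last_gc: seconds elapsed since the previous GC cycle
double time_until_gc(unsigned interval, bool interval_in_ms,
                     double time_since_last_gc) {
  // Patched build: interpret the flag as milliseconds and convert to seconds.
  const double interval_seconds =
      interval_in_ms ? double(interval) / 1000 : double(interval);
  return interval_seconds - time_since_last_gc;
}

// The rule requests a GC once the timer has expired.
bool rule_timer(unsigned interval, bool interval_in_ms,
                double time_since_last_gc) {
  return time_until_gc(interval, interval_in_ms, time_since_last_gc) <= 0;
}
```

For example, 5 ms after the last cycle, `rule_timer(1, true, 0.005)` requests a GC, while the unpatched interpretation `rule_timer(1, false, 0.005)` still has roughly 0.995 s to wait.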

Comments
A pull request was submitted for review. URL: https://git.openjdk.org/jdk11u-dev/pull/1433 Date: 2022-10-11 07:52:49 +0000
11-10-2022

Fix request [11u] I backport this for parity with 11.0.18-oracle. No risk, only a test change. Clean backport. SAP nightly testing passed.
11-10-2022

Changeset: 672f5669
Author: Stefan Karlsson <stefank@openjdk.org>
Date: 2020-10-19 07:22:29 +0000
URL: https://git.openjdk.java.net/jdk/commit/672f5669
19-10-2020

This is most likely a test bug. The test creates a local JNI handle in its prepare function and then uses that handle in callbacks from other threads. This is invalid use of local handles according to: https://docs.oracle.com/en/java/javase/15/docs/specs/jni/design.html#global-and-local-references

The problem we encounter is that we apply a load barrier to the handle and thereby self-heal another thread's local handle. When that thread later starts stack processing, it fails the pre-verification check that all of its oops should be bad.
16-10-2020
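
The failure mechanism described in the comment above can be illustrated with a toy model. Everything in it is a hypothetical simplification: a single "good" bit stands in for ZGC's pointer colouring, `LocalHandleBlock` stands in for a thread's JNIHandleBlock, and the oop value is made up. It is a sketch of the reasoning, not of the actual HotSpot implementation.

```cpp
#include <cstdint>
#include <vector>

// Toy model only: one bit marks a pointer as "good" (already healed).
constexpr uint64_t kGoodBit = 1ull << 40;

inline bool is_good(uint64_t oop) { return (oop & kGoodBit) != 0; }

// Stand-in for one thread's block of local JNI handle slots.
struct LocalHandleBlock {
  std::vector<uint64_t> slots;
};

// Load barrier with self-healing: the healed value is written back
// into the slot it was loaded from.
inline uint64_t load_barrier(uint64_t* slot) {
  if (!is_good(*slot)) {
    *slot |= kGoodBit;
  }
  return *slot;
}

// Pre-verification run when the owning thread starts stack processing:
// none of its slots should have been healed yet.
inline bool verify_all_bad(const LocalHandleBlock& block) {
  for (uint64_t oop : block.slots) {
    if (is_good(oop)) return false;
  }
  return true;
}

// Models another thread resolving a handle created by the owner thread.
// use_global_ref == false: the other thread dereferences the owner's local
// slot directly (the invalid pattern). use_global_ref == true: it uses a
// separate slot that lives outside the owner's block.
bool owner_verification_passes(bool use_global_ref) {
  LocalHandleBlock owner_block;
  owner_block.slots.push_back(0x10222a8);       // made-up bad oop
  uint64_t global_slot = owner_block.slots[0];  // storage outside the block
  uint64_t* slot = use_global_ref ? &global_slot : &owner_block.slots[0];
  load_barrier(slot);                           // executed by the other thread
  return verify_all_bad(owner_block);
}
```

In this model `owner_verification_passes(false)` fails because the other thread healed a slot inside the owner's block, while `owner_verification_passes(true)` succeeds. In real JNI code, the mechanism for sharing a reference across threads is a global reference created with NewGlobalRef and released with DeleteGlobalRef.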