United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
JDK-8004903 : VMThread::execute() calls Thread::check_for_valid_safepoint_state() on concurrent VM ops

Submit Date:
Updated Date:
Project Name:
Resolved Date:
Affected Versions:
Fixed Versions:
hs25 (b17)

Related Reports

Sub Tasks

During Karen's testing of the changes for 6607129, she ran into a crash with the following stack trace snippet:

  [5] VMError::report_and_die(this = 0xe587e654), line 891 in "vmError.cpp"
  [6] report_vm_error(file = 0xfef8c84f "/java/east/u2/kk1971/hssyncperf/src/share/vm/runtime/thread.cpp", line = 761, error_msg = 0xfecef355 "fatal error", detail_msg = 0xfef8c88f "Possible safepoint reached by thread that does not allow it"), line 176 in "debug.cpp"
  [7] report_fatal(file = 0xfef8c84f "/java/east/u2/kk1971/hssyncperf/src/share/vm/runtime/thread.cpp", line = 761, message = 0xfef8c88f "Possible safepoint reached by thread that does not allow it"), line 181 in "debug.cpp"
=>[8] Thread::check_for_valid_safepoint_state(this = 0x832400, potential_vm_operation = true), line 761 in "thread.cpp"
  [9] VMThread::execute(op = 0x80b318), line 537 in "vmThread.cpp"
  [10] InduceScavenge(Self = 0x832400, Whence = 0xfef791b7 "omAlloc"), line 1146 in "synchronizer.cpp"
  [11] ObjectSynchronizer::omAlloc(Self = 0x832400), line 1234 in "synchronizer.cpp"
  [12] ObjectSynchronizer::inflate(Self = 0x832400, object = 0xf8612270), line 1492 in "synchronizer.cpp"
  [13] ObjectSynchronizer::inflate_helper(obj = 0xf8612270), line 1426 in "synchronizer.cpp"
  [14] BasicLock::move_to(this = 0xe587f720, obj = 0xf8612270, dest = 0x71d4e0), line 62 in "basicLock.cpp"
  [15] vframeArrayElement::fill_in(this = 0x71d32c, vf = 0x5e8d20), line 69 in "vframeArray.cpp"
  [16] vframeArray::fill_in(this = 0x71d030, thread = 0x832400, frame_size = 28, chunk = 0x5e8cd8, reg_map = 0xe587f378), line 442 in "vframeArray.cpp"
  [17] vframeArray::allocate(thread = 0x832400, frame_size = 28, chunk = 0x5e8cd8, reg_map = 0xe587f378, sender = CLASS, caller = CLASS, self = CLASS), line 430 in "vframeArray.cpp"
  [18] Deoptimization::create_vframeArray(thread = 0x832400, fr = CLASS, reg_map = 0xe587f378, chunk = 0x5e8cd8), line 938 in "deoptimization.cpp"
  [19] Deoptimization::fetch_unroll_info_helper(thread = 0x832400), line 225 in "deoptimization.cpp"
  [20] Deoptimization::uncommon_trap(thread = 0x832400, trap_request = -34), line 1595 in "deoptimization.cpp"

Analysis of the above stack trace shows that VMThread::execute() is
calling Thread::check_for_valid_safepoint_state() on concurrent VM
ops (Mode == _concurrent || Mode == _async_safepoint). That may
not be a correct check to make.


To reproduce this failure mode run CallTimerGrid with following options:

/work/local/jdk/1.8.0/solaris-x64/bin/java \
    -server-fast \
    -ms256m -mx256m \
    -showversion \
    -XX:MonitorBound=1 \
    -XX:+DeoptimizeALot \
java version "1.8.0-ea"
Java(TM) SE Runtime Environment (build 1.8.0-ea-b71)
Java HotSpot(TM) Server VM (build 25.0-b14-fastdebug, mixed mode)

0.0 => NoSynchronization 1T 1/4S 8I 0Lm 100TO 0Pm 0Cm 0B 256C 2Xp 4096Ib : 1.1
1.0 => PublicSynchronization 1T 1/4S 8I 0Lm 100TO 0Pm 0Cm 0B 256C 2Xp 4096Ib : 2
2.0 => NestedSynchronization 1T 1/4S 8I 0Lm 100TO 0Pm 0Cm 0B 256C 2Xp 4096Ib : #
 To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc:  SuppressErrorAt=/thread.cpp:911
# A fatal error has been detected by the Java Runtime Environment:
#  Internal Error (/tmp/workspace/jdk8-2-build-solaris-i586-product/jdk8/hotspot
/src/share/vm/runtime/thread.cpp:911), pid=18789, tid=14
#  fatal error: Possible safepoint reached by thread that does not allow it
# JRE version: Java(TM) SE Runtime Environment (8.0-b71) (build 1.8.0-ea-b71)
# Java VM: Java HotSpot(TM) Server VM (25.0-b14-fastdebug mixed mode solaris-x86
# Core dump written. Default location: /work/shared/bugs/8004903/CallTimerGrid/c
ore or core.18789
# An error report file with more information is saved as:
# /work/shared/bugs/8004903/CallTimerGrid/hs_err_pid18789.log
Phoning home...
Using server:, port 4711
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
Current thread is 14
Dumping core ...
run-calltimer-8004903.sh[7]: 18789 Abort(coredump)

The "-server-fast" option is so that the fastdebug version of the Server VM
can be used with the product bits JDK.
The "-XX:MonitorBound=1" option is so that InduceScavenge() is call
as soon as possible.
The "-XX:+DeoptimizeALot" option gets a deoptimize happening which
happened in Karen's original bug sighting.

Here's a stack trace snippet from my hs_err_pid file:

---------------  T H R E A D  ---------------

Current thread (0x083a8000):  JavaThread "Thread-1" [_thread_in_Java, id=14, sta

Stack: [0xe43af000,0xe43ff000],  sp=0xe43fc5e0,  free space=309k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x1db4c39]  void VMError::report(outputStream*)+0x92d
V  [libjvm.so+0x1db5ebe]  void VMError::report_and_die()+0x57a
V  [libjvm.so+0xb5816f]  void report_fatal(const char*,int,const char*)+0x56f
V  [libjvm.so+0x1c74266]  void Thread::check_for_valid_safepoint_state(bool)+0x1
V  [libjvm.so+0x1de2ea6]  void VMThread::execute(VM_Operation*)+0x1f6
V  [libjvm.so+0x1bf2333]  ObjectMonitor*ObjectSynchronizer::inflate(Thread*,oop)
V  [libjvm.so+0x1bf1bcf]  ObjectMonitor*ObjectSynchronizer::inflate_helper(oop)+
V  [libjvm.so+0x5e978f]  void BasicLock::move_to(oop,BasicLock*)+0x53
V  [libjvm.so+0x1da3539]  void vframeArrayElement::fill_in(compiledVFrame*)+0x3a
V  [libjvm.so+0x1da5497]  vframeArray*vframeArray::allocate(JavaThread*,int,Grow
V  [libjvm.so+0xbc003c]  vframeArray*Deoptimization::create_vframeArray(JavaThre
V  [libjvm.so+0xbb28f4]  Deoptimization::UnrollBlock*Deoptimization::fetch_unrol
V  [libjvm.so+0xbb1533]  Deoptimization::UnrollBlock*Deoptimization::fetch_unrol
v  ~DeoptimizationBlob
J  java.lang.StringBuffer.append(Ljava/lang/String;)Ljava/lang/StringBuffer;
j  EDU.oswego.cs.dl.util.concurrent.misc.SynchronizationTimer$OneTest.run()V+720
j  EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run()V+15
j  java.lang.Thread.run()V+11
j  EDU.oswego.cs.dl.util.concurrent.misc.Threads$MyThread.run()V+8
v  ~StubRoutines::call_stub
V  [libjvm.so+0x1024427]  void JavaCalls::call_helper(JavaValue*,methodHandle*,J
V  [libjvm.so+0x18be40c]  void os::os_exception_wrapper(void(*)(JavaValue*,metho
V  [libjvm.so+0x102319b]  void JavaCalls::call(JavaValue*,methodHandle,JavaCallA
V  [libjvm.so+0x102112b]  void JavaCalls::call_virtual(JavaValue*,KlassHandle,Sy
V  [libjvm.so+0x102172a]  void JavaCalls::call_virtual(JavaValue*,Handle,KlassHa
V  [libjvm.so+0x129c0c8]  void thread_entry(JavaThread*,Thread*)+0xb0
V  [libjvm.so+0x1c7e539]  void JavaThread::thread_main_inner()+0x4e1
V  [libjvm.so+0x1c7dd51]  void JavaThread::run()+0x845
V  [libjvm.so+0x18ad485]  java_start+0x1d1
C  [libc.so.1+0xa71d0]  _thr_setup+0x4e
C  [libc.so.1+0xa74c0]  __moddi3+0x60

See https://jbs.oracle.com/bugs/browse/JDK-8004902 for the webrevs used during Code Review Round [012] for this fix.
Here's a snippet of an e-mail thread with some useful info:

Backing upto see what

    Thread::check_for_valid_safepoint_state(bool potential_vm_operation)

is intended to do:

- checks to see if there is a No_Safepoint_Verifier object on
  the stack: (!(_allow_safepoint_count == 0))
- checks to see if the calling JavaThread is in the right state:
  (is_Java_thread() && ((JavaThread*)this)->thread_state() != _thread_in_vm)
- if the caller is a JavaThread and potential_vm_operation is true,
  then we have to make sure the JavaThread isn't holding any locks
  that would deadlock the VMThread
- handle GCALotAtAllSafepoints

All of the above checks make sense for blocking VM operations where
Mode == _safepoint. The requesting JavaThread is planning to wait
around synchronously until the operation is done.

For non-blocking VM operations where (Mode == _concurrent or
Mode == _async_safepoint), let's examine/reason about each check:

- No_Safepoint_Verifier is meant to guard code where we're doing
  something that a safepoint can mess up (oop manipulation, etc).
  Submitting a non-blocking VM operation isn't going to goto a
  safepoint. The JavaThread submitting the non-blocking VM op
  is going to return and the non-blocking VM op is going to get
  handled sometime later.

- "calling JavaThread is in the right state (_thread_in_vm)" is
  meant to make sure that the JavaThread submitting the VM op
  is in a safepoint-able state. Submitting a non-blocking VM
  operation isn't going to goto a safepoint so we don't need to
  make sure that the JavaThread is in state _thread_in_vm.

- "JavaThread isn't holding locks that can block the VMThread"
  is meant to make sure that the blocked JavaThread isn't going
  to deadlock the VMThread. Submitting a non-blocking VM operation
  isn't going to block so we don't need to check locks.

- When we are on this branch in VMThread::execute(), we create a
  SkipGCALot helper object so GCALotAtAllSafepoints is already

OK. It took me a while, but I'm now convinced that non-blocking VM
operations don't need to make that check_for_valid_safepoint_state()
call. Of course, I still need to move the code from where Karen had
to address the other code review comments that I had, but I think
we're good to go on this one. 
Yes, restricting when the check should be made is the direction that Karen's experimental fix from a couple of years ago went. I'm doing the due diligence necessary to bring the fix forward to HSX-25.
_allow_safepoint_count is manipulated by the gcLocker. So it seems that the thread is marked as not safepoint "safe" but then triggers a safepointing VM operation. It may be that this should only be checked for non-concurrent safepoint ops.

Hardware and Software, Engineered to Work Together