Bug ID: JDK-6610420 Debug VM crashes during monitor lock rank checking

Details
Type: Bug
Submit Date: 2007-09-27
Status: Closed
Updated Date: 2011-03-31
Project Name: JDK
Resolved Date: 2011-03-08
Component: hotspot
OS: generic
Sub-Component: runtime
CPU: generic
Priority: P4
Resolution: Fixed
Affected Versions: 7
Fixed Versions: hs12 (b02)

Description
Tom Rodriguez encountered a VM crash while running SPECjbb with the "-XX:+PrintOptoAssembly" flag. The crash output is shown below:

#
# An unexpected error has been detected by Java Runtime Environment:
#
#  Internal Error (/net/smite/export/ws/box/src/share/vm/runtime/mutex.cpp:1290), pid=20019, tid=9
#  Error: acquiring lock UNKNOWN/4 out of order with lock UNKNOWN/0 -- possible deadlock
#
# Java VM: Java HotSpot(TM) Server VM (11.0-b07-1.7.0-never-box-jvmg2-jvmg mixed mode solaris-sparc)
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
#

The first odd thing is that the locks don't know their names. Looking at the core, none of the locks have their _name field set to anything but UNKNOWN. The name is passed into the various constructors and then simply dropped. I can't see from the SCCS history how this changed, but I thought we used to get lock names.

As for the crash, it turns out that UNKNOWN/4 is the SerializePage_lock and UNKNOWN/0 is the tty_lock.

Here's the full call stack:

  [1] __lwp_kill(0x0, 0x6, 0x0, 0xff33c000, 0x0, 0x0), at 0xff320218
  [2] raise(0x6, 0x0, 0x9f1fd090, 0x7efefeff, 0x81010100, 0xff00), at 0xff2d0c80
  [3] abort(0x9f1fd138, 0xfeef4e99, 0xfeef4e96, 0x9, 0x23, 0xfefec24f), at 0xff2b6e98
  [4] os::abort(dump_core = true), line 1767 in "os_solaris.cpp"
  [5] VMError::report_and_die(this = 0x9f1fd318), line 825 in "vmError.cpp"
  [6] report_fatal(file_name = 0xfeed2c08 "/net/smite/export/ws/box/src/share/vm/runtime/mutex.cpp", line_no = 1290, message = 0x9f1fd3b4 "acquiring lock UNKNOWN/4 out of order with lock UNKNOWN/0 -- possible deadlock"), line 182 in "debug.cpp"
  [7] report_fatal_vararg(file_name = 0xfeed2c08 "/net/smite/export/ws/box/src/share/vm/runtime/mutex.cpp", line_no = 1290, format = 0xfeed2c40 "acquiring lock %s/%d out of order with lock %s/%d -- possible deadlock", ...), line 191 in "debug.cpp"
=>[8] Monitor::set_owner_implementation(this = 0x45650, new_owner = 0x152c00), line 1290 in "mutex.cpp"
  [9] Monitor::set_owner(this = 0x45650, owner = 0x152c00), line 231 in "mutex.hpp"
  [10] Monitor::lock_without_safepoint_check(this = 0x45650, Self = 0x152c00), line 932 in "mutex.cpp"
  [11] Monitor::lock_without_safepoint_check(this = 0x45650), line 936 in "mutex.cpp"
  [12] os::block_on_serialize_page_trap(), line 985 in "os.cpp"
  [13] JVM_handle_solaris_signal(sig = 11, info = 0x9f1fdd20, ucVoid = 0x9f1fda68, abort_if_unrecognized = 1), line 499 in "os_solaris_sparc.cpp"
  [14] signalHandler(sig = 11, info = 0x9f1fdd20, ucVoid = 0x9f1fda68), line 3942 in "os_solaris.cpp"
  [15] __sighndlr(0xb, 0x9f1fdd20, 0x9f1fda68, 0xfe99cbf8, 0x0, 0x0), at 0xff3956c8
  ---- called from signal handler with signal 11 (SIGSEGV) ------
  [16] os::write_memory_serialize_page(thread = 0x152c00), line 242 in "os.hpp"
  [17] InterfaceSupport::serialize_memory(thread = 0x152c00), line 31 in "interfaceSupport_solaris.hpp"
  [18] ThreadStateTransition::transition_and_fence(thread = 0x152c00, from = _thread_in_vm, to = _thread_in_native), line 151 in "interfaceSupport.hpp"
  [19] ThreadStateTransition::trans_and_fence(this = 0x9f1fe018, from = _thread_in_vm, to = _thread_in_native), line 204 in "interfaceSupport.hpp"
  [20] ThreadInVMfromNative::~ThreadInVMfromNative(this = 0x9f1fe018), line 253 in "interfaceSupport.hpp"
  [21] ciObject::print_oop(this = 0x4f5540), line 216 in "ciObject.cpp"
  [22] Compile::Fill_buffer(this = 0x9f1ff360), line 1366 in "output.cpp"
  [23] Compile::Output(this = 0x9f1ff360), line 139 in "output.cpp"
  [24] Compile::Code_Gen(this = 0x9f1ff360), line 1636 in "compile.cpp"
  [25] Compile::Compile(this = 0x9f1ff360, ci_env = 0x9f1ff910, compiler = 0x140380, target = 0x4f5540, osr_bci = -1, subsume_loads = true), line 608 in "compile.cpp"
  [26] C2Compiler::compile_method(this = 0x140380, env = 0x9f1ff910, target = 0x4f5540, entry_bci = -1), line 109 in "c2compiler.cpp"
  [27] CompileBroker::invoke_compiler_on_method(task = 0x454940), line 1540 in "compileBroker.cpp"
  [28] CompileBroker::compiler_thread_loop(), line 1392 in "compileBroker.cpp"
  [29] compiler_thread_entry(thread = 0x152c00, __the_thread__ = 0x152c00), line 2713 in "thread.cpp"
  [30] JavaThread::thread_main_inner(this = 0x152c00), line 1381 in "thread.cpp"
  [31] JavaThread::run(this = 0x152c00), line 1365 in "thread.cpp"
  [32] java_start(thread_addr = 0x152c00), line 1013 in "os_solaris.cpp"

The crash occurs when a Java thread that already owns tty_lock tries to acquire SerializePage_lock during a serialize page trap. The monitor rank check then fails because SerializePage_lock has a higher rank than tty_lock.
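
The failing check can be illustrated with a minimal sketch. It assumes the ordering rule implied by the error message: a lock may be acquired only if its rank is strictly lower than the rank of every lock the thread already holds, and a designated "native" rank skips the check. RankedLock, can_acquire, and kNativeRank are hypothetical names used for illustration, not HotSpot code; the ranks 0 and 4 are taken from the error message.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a ranked VM lock (not the HotSpot Monitor class).
struct RankedLock {
    std::string name;
    int rank;
};

// Hypothetical sentinel rank that opts a lock out of order checking.
const int kNativeRank = 1000;

// Returns true if `next` may be acquired while the locks in `held` are owned.
// Assumed rule: every held lock must have a strictly higher rank than `next`.
bool can_acquire(const std::vector<RankedLock>& held, const RankedLock& next) {
    assert(next.rank >= 0 && "ranks are non-negative in this sketch");
    if (next.rank == kNativeRank) {
        return true;  // "native" locks skip the rank check
    }
    for (const RankedLock& h : held) {
        if (h.rank <= next.rank) {
            return false;  // out of order: possible deadlock
        }
    }
    return true;
}
```

With tty_lock (rank 0) held, can_acquire returns false for SerializePage_lock (rank 4), which corresponds to the fatal error in this report. With the roles reversed it returns true.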

                                    

Comments
EVALUATION

One possible fix is to special-case the rank checking code, but I am afraid that would slow down the rank checking and further complicate the special conditions we already have. Another option is to define the rank of SerializePage_lock as "native" priority, which would disable the rank checking, but that seems odd. Probably a cleaner fix is to use raw lock methods such as Thread::muxAcquire/Release for SerializePage_lock.
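
As a rough illustration of the last option, a raw lock is one that does no rank bookkeeping at all, so acquiring it can never trip the ordering check. The sketch below uses a standard atomic flag. RawLock is a hypothetical class and does not reproduce Thread::muxAcquire/Release.

```cpp
#include <atomic>
#include <thread>

// Hypothetical raw lock: no name, no rank, no owner tracking.
class RawLock {
public:
    // Spins until the lock is obtained, yielding between attempts.
    void acquire() {
        while (_locked.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }

    // Single non-blocking attempt; returns true if the lock was obtained.
    bool try_acquire() {
        return !_locked.exchange(true, std::memory_order_acquire);
    }

    void release() {
        _locked.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> _locked{false};
};
```

Because such a lock carries no rank, it sits outside the ordering discipline entirely, so any deadlock avoidance for it has to be argued by hand.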

Another thing we should fix is to initialize the lock name field in both the Monitor and Mutex constructors.
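
A minimal sketch of that kind of initialization, assuming a fixed-size name buffer and an "UNKNOWN" fallback when no name is supplied. NamedLock and kNameLen are hypothetical; the real Monitor/Mutex constructors may be structured differently.

```cpp
#include <cstring>

// Hypothetical lock type whose constructor records the name it is given.
class NamedLock {
public:
    static const int kNameLen = 64;  // assumed buffer size

    NamedLock(int rank, const char* name) : _rank(rank) {
        // Fall back to "UNKNOWN" only when the caller passes no name.
        const char* src = (name != nullptr) ? name : "UNKNOWN";
        std::strncpy(_name, src, kNameLen - 1);
        _name[kNameLen - 1] = '\0';  // strncpy does not always terminate
    }

    const char* name() const { return _name; }
    int rank() const { return _rank; }

private:
    char _name[kNameLen];
    int _rank;
};
```

With the name stored, an error such as the one in this report could print the lock names (for example SerializePage_lock/4 and tty_lock/0) instead of UNKNOWN.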
                                     
2007-09-27


