JDK-6475157 : RMIConnectorServer.stop: deadlock
  • Type: Bug
  • Component: core-svc
  • Sub-Component: javax.management
  • Affected Version: 5.0
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • OS: solaris_9
  • CPU: sparc
  • Submitted: 2006-09-26
  • Updated: 2010-12-03
  • Resolved: 2006-12-12
Fixed versions:
  • Other: 5.0u12 (Fixed)
  • JDK 6: 6u1 b01 (Fixed)
  • JDK 7: 7 (Fixed)
Description
The deadlock happens when you:
1) close the last client;
2) stop the server immediately afterwards.

Here is the client's report:
<###@###.###>
While running test cases against the server, I encountered a deadlock that prevented the tests from completing.  From the stack trace, it looks like it's in the RMI communication performed by JMX.  This needs to be looked into more closely because it has the potential to hang our tests (including anything that runs them, like daily builds), as well as obviously causing problems for the server itself.

Neil


2006-09-24 14:42:35
Full thread dump Java HotSpot(TM) Client VM (1.6.0-rc-b99 mixed mode, sharing):

...............

Found one Java-level deadlock:
=============================
"RMI Unreferenced-0":
  waiting to lock monitor 0x08099b88 (object 0x9ff0d648, a java.util.ArrayList),
  which is held by "main"
"main":
  waiting to lock monitor 0x08099bec (object 0x9ff162b8, a javax.management.remote.rmi.RMIConnectionImpl),
  which is held by "RMI Unreferenced-0"

Java stack information for the threads listed above:
===================================================
"RMI Unreferenced-0":
	at javax.management.remote.rmi.RMIServerImpl.clientClosed(RMIServerImpl.java:324)
	- waiting to lock <0x9ff0d648> (a java.util.ArrayList)
	at javax.management.remote.rmi.RMIConnectionImpl.close(RMIConnectionImpl.java:182)
	- locked <0x9ff162b8> (a javax.management.remote.rmi.RMIConnectionImpl)
	at javax.management.remote.rmi.RMIConnectionImpl.unreferenced(RMIConnectionImpl.java:190)
	at sun.rmi.transport.Target$1.run(Target.java:310)
	at java.lang.Thread.run(Thread.java:619)
"main":
	at javax.management.remote.rmi.RMIConnectionImpl.close(RMIConnectionImpl.java:162)
	- waiting to lock <0x9ff162b8> (a javax.management.remote.rmi.RMIConnectionImpl)
	at javax.management.remote.rmi.RMIServerImpl.close(RMIServerImpl.java:411)
	- locked <0x9ff0d648> (a java.util.ArrayList)
	- locked <0x9fef8cc8> (a org.opends.server.protocols.jmx.OpendsRMIJRMPServerImpl)
	at javax.management.remote.rmi.RMIConnectorServer.stop(RMIConnectorServer.java:528)
	at org.opends.server.protocols.jmx.RmiConnector.finalizeConnectionHandler(RmiConnector.java:433)
	at org.opends.server.protocols.jmx.JmxConnectionHandler.finalizeConnectionHandler(JmxConnectionHandler.java:508)
	at org.opends.server.protocols.jmx.JmxConnectionHandler.applyNewConfiguration(JmxConnectionHandler.java:798)
	at org.opends.server.protocols.jmx.JmxConnectionHandler.applyNewConfiguration(JmxConnectionHandler.java:771)
	at org.opends.server.protocols.jmx.JmxConnectTest.configureJmx(JmxConnectTest.java:375)
	at org.opends.server.protocols.jmx.JmxConnectTest.sslConnect(JmxConnectTest.java:336)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.testng.internal.MethodHelper.invokeMethod(MethodHelper.java:552)
	at org.testng.internal.Invoker.invokeMethod(Invoker.java:411)
	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:785)
	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:114)
	at org.testng.TestRunner.privateRun(TestRunner.java:695)
	at org.testng.TestRunner.run(TestRunner.java:574)
	at org.testng.SuiteRunner.privateRun(SuiteRunner.java:241)
	at org.testng.SuiteRunner.run(SuiteRunner.java:145)
	at org.testng.TestNG.createAndRunSuiteRunners(TestNG.java:901)
	at org.testng.TestNG.runSuitesLocally(TestNG.java:863)
	at org.testng.TestNG.run(TestNG.java:613)
	at org.testng.TestNG.privateMain(TestNG.java:1001)
	at org.testng.TestNG.main(TestNG.java:938)

Found 1 deadlock.
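
The lock-order inversion in the dump above can be modelled in miniature. The following is only a sketch under stated assumptions: the class LockInversionDemo, its two lock objects and the thread names are hypothetical stand-ins, not the JDK's RMIServerImpl or RMIConnectionImpl. One thread takes the "connection" monitor and then wants the "clientList" monitor, the other takes them in the opposite order, and a latch forces both to hold their first lock before asking for the second. The cycle is then looked up with ThreadMXBean.findMonitorDeadlockedThreads(), which is the kind of check behind the "Found one Java-level deadlock" report.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical model of the inversion; not the JDK's RMI classes.
class LockInversionDemo {

    /** Forces the inversion and returns the IDs of the deadlocked threads, or null on timeout. */
    static long[] provokeAndDetect() throws InterruptedException {
        // Stand-in for the clientList monitor (the java.util.ArrayList in the dump).
        final List<Object> clientList = new ArrayList<>();
        // Stand-in for the RMIConnectionImpl monitor.
        final Object connection = new Object();
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        Thread unreferenced = new Thread(() -> {
            synchronized (connection) {            // role of RMIConnectionImpl.close
                arrive(bothHoldFirstLock);
                synchronized (clientList) { }      // role of RMIServerImpl.clientClosed
            }
        }, "demo-unreferenced");

        Thread stopper = new Thread(() -> {
            synchronized (clientList) {            // role of RMIServerImpl.close
                arrive(bothHoldFirstLock);
                synchronized (connection) { }      // role of RMIConnectionImpl.close
            }
        }, "demo-stop");

        // Daemon threads, so the deadlocked pair does not keep the JVM alive.
        unreferenced.setDaemon(true);
        stopper.setDaemon(true);
        unreferenced.start();
        stopper.start();

        // Poll for a monitor deadlock for up to about five seconds.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            long[] ids = threads.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids;
            }
            Thread.sleep(50);
        }
        return null;
    }

    // Signals that the caller holds its first lock, then waits for the other thread to do the same.
    private static void arrive(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        long[] ids = provokeAndDetect();
        System.out.println(ids == null ? "no deadlock detected"
                                       : ids.length + " thread(s) reported as deadlocked");
    }
}
```

In the real failure there is no latch; the same interleaving arises by timing, when the unreferenced() callback and RMIConnectorServer.stop run at the same moment.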

Comments
EVALUATION The deadlock can be reproduced in another way that leads to a valid regression test. The idea is to subclass RMIJRMPServerImpl in order to be able to override its clientClosed method. Within that method, it can create another thread that will call connectorServer.stop(), and wait for that thread to complete or block before calling super.clientClosed. The test is triggered by creating and closing a connection. If the bug is not fixed, this will lead to the following deadlock:

Initial thread:
	RMIConnectionImpl.close - locks RMIConnectionImpl
	-> RMIJRMPServerImpl.clientClosed (called from overriding method) - tries to lock clientList
Created thread:
	RMIConnectorServer.stop
	-> RMIJRMPServerImpl.close - locks clientList
	-> RMIConnectionImpl.close - tries to lock RMIConnectionImpl

With the fix (move the call to clientClosed out of the synchronized block), the test passes.
27-09-2006
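
The step "wait for that thread to complete or block" could be implemented along the following lines. This is a minimal sketch, not the code of the actual regression test: the class BlockWaiter and the method awaitDoneOrBlocked are hypothetical names. It polls Thread.getState() until the thread is TERMINATED (stop() returned, so the bug is fixed) or BLOCKED (stop() is stuck on a monitor, as in the deadlock), with a timeout as a safety net.

```java
// Hypothetical helper; the names are illustrative and not taken from the JDK test suite.
class BlockWaiter {

    /**
     * Polls until the thread has terminated or is blocked on a monitor,
     * or until the timeout elapses. Returns the last state observed.
     * The thread must already have been started.
     */
    static Thread.State awaitDoneOrBlocked(Thread t, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            Thread.State state = t.getState();
            boolean timedOut = System.nanoTime() - deadline >= 0;
            if (state == Thread.State.TERMINATED
                    || state == Thread.State.BLOCKED
                    || timedOut) {
                return state;
            }
            Thread.sleep(10);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final Object monitor = new Object();
        Thread worker = new Thread(() -> {
            synchronized (monitor) { }   // cannot proceed while main holds the monitor
        }, "demo-worker");
        worker.setDaemon(true);

        Thread.State whileHeld;
        synchronized (monitor) {
            worker.start();
            whileHeld = awaitDoneOrBlocked(worker, 5000);
        }
        // Once the monitor is released the worker can finish.
        Thread.State afterRelease = awaitDoneOrBlocked(worker, 5000);
        System.out.println("while monitor held: " + whileHeld);
        System.out.println("after release: " + afterRelease);
    }
}
```

A BLOCKED state is only a heuristic, since a thread can be blocked briefly on an unrelated monitor. For a regression test that is acceptable: the overriding clientClosed proceeds to super.clientClosed either way, and the test hangs only if the lock cycle really exists.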

EVALUATION The deadlock can occur if an individual RMI connection is closed at the same time as the RMI connector server is closed. An individual connection can be closed either explicitly by the client, or implicitly because the client has gone away and the Distributed Garbage Collection discovers that. The latter seems less likely but is in fact what we see in the stack trace.

To reproduce this failure, we could subclass RMIConnectionImpl and override its close() method like this:

    @Override
    public synchronized void close() throws IOException {
        if (!alreadyCalled) {
            alreadyCalled = true;
            Thread t = new Thread() {
                public void run() {
                    connectorServer.stop();
                }
            };
            t.start();
            Thread.sleep(1000);
        }
        super.close();
    }

(This will need some try/catches to compile.) Then the test can open a single connection and close it again (with JMXConnector.close()), which should produce the deadlock. The deadlock will look like this:

Original thread:
	RMIConnectionImplSubclass.close - locks RMIConnectionImpl - connectorServer.stop thread created here
	-> RMIConnectionImpl.close (via RMIServerImpl.clientClosed) - tries to lock clientList
Created thread:
	RMIConnectorServer.stop
	-> RMIServerImpl.close - locks clientList
	-> RMIConnectionImpl.close - tries to lock RMIConnectionImpl

I think the simplest fix is to change RMIConnectionImpl.close so that it is no longer synchronized. Instead, a synchronized(this) block should surround the body of the method, but the final call to rmiServer.clientClosed should be outside the block.

Unfortunately, the test subclass above will still deadlock even with this fix, because its synchronized close() override artificially extends the synchronization of RMIConnectionImpl.close to cover the entire method again. It may be possible to reproduce the deadlock in another way that would not have this problem (so that we can have a regression test).
26-09-2006
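
The shape of the proposed fix can be illustrated with a small model. This is a sketch under stated assumptions, not the JDK source: DemoServer and DemoConnection are hypothetical stand-ins for RMIServerImpl and RMIConnectionImpl, and the `closed` flag is an assumed idempotence guard. The point is that DemoConnection.close() does its own work inside synchronized(this) but calls server.clientClosed after releasing the connection monitor, so no thread ever waits for clientList while holding a connection.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical driver: closes a connection and stops the server at the same time.
class FixedCloseDemo {

    /** Returns true if both threads finished and the server ended up with no clients. */
    static boolean closeConcurrently() throws InterruptedException {
        DemoServer server = new DemoServer();
        DemoConnection conn = server.newClient();
        Thread closer = new Thread(conn::close, "demo-close");
        Thread stopper = new Thread(server::close, "demo-stop");
        closer.setDaemon(true);
        stopper.setDaemon(true);
        closer.start();
        stopper.start();
        closer.join(5000);
        stopper.join(5000);
        return !closer.isAlive() && !stopper.isAlive()
                && conn.isClosed() && server.clientCount() == 0;
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 1000; i++) {
            if (!closeConcurrently()) {
                System.out.println("run " + i + " did not complete");
                return;
            }
        }
        System.out.println("all runs completed");
    }
}

// Stand-in for RMIServerImpl: owns the client list and its monitor.
class DemoServer {
    private final List<DemoConnection> clientList = new ArrayList<>();

    DemoConnection newClient() {
        DemoConnection c = new DemoConnection(this);
        synchronized (clientList) { clientList.add(c); }
        return c;
    }

    void clientClosed(DemoConnection c) {
        synchronized (clientList) { clientList.remove(c); }
    }

    // Holds clientList while closing each connection, as in the thread dump.
    void close() {
        synchronized (clientList) {
            while (!clientList.isEmpty()) {
                clientList.remove(0).close();
            }
        }
    }

    int clientCount() {
        synchronized (clientList) { return clientList.size(); }
    }
}

// Stand-in for RMIConnectionImpl with the fix applied.
class DemoConnection {
    private final DemoServer server;
    private boolean closed;

    DemoConnection(DemoServer server) { this.server = server; }

    void close() {
        synchronized (this) {
            if (closed) {
                return;
            }
            closed = true;
            // Per-connection cleanup would go here, still under the connection monitor.
        }
        // Outside the synchronized block: clientList is requested with no connection lock held.
        server.clientClosed(this);
    }

    synchronized boolean isClosed() { return closed; }
}
```

Declaring DemoConnection.close() as synchronized again would restore the cycle, which is why the synchronized override in the test subclass above defeats the fix.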