JDK-8046339 : sun.rmi.transport.DGCAckHandler leaks memory
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.rmi
  • Affected Version: 6u33
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2014-06-09
  • Updated: 2016-08-26
  • Resolved: 2016-02-10
JDK 6         JDK 7         JDK 8         JDK 9
6u121 Fixed   7u101 Fixed   8u102 Fixed   9 b106 Fixed
Related Reports
Duplicate :  
Relates :  
Relates :  
Description
Instances of class sun.rmi.transport.DGCAckHandler accumulate and eventually
cause an OutOfMemoryError. A memory leak is suspected.
 

Comments
To elaborate on my comment above, what I think this bug needs is 1) an evaluation of the DGC system and an analysis of this fix, and 2) a regression test.

1) The purpose of the evaluation is to show that the fix is correct. Note that a regression test is not evidence of this; it's merely evidence that the change does what it's intended to do, not that the change is correct. This shouldn't be onerous, but it does require that someone spend some time getting up to speed with how DGC works.

2) The regression test can probably be written as a whitebox test that sets up some short timeouts, waits a bit, and then reflects on the idTable to ensure that the reference has been removed. It might be somewhat tricky to force the code into the timeout path, since the DGC acks are handled automatically at a fairly low level of the system. One way is to forcibly close the connection (again, using reflection to get at RMI internals). An alternative approach might be to fork a JVM in a subprocess and simply forcibly exit it at the right time.
04-08-2015
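
The "reflects on the idTable" step of such a whitebox test could look roughly like the sketch below. It is only an illustration: IdTableProbeSketch, staticMapSize and FakeHandler are hypothetical names, and FakeHandler is a stand-in used here so that the example runs without touching RMI internals.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

public class IdTableProbeSketch {

    /** Reads the size of a (possibly private) static Map field via reflection. */
    static int staticMapSize(Class<?> owner, String fieldName) {
        try {
            Field f = owner.getDeclaredField(fieldName);
            f.setAccessible(true);                 // the field is expected to be private
            Map<?, ?> table = (Map<?, ?>) f.get(null); // null receiver: static field
            synchronized (table) {
                return table.size();
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot read " + fieldName, e);
        }
    }

    /** Hypothetical stand-in for a class with a private static id table. */
    static final class FakeHandler {
        private static final Map<String, Object> idTable = new HashMap<>();

        static void put(String id) {
            synchronized (idTable) { idTable.put(id, new Object()); }
        }

        static void remove(String id) {
            synchronized (idTable) { idTable.remove(id); }
        }
    }

    public static void main(String[] args) {
        FakeHandler.put("uid-1");
        System.out.println("entries after put:    " + staticMapSize(FakeHandler.class, "idTable"));
        FakeHandler.remove("uid-1");
        System.out.println("entries after remove: " + staticMapSize(FakeHandler.class, "idTable"));
    }
}
```

In a real test the probe would be pointed at sun.rmi.transport.DGCAckHandler and its idTable field after the ack timeout has elapsed. On JDK 9 and later that presumably needs the package opened to the test, for example with --add-opens java.rmi/sun.rmi.transport=ALL-UNNAMED.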

The fix above needs to be verified and a regression test written.
09-01-2015

Here's a patch of the modified DGCAckHandler.java file that Darryl Mocek mentioned (see previous comment). The previous changeset was 00cd9dc3c2b5. The diagnosis and fix seem sensible but I haven't verified it. Also, this needs a regression test.

--- DGCAckHandler.00cd9dc3c2b5.java 2014-06-09 15:37:50.000000000 -0700
+++ DGCAckHandler.dmocek.2011-12-30.java 2014-06-09 15:30:44.000000000 -0700
@@ -118,6 +118,9 @@
         if (objList != null && task == null) {
             task = scheduler.schedule(new Runnable() {
                 public void run() {
+                    if (id != null) {
+                        idTable.remove(id);
+                    }
                     release();
                 }
             }, dgcAckTimeout, TimeUnit.MILLISECONDS);
@@ -140,6 +143,9 @@
      * release its references.
      **/
     public static void received(UID id) {
+        System.out.println("DGCAckHandler.received, about to print call stack.");
+        new Throwable().printStackTrace();
+        System.out.println("DGCAckHandler.received, after printing call stack.");
         DGCAckHandler h = idTable.remove(id);
         if (h != null) {
             h.release();
09-06-2014

Previously reported over two years ago by a CAP member against 6u29 (see JDK-7116204). We should probably fix this one. See that bug for further information in the Description and Comments. I've closed the other bug as a duplicate of this one.
09-06-2014

Comment in email from Darryl Mocek, 2011-12-30:

A DGCAckHandler (attached) gets created by a ConnectionOutputStream with a UID and placed into DGCAckHandler's static idTable HashMap. When the ConnectionOutputStream's done method is called, the DGCAckHandler's startTimer method is called. This starts a 5-minute timer for the DGCAck to be received and schedules a Runnable for releasing the reference to the DGCAckHandler if the DGCAck isn't received. In the handleMessages method of TCPTransport (also attached), if the transport operation is a DGCAck, then DGCAckHandler.received is called and the handler is removed from idTable. However, if DGCAckHandler.received isn't called and the timer expires, the task is cancelled but the reference in idTable is never removed. This causes idTable to continue to grow.

The fix, which is in the attached DGCAckHandler, is to add idTable.remove(id) to the Runnable created in startTimer. I have verified this works by commenting out the implementation of received, simulating that it is never called and causing the timer to expire.

What I need to do is to create a test which will cause the received method to never be called, causing the timer to expire and the reference to be removed via the Runnable. I haven't been able to get this to happen yet. A customer has reported this issue and I have requested more information on the configuration of their systems in the hope of duplicating their problem; however, I haven't received anything yet. Any suggestions on how to prevent the DGCAck from being received would help.
09-06-2014
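
The mechanism described in the email above can be modelled outside RMI. The following is a minimal sketch and not the JDK code: AckTimeoutLeakSketch, Handler, the removeOnTimeout flag and entriesLeftAfterTimeout are hypothetical names introduced for illustration, and the 10 ms timeout stands in for the 5-minute one. It shows that a timeout task which only releases its references leaves the entry in the static table, while a task that also removes the id does not.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class AckTimeoutLeakSketch {

    /** Simplified, hypothetical model of an ack handler; not the JDK class. */
    static final class Handler {
        // Static registry, analogous to the idTable described above.
        static final Map<String, Handler> idTable = new HashMap<>();

        private final String id;
        private final boolean removeOnTimeout; // true models the proposed fix
        private Object pinned = new Object();  // stands in for the held references

        Handler(String id, boolean removeOnTimeout) {
            this.id = id;
            this.removeOnTimeout = removeOnTimeout;
            synchronized (idTable) { idTable.put(id, this); }
        }

        /** Schedules the timeout task and returns it so the caller can wait for it. */
        ScheduledFuture<?> startTimer(ScheduledExecutorService scheduler, long timeoutMs) {
            return scheduler.schedule(() -> {
                if (removeOnTimeout) {
                    synchronized (idTable) { idTable.remove(id); }
                }
                release();
            }, timeoutMs, TimeUnit.MILLISECONDS);
        }

        synchronized void release() { pinned = null; }

        /** Ack path: remove the entry, then release the references. */
        static void received(String id) {
            Handler h;
            synchronized (idTable) { h = idTable.remove(id); }
            if (h != null) {
                h.release();
            }
        }
    }

    /** Registers one handler, lets its timer expire without an ack, returns the table size. */
    static int entriesLeftAfterTimeout(boolean removeOnTimeout) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            synchronized (Handler.idTable) { Handler.idTable.clear(); }
            Handler h = new Handler("uid-1", removeOnTimeout);
            h.startTimer(scheduler, 10).get(); // block until the timeout task has run
            synchronized (Handler.idTable) { return Handler.idTable.size(); }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("without fix: " + entriesLeftAfterTimeout(false)); // prints 1
        System.out.println("with fix:    " + entriesLeftAfterTimeout(true));  // prints 0
    }
}
```

Waiting on the returned ScheduledFuture rather than sleeping keeps the model deterministic; a test against the real class would have no such handle and would have to wait out the configured timeout instead.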

Possible duplicate of JDK-7116204. Not sure which issue to close out as a duplicate of which; maybe close JDK-7116204 as a duplicate of this one, since this one is a shadow bug. On the other hand, this bug is confidential and JDK-7116204 is open. This bug probably applies to all current JDK releases, including 6u, 7u, 8u, and 9.
09-06-2014