JDK-4403367 : (perf) generational garbage collector non-functional under RMI
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.rmi
  • Affected Version: 1.3.0
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • OS: generic
  • CPU: generic
  • Submitted: 2001-01-10
  • Updated: 2007-05-11
  • Resolved: 2007-05-11
Related Reports
Duplicate :  
Relates :  
Description
Name: boT120536			Date: 01/09/2001


[cporter@src]$ java -server -version
java version "1.3.0_01"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0_01)
Java HotSpot(TM) Server VM (build 1.3.0_01, mixed mode)



1) Run an RMI server application that maintains MANY objects, a fullGC is run
every minute, thereby neutralizing all advantges of the generational garbage
collector. Run the following example and see full GC hits of 4-6 seconds every
minute, even though NO objects are being created or destroyed.  In our much
larger real-world code we get GC hits of up to TWENTY SECONDS PER MINUTE, with
NO object creation.

In sun/rmi/transport/ObjectTable.java , there is a call to GC.requestLatency
(gcInterval) setting a default period for a forced GC to 60 seconds. If
successful operation of RMI depends on this foced GC, which makes the
generational garbage collector worthless, RMI is itself useless for large
servers.



SAMPLE CODE
break up the following text into files
 ####################################### is a file delimeter in this message
add . to you classpath
run the script do.sh
after aprox ten minutes on a 512 MB Linux box the program will emit a line
saying 1000000 objects added to hashtable in <n> seconds.  After that there is
no activity in the RMI server. However, every minute a full GC will run for 4
to 6 seconds, collecting NO garbage

####################################################
# do.sh
#compile
javac TestHashServer.java ITestHash.java
rmic TestHashServer

#register the server ( java.policy below )
rmiregistry -J-Xmx100M -J-Djava.security.policy=./java.policy &

#run
java -server -verbose:gc -XX:NewSize=1M -XX:MaxNewSize=32M -Xms400M -Xmx500M
TestHashServer &

#################################################################
// ITestHash.java
import java.rmi.Remote;
import java.rmi.RemoteException;


/**
 */

public interface ITestHash
    extends Remote {

    public String getString(String key)
 throws RemoteException ;

    public void addString(String key,
     String val)
 throws RemoteException;

    public void removeString(String key)
 throws RemoteException;
}


#######################################################################
//TestHashServer.java

import java.rmi.*;
import java.rmi.server.*;
import java.rmi.registry.*;
import java.util.*;

public class TestHashServer
    extends UnicastRemoteObject
    implements ITestHash {

    /**
     * the contained data structure that the RmiServer provides
     * controlled access.
     */
    private Hashtable hash;


    /**
     * the ctor does nothing except pass in the service name
     * (i.e. the name by which it is known to the rmiregistry)
     * to the parent class.
     */
    public TestHashServer(String serviceName)
 throws RemoteException {

 hash = new Hashtable();
    }

    /**
     * a wrapper around the hashtable's method.
     */
    public String getString(String key)
 throws RemoteException {

 return (String )hash.get(key);
    }

    /**
     * another wrapper, this time around put()
     */
    public void addString(String key,
     String val)
 throws RemoteException {

 hash.put(key,val);
    }

    /**
     * another wrapper, this time around remove()
     */
    public void removeString(String key) {
 hash.remove(key);
    }

    /**
     * public gettor for constant service name.
     * this is the JavaPope(tm) approved method
     * of getting constants.
     */
    public static String getTestServiceName() {
 return "testHashService";
    }

 public void register()
 throws RemoteException, AlreadyBoundException {

  String serviceName = getTestServiceName();
     // register with registry
     LocateRegistry.getRegistry().bind(serviceName, this);
     System.out.println("RmiServer "
                          + serviceName
                          + " registered");
 }

 /**
     */
    public static void main(String[] args)
 throws RemoteException, NotBoundException,
java.net.MalformedURLException {

 try{
  // crate the server object
  TestHashServer server = new TestHashServer
(getTestServiceName());
  // register with RMI
  server.register();


  ITestHash server2 = (ITestHash )Naming.lookup
(getTestServiceName());
  int count = 1000000;
  int stringSize = 50;
  long startTime = System.currentTimeMillis();
  for( int i = 0; i < count; i++){

   StringBuffer b = new StringBuffer(stringSize);
   for( int  j = 0; j < stringSize; j++){
    b.append( j);
   }
   String s = b.toString();
   server2.addString( new Integer(i).toString(), s );
  }
  long time = System.currentTimeMillis() - startTime;
  System.out.println( "hash filled with "+ count + " int
arrays in " + time + "ms");
  Object semaphore = new Object();
  while( true )
  {
   synchronized( semaphore ){
    semaphore.wait( 60 * 1000 );
   }
   System.out.print(" . ");
  }
 }
 catch( Exception e){
  e.printStackTrace();
 }
 }
}

###########################################################################
//java.policy
grant {
    permission java.security.AllPermission;
};
#########################################################################
(Review ID: 114860) 
======================================================================

Comments
EVALUATION While we still hope for something like 4665536 to provide a more ultimate conceptual solution to this issue, the change of RFE 6200091 in JDK 6 has largely taken care of the practical issue, of full GCs every minute when using RMI "out of the box", so I am closing this CR as dup of 6200091.
11-05-2007

EVALUATION We hope that the problems of this bug report will be largely amelioriated with the fix for 4665536, which will involve a re-implementation of the internal sun.misc.GC API that will no longer make full System.gc() invocations in order to satisfy the sun.misc.GC API's behavior of bounding GC latency. In the meantime, the workaround of setting the two system properties sun.rmi.dgc.client.gcInterval sun.rmi.dgc.server.gcInterval to higher values (than the default 60000 milliseconds/one minute) as appropriate for the application still applies. For most applications, sun.rmi.dgc.server.gcInterval can be reasonably set to Long.MAX_VALUE-1 (0x7FFFFFFFFFFFFFFE) unless the application cares about timely exiting of the server virtual machine (from no non-daemon threads running) when all exported remote objects become garbage collected. The sun.rmi.dgc.client.gcInterval setting can affect the timeliness of distributed garbage collection, so the higher it is set, the higher maximum latency can be expected for a server to know that a remote client has dropped references to the server's exported remote objects. ###@###.### 2002-04-29 *** PLEASE NOTE: *** The original recommendations above and in the "Work Around" section to set the "sun.rmi.dgc.server.gcInterval" system property to (exactly) 0x7FFFFFFFFFFFFFFF (i.e. Long.MAX_VALUE) can cause an IllegalStateException to be thrown from the sun.misc.GC$LatencyRequest.cancel method in a background thread of the RMI implementation's client-side DGC implementation, which will kill that thread and cause an exception trace like the following to be output: java.lang.IllegalStateException: Request already cancelled at java.lang.Throwable.fillInStackTrace(Native Method) at java.lang.Throwable.fillInStackTrace(Compiled Code) at java.lang.Throwable.<init>(Compiled Code) at java.lang.Exception.<init>(Compiled Code) at java.lang.RuntimeException.<init>(RuntimeException.java:40) at java.lang.IllegalStateException.<init>(IllegalStateException.java:38) at sun.misc.GC$LatencyRequest.cancel(GC.java:213) at sun.rmi.transport.DGCClient$EndpointEntry.removeRefEntry(DGCClient.java:277) at sun.rmi.transport.DGCClient$EndpointEntry.processPhantomRefs(Compiled Code) at sun.rmi.transport.DGCClient$EndpointEntry.access$6(Compiled Code) at sun.rmi.transport.DGCClient$EndpointEntry$RenewCleanThread.run(Compiled Code) at java.lang.Thread.run(Compiled Code) In some particular circumstances, this can prevent remote objects from being garbage collected when they should be. This problem occurs because sun.misc.GC$LatencyRequest treats Long.MAX_VALUE as a magic value internally. To avoid this problem, choose a different value that is also effectively infinite for the "sun.rmi.dgc.server.gcInterval" system property, such as 0x7FFFFFFFFFFFFFFE. ###@###.### 2002-08-01 The implementation of RFE 5025281 will provide another way to amelioriate the problems of this bug report: it provides another VM option to specify that explicit full GC requests should execute concurrently, rather than by stopping the world. [As described above, the ultimate solution to this bug still depends on something like RFE 4665536, which is to re-implement the internal sun.misc.GC API (used by RMI's DGC implementation) in a more sophisticated fashion than its current, blunt approach of invoking System.gc() when necessary.] ###@###.### 2004-09-02 See also RFE 6200091, which is to increase the default values of the sun.rmi.dgc.*.gcInterval system properties, which should help to amelioriate the problems of this bug report in the "out of the box" case, when no special configuration is done (such as setting the properties to higher values). ###@###.### 2004-12-04 00:27:38 GMT
04-12-2004

WORK AROUND The original workaround was incomplete. Here is a more complete description: ====================================================================== Problem: In order to operate in timely fashion, RMI Distributed Garbage Collection attempts to run local asynchronous garbage collection periodically. As a result, full GC cycles run periodically regardless of any attempts the generational garbage collector makes to avoid them. Without the aid of this asynchronous local garbage collection activity, remote references that are unreachable in a client's VM may go undetected indefinitely, possibly preventing the corresponding remote objects from being garbage collected. In addition (although less significantly), unreachable remote objects may not be collected in a timely fashion, preventing a virtual machine from exiting when it otherwise would do so. Failure to collect remote objects wastes resources (which might lead to the eventual throwing of OutOfMemoryException on the server). The desire to maintain up-to-date awareness of all unreachable objects in a given VM conflicts with the goals of a generational garbage collector. Work around: To resolve this conflict, use RMI Sun specific system properties to decrease the frequency with which RMI requests client and server side async GC. Setting the following system property to Long.MAX_VALUE can be used to disable server side async GC for VMs that host long running object implementations. This setting will prevent RMI initiated GC from affecting performance of server applications: -Dsun.rmi.dgc.server.gcInterval=0x7ffffffffffffffe The following client-oriented GC property should also be set with the following trade-off in mind: the client GC interval should occur at the highest frequency that does not adversely affect application performance. However consideration must be given to the importance of timely server notification of the presence of unreferenced remote stubs in the client VM. The following setting uses a full GC period of one hour: -Dsun.rmi.dgc.client.gcInterval=3600000 Please note that for VMs which contain remote object implementations and remote references, both of the above system property settings should be used. Otherwise the gc request interval will still be the default client value (1 minute). ====================================================================== Name: boT120536 Date: 01/09/2001 WORKARAOUND: set RMI gcInterval very large with the following java environment variable: sun.rmi.dgc.server.gcInterval=2000000000 I have tested this workaround only with servers which, like my example, export only a single remote object, but create and destroy millions of object on the server itself. I do not know how a server exporting thousands of remote objects will behave without forced garbage collection. ====================================================================== *** PLEASE NOTE: *** The original recommendations above and in the "Evaluation" section to set the "sun.rmi.dgc.server.gcInterval" system property to (exactly) 0x7FFFFFFFFFFFFFFF (i.e. Long.MAX_VALUE) can cause an IllegalStateException to be thrown from the sun.misc.GC$LatencyRequest.cancel method in a background thread of the RMI implementation's client-side DGC implementation, which will kill that thread and cause an exception trace like the following to be output: java.lang.IllegalStateException: Request already cancelled at java.lang.Throwable.fillInStackTrace(Native Method) at java.lang.Throwable.fillInStackTrace(Compiled Code) at java.lang.Throwable.<init>(Compiled Code) at java.lang.Exception.<init>(Compiled Code) at java.lang.RuntimeException.<init>(RuntimeException.java:40) at java.lang.IllegalStateException.<init>(IllegalStateException.java:38) at sun.misc.GC$LatencyRequest.cancel(GC.java:213) at sun.rmi.transport.DGCClient$EndpointEntry.removeRefEntry(DGCClient.java:277) at sun.rmi.transport.DGCClient$EndpointEntry.processPhantomRefs(Compiled Code) at sun.rmi.transport.DGCClient$EndpointEntry.access$6(Compiled Code) at sun.rmi.transport.DGCClient$EndpointEntry$RenewCleanThread.run(Compiled Code) at java.lang.Thread.run(Compiled Code) In some particular circumstances, this can prevent remote objects from being garbage collected when they should be. This problem occurs because sun.misc.GC$LatencyRequest treats Long.MAX_VALUE as a magic value internally. To avoid this problem, choose a different value that is also effectively infinite for the "sun.rmi.dgc.server.gcInterval" system property, such as 0x7FFFFFFFFFFFFFFE. ###@###.### 2002-08-01
01-08-2002