JDK-4447092 : Relieve the server-side bottleneck in TCPTransport.run.
  • Type: Enhancement
  • Component: core-libs
  • Sub-Component: java.rmi
  • Affected Version: 1.3.0
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • OS: generic
  • CPU: generic
  • Submitted: 2001-04-18
  • Updated: 2001-04-18
  • Resolved: 2001-04-18
Related Reports
Duplicate : JDK-4420157
Relates :  
Description

Name: krC82822			Date: 04/18/2001


java version "1.3.0"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0-C)
Java HotSpot(TM) Client VM (build 1.3.0-C, mixed mode)

According to RMI "folklore", the RMI RTS can handle a maximum of only about
300 RMI calls per second.

One reason for this is that Java RMI has a major server-side bottleneck around
the site of the myServer.accept() call in TCPTransport.run().

For each incoming connection, the RMI RTS has to create a thread and perform
housekeeping (getting the client address and setting it into a ThreadLocal,
configuring the connected socket, etc.) before the connection can start
communicating with the client and before the listening thread can proceed
to the next accept().
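
For illustration, a minimal sketch of the one-thread-per-connection pattern
just described (not the actual TCPTransport source; the class name, port, and
handle() helper are hypothetical, and it uses modern Java syntax for brevity):

    import java.io.IOException;
    import java.net.ServerSocket;
    import java.net.Socket;

    public class PerConnectionThreadServer {
        public static void main(String[] args) throws IOException {
            ServerSocket server = new ServerSocket(1099);   // hypothetical port
            while (true) {
                final Socket client = server.accept();      // single accept point
                Thread handler = new Thread(() -> handle(client));
                // Housekeeping runs before the listener can loop back to accept():
                handler.setName("RMI TCP Connection-" + client.getInetAddress());
                handler.start();    // one new thread per accepted connection
            }
        }

        private static void handle(Socket client) {
            // ... service RMI calls on this connection, then close it ...
        }
    }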

This is in fact the simplest possible server-side architecture. The bottleneck
can be relieved with a little more sophistication in the design of the
listening engine.

The RMI RTS should pre-create a fixed number of threads, each of which loops
around myServer.accept() followed by ConnectionHandler.run(). In other words,
they all start in the "accept" state. When one of them gets a connection, it
processes it directly without starting another thread, returns to the "accept"
state when the connection terminates, and exits only when the underlying
ServerSocket is finally closed. This is all in addition to the present
processing, which starts a new thread per accepted connection.

These fixed threads constitute a thread pool of fixed size, which could be
externally controllable, e.g. via a system property such as
sun.rmi.server.threadPoolSize, and which should have a default value of some
reasonable number like 4, 8, 16, ...
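
A minimal sketch of the proposed fixed pool of accept threads, assuming the
suggested sun.rmi.server.threadPoolSize property (not an actual JDK property)
and the same hypothetical port and handle() helper as above:

    import java.io.IOException;
    import java.net.ServerSocket;
    import java.net.Socket;

    public class FixedAcceptPool {
        public static void main(String[] args) throws IOException {
            int poolSize = Integer.getInteger("sun.rmi.server.threadPoolSize", 8);
            final ServerSocket server = new ServerSocket(1099);
            for (int i = 0; i < poolSize; i++) {
                Thread t = new Thread(() -> acceptLoop(server));
                t.setName("RMI Accept-" + i);
                t.start();          // all pool threads start in the "accept" state
            }
            // The existing listener (one new thread per connection) would keep
            // running alongside this pool, as the proposal describes.
        }

        private static void acceptLoop(ServerSocket server) {
            while (true) {
                try {
                    Socket client = server.accept();
                    // Preserve the thread-name/client-address association
                    // noted at the end of this report:
                    Thread.currentThread().setName(
                            "RMI TCP Connection-" + client.getInetAddress());
                    handle(client); // serve in-place, no new thread
                } catch (IOException e) {
                    return;         // ServerSocket closed: thread exits
                }
            }
        }

        private static void handle(Socket client) {
            // ... service RMI calls on this connection, then close it ...
        }
    }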

An incoming connection has an equal chance of being dispatched into any of the
concurrent myServer.accept() calls. Since the pooled threads plus the original
listener make threadPoolSize+1 concurrent acceptors, and only the listener
creates a new thread, the overhead presently imposed by creating a thread per
accept is reduced by threadPoolSize/(threadPoolSize+1): for a pool of 4
threads, by 4/5 or 80%.

This is a very simple improvement to implement, basically similar to the NFSD
server architecture, which also relies on N concurrent accept()s on the same
passive socket (and N concurrent recv()s on the same datagram socket). In this
case there are concurrent threads rather than concurrent processes, but the
principle is the same.

This server architecture is described in Stevens, Unix Network Programming, vol.
i, 27.11, and the change proposed is equivalent to moving from Stevens' 27.10 to
27.11. Stevens says in 27.13 that on some platforms a mutex is required to
ensure that only one thread is actually inside accept() at a time, while other
platforms support concurrent accept()s. The PlainSocketImpl.accept() method is
synchronized, which seems to take care of this issue.
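
On a platform that did require serializing accept(), the pool threads could
funnel through a shared lock; a minimal sketch, with GuardedAccept and
acceptLock as assumed names:

    import java.io.IOException;
    import java.net.ServerSocket;
    import java.net.Socket;

    public class GuardedAccept {
        // Assumed shared lock, needed only where concurrent accept()s
        // on one passive socket are unsupported.
        private static final Object acceptLock = new Object();

        static Socket guardedAccept(ServerSocket server) throws IOException {
            synchronized (acceptLock) {   // at most one thread inside accept()
                return server.accept();
            }
        }
    }

This is effectively the serialization that the synchronized
PlainSocketImpl.accept() method already provides.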

A more complex and better fix would be to return each ConnectionHandler thread
arising out of the current implementation to a dynamic thread pool when the
connection is closed, and to allocate new connection threads from the pool
where possible, rather than creating new threads (something like Stevens'
27.12). This dynamic thread pool could have a maximum size or an idle expiry
time, or both; it might also reasonably have a *minimum* size (of 4 or 8 as
above), enforced by not exiting on expiry when doing so would shrink the pool
below the minimum.
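
For illustration only (java.util.concurrent arrived in JDK 5, long after this
report was filed), a modern ThreadPoolExecutor expresses the suggested policy
directly; the pool sizes and timeout here are assumed values:

    import java.net.ServerSocket;
    import java.net.Socket;
    import java.util.concurrent.SynchronousQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class DynamicConnectionPool {
        private static final ThreadPoolExecutor POOL = new ThreadPoolExecutor(
                8,                      // minimum size: kept alive even when idle
                128,                    // maximum size (assumed cap)
                60, TimeUnit.SECONDS,   // idle expiry for threads above the minimum
                new SynchronousQueue<Runnable>());

        public static void main(String[] args) throws Exception {
            ServerSocket server = new ServerSocket(1099);  // hypothetical port
            while (true) {
                final Socket client = server.accept();
                // Reuse an idle pooled thread if one exists; otherwise the
                // pool grows, up to the cap (beyond which execute() rejects).
                POOL.execute(() -> handle(client));
            }
        }

        private static void handle(Socket client) {
            // ... service RMI calls on this connection, then close it ...
        }
    }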

However the 27.11-based solution is so simple to implement that I am requesting
it instead.

The only minor difficulty I can see is that you have to call Thread.setName()
at some point in these threads; otherwise you lose the association between the
thread name and the client address for the threads in the fixed pool.
(Review ID: 120904) 
======================================================================

Comments
WORK AROUND

Name: krC82822			Date: 04/18/2001

None. Suffer the present bottleneck.
======================================================================
11-06-2004

EVALUATION

This report covers roughly the same problem area as 4420157, which is why it
has been marked as a duplicate.

Note that a simple fixed-size pool of accept threads is inappropriate for
implementing an RMI server, because an RMI transport must always make a best
effort to be capable of accepting another connection regardless of the calls
in progress, in order to avoid starvation or deadlock: the runtime cannot
predict the behavior of remote method calls, which can vary widely, including
how long they will take to execute and the arbitrary possible dependencies
between them. Therefore, an approach based on a pool of accept threads would
require a mechanism for expanding the pool when it becomes saturated (as well
as for shrinking the pool when it becomes sparse, to avoid hogging resources),
thus adding synchronization overhead to the pooled accept threads.

Furthermore, such a mechanism would be unlikely to address the primary focus
of 4420157, which is that the RMI TCP transport does not make enough of an
effort to overcome the limitations of certain operating systems' maximum TCP
listen backlog sizes.

Another related issue is that currently (since 1.2), RMI accept threads are
placed in the system thread group (for greater protection against untrusted
code), whereas ConnectionHandler threads are placed in a non-system thread
group.

Also note that in JDK 1.1, the RMI transport implementation did process
connections in the same thread that accepted them for faster response time
(always spawning a new thread for the next accept), but that approach was
determined to perform even more (and unacceptably) poorly in response to
bursts of connection attempts.

Various possible improvements are under consideration for this area; any
improvement will likely involve using a thread pool for ConnectionHandler
objects rather than always creating a new thread for each one.

peter.c.jones@east 2001-05-03
03-05-2001