JDK-6181943 : (reliability) referential integrity loophole for remote references passed as arguments
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.rmi
  • Affected Version: 5.0
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: linux
  • CPU: x86
  • Submitted: 2004-10-20
  • Updated: 2017-05-16
  • Resolved: 2005-09-20
  • Fixed Version: JDK 6 (6 beta)
Description
JDK          :   5.0 u1 b04 and 5.0 fcs
VM           :   default
switch/Mode  :   default
Platform[s]  :   SLEs 9 AMD64

How to reproduce:
====================

1) cp /net/cady.sfbay/export/dtf/unified/knight-ws/suites/rmi_reliability/tiger/rmi_reliability.tar to your local dir

2) install local copy of jdk
3) cd rmi
4) launch_reliability.ksh  jdk  workdir_dir result_dir shell_location ./src 1 -showversion &

   (ex.  launch_reliability.ksh `pwd`/j2sdk1.5.0_01 `pwd`/workdir `pwd`/result /usr/bin/ksh  `pwd`/.src 1 -showversion)

   More detail is in the README.

5)  The test is set to complete in 1 hour, but after about 30 minutes a java.lang.RuntimeException appears.  See $RESULTDIR/log.juicer.appleuser

 
Note:
Modifying the following files lets you pass more than one VM flag:
./launch_reliability.ksh
  From:
     VMOPTS=$7
   To:
     shift 6
     VMOPTS=$*

./src/scripts/run_bench.ksh and ./src/scripts/run_juicer.ksh
  From:
     VMOPTS=$6
    To:
     shift 5
     VMOPTS=$* 

Error log
----------
java version "1.5.0_01-ea"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_01-ea-b04)
Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_01-ea-b04, mixed mode)

Oct 19, 2004 5:02:31 PM AppleUserImpl main
INFO: Application server must be started in separate process
Oct 19, 2004 5:02:31 PM AppleUserImpl main
INFO: Waiting for application server process to start
Oct 19, 2004 5:02:40 PM AppleUserImpl main
INFO: Test starting
Oct 19, 2004 5:02:40 PM AppleUserImpl main
INFO: Waiting 1 hours for test to complete or exception to be thrown
Oct 19, 2004 5:27:52 PM AppleUserImpl main
INFO: TEST FAILED
java.lang.RuntimeException: TEST FAILED: juicer server reported an exception
        at AppleUserImpl.main(AppleUserImpl.java:326)
Caused by: java.rmi.ServerException: RemoteException occurred in server thread; nested exception is:
        java.rmi.NoSuchObjectException: no such object in table
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:325)
        at sun.rmi.transport.Transport$1.run(Transport.java:153)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:149)
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:460)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:701)
        at java.lang.Thread.run(Thread.java:595)
        at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:247)
        at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:223)
        at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:126)
        at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(RemoteObjectInvocationHandler.java:179)
        at java.rmi.server.RemoteObjectInvocationHandler.invoke(RemoteObjectInvocationHandler.java:132)
        at $Proxy2.recurse(Unknown Source)
        at AppleUserImpl$AppleUserThread.run(AppleUserImpl.java:166)
Caused by: java.rmi.NoSuchObjectException: no such object in table
        at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:247)
        at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:223)
        at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:126)
        at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(RemoteObjectInvocationHandler.java:179)
        at java.rmi.server.RemoteObjectInvocationHandler.invoke(RemoteObjectInvocationHandler.java:132)
        at $Proxy3.recurse(Unknown Source)
        at OrangeImpl.recurse(OrangeImpl.java:33)
        at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:294)
        at sun.rmi.transport.Transport$1.run(Transport.java:153)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:149)
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:460)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:701)
        at java.lang.Thread.run(Thread.java:595)
Oct 19, 2004 5:27:52 PM AppleUserImpl main
INFO: Test finished
Oct 19, 2004 5:27:52 PM AppleUserImpl main
INFO: Test duration was 1511 seconds (0 hours)
###@###.### 10/20/04 17:08 GMT

Comments
EVALUATION Regarding the JDC/SDN comment of 26-JAN-2006: If the NoSuchObjectException only happens later, after the remote object has been looked up in the registry and successfully invoked, then the problem would not seem to be caused by this bug. Some related discussion in the following RMI-USERS posts might be useful: http://archives.java.sun.com/cgi-bin/wa?A2=ind0509&L=rmi-users&P=617 http://archives.java.sun.com/cgi-bin/wa?A2=ind0512&L=rmi-users&P=3747 http://archives.java.sun.com/cgi-bin/wa?A2=ind0601&L=rmi-users&P=1985
27-01-2006

EVALUATION Fixed by leveraging the DGCAckHandler mechanism that is already used to ensure referential integrity for remote objects/references passed in RMI return values (for which it has always been obvious that the returned reference could otherwise become locally unreachable). The "juicer" reliability test now passes on machines for which it had been failing as described above. We do wonder how safely grounded this mechanism is against potential VM optimizations in the future, however, as a matter of language/VM specification.
16-09-2005
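
The idea behind the fix (hold strong references to marshalled remote objects until the call has completed) can be illustrated with a minimal sketch. This is not the JDK's DGCAckHandler code; the class ArgumentHolderSketch and its methods hold, release, and heldCount are hypothetical names invented for this illustration, and no real RMI call is made.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT the JDK's DGCAckHandler.
class ArgumentHolderSketch {
    // Strong references to objects marshalled for one in-flight call.
    private List<Object> held = new ArrayList<>();

    /** Records an object written to the call's argument stream. */
    synchronized void hold(Object marshalled) {
        if (held != null) {
            held.add(marshalled);
        }
    }

    /** Drops all strong references once the call has completed. */
    synchronized void release() {
        held = null;
    }

    /** Number of objects currently held (0 after release). */
    synchronized int heldCount() {
        return held == null ? 0 : held.size();
    }

    /** Simulates a call during which the caller's own reference goes away. */
    static boolean argumentSurvivesUntilRelease() {
        ArgumentHolderSketch holder = new ArgumentHolderSketch();
        Object echo = new Object();
        WeakReference<Object> weak = new WeakReference<>(echo);

        holder.hold(echo);   // "marshal" the argument
        echo = null;         // caller no longer references the argument
        System.gc();         // a collection while the "call" is in flight

        boolean alive = weak.get() != null;
        holder.release();    // call completed; references may now be dropped
        return alive && holder.heldCount() == 0;
    }

    public static void main(String[] args) {
        System.out.println("argument kept until release: "
                + argumentSurvivesUntilRelease());
    }
}
```

The sketch only shows the hold-until-complete pattern; as the evaluation above notes, how well such a mechanism is protected against future VM optimizations is a separate question.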

SUGGESTED FIX See Comments. ###@###.### 2004-11-24 01:12:56 GMT
24-11-2004

WORK AROUND Force local reachability of remote object/reference after remote invocation in which it is passed as an argument. ###@###.### 2004-11-24 01:12:56 GMT
24-11-2004
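
A minimal sketch of this workaround, under stated assumptions: it uses Reference.reachabilityFence, which exists only in Java 9 and later and so was not available on the 5.0 release this report was filed against. The class ReachabilityWorkaroundSketch and the helper invokeKeepingReachable are hypothetical names, and a Consumer stands in for the remote invocation; no real RMI is involved.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;
import java.util.function.Consumer;

// Hypothetical stand-in for the calling pattern; no real RMI involved.
class ReachabilityWorkaroundSketch {

    /**
     * Runs the call with the given argument, then fences the argument so
     * that it remains strongly reachable until the call has returned.
     */
    static <T> void invokeKeepingReachable(Consumer<T> call, T arg) {
        try {
            call.accept(arg);
        } finally {
            // Assumption: Java 9+; not available on 5.0-era JDKs.
            Reference.reachabilityFence(arg);
        }
    }

    /** Returns whether the argument was still present during the call. */
    static boolean survivesGcDuringCall() {
        Object echo = new Object();
        WeakReference<Object> weak = new WeakReference<>(echo);
        boolean[] alive = new boolean[1];

        invokeKeepingReachable(ignored -> {
            System.gc();                   // collection while the call runs
            alive[0] = weak.get() != null; // referent must not be cleared yet
        }, echo);

        return alive[0];
    }

    public static void main(String[] args) {
        System.out.println("referent alive during call: "
                + survivesGcDuringCall());
    }
}
```

In real code the Consumer would be replaced by the remote invocation that receives the remote object or reference as an argument.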

EVALUATION We are investigating this. ###@###.### 10/20/04 18:23 GMT

I have been able to reproduce this failure multiple times on the amd64 machine provided by the submitter (jck-amd3.sfbay). Unfortunately, however, I have not yet been able to reproduce it with instrumentation to the JDK or to the test that would be helpful in diagnosing the failure. This NoSuchObjectException is puzzling, because it does not appear to me that it can be explained by failure of DGC remote communication: the remote object that is not being found is an OrangeEchoImpl instance (in the AppleUserImpl VM), which is strongly referenced in the thread (AppleUserImpl.AppleUserThread) that initiates the sequence of remote invocations that ends up throwing the NoSuchObjectException attempting to invoke the same OrangeEchoImpl remotely. The test never explicitly unexports OrangeEchoImpl instances, so they should only get (implicitly) unexported because of garbage collection, as determined by weak reference notifications. ###@###.### 10/22/04 23:02 GMT

I have been able to reproduce this failure on Solaris 9 SPARC (a dual processor workstation, terrier.east) and the 32-bit Server VM with 1.5.0-b64, so it no longer seems to be an AMD64-specific issue. When there is no instrumentation to the test code or JDK code, it has failed within an hour (typically 15-25 minutes) for me every time that I have run it on this machine with that configuration. It does not fail with the Client VM, or with 1.4.2 and the Server VM. Note that in order to run this test with 1.4.2, I had to modify the scripts to run rmic to generate stub classes; using the same modified scripts with 5.0 (thus using rmic-generated stub classes instead of dynamic proxy classes) still failed with the Server VM, so the problem does not seem specific to use of dynamic proxy classes for RMI stubs in 5.0.

It still appears that an OrangeEchoImpl object created in AppleUserThread.run, assigned to the local variable "echo", and passed to the Orange.recurse invocation gets garbage collected (or at least a weak reference to it gets enqueued) before that invocation returns, which should not be possible as far as I understand. Any instrumentation of the test code to attempt to read the "echo" variable after the Orange.recurse invocation seems to prevent the failure from occurring. ###@###.### 10/26/04 21:24 GMT

I have attached to this CR a somewhat pared-down version of the test case, removing the two layers of scripts (and the irrelevant, parallel run of the benchmark suite), which should make it slightly easier to debug. See Comments for details on how to run this version. To summarize the current knowledge about this bug:
- It is reproducible with 5.0, 5.0u1, and 6.0 (b12), but not with 1.4.2.
- It is not AMD64-specific; it is reproducible on Solaris SPARC too.
- It is specific to the Server VM, and only seems reproducible on multiprocessor machines.
- It seems to be caused by an object getting GCed that would ("naively") seem to be strongly reachable from a local variable on the stack (see Comments for some more analysis on this point).
###@###.### 2004-11-17 21:08:38 GMT

It seems that this bug is explainable by clever but valid VM optimizations, which violate an assumption that is currently made by RMI in its attempt to guarantee referential integrity of remote references passed as or within arguments to remote invocations. A conservative fix (at least for the now-usual case of the 1.2 stub protocol) could be to add something to the implementation of UnicastRef.invoke to force strong reachability of its "params" argument until its RemoteCall.executeCall invocation has completed; see Comments for more details. ###@###.### 2004-11-18 08:22:45 GMT

This bug is a reliability concern. It is a regression in 5.0 in that it was previously latent, apparently not exposed by pre-5.0 VM implementations. Committing to fix for Mustang. ###@###.### 2004-11-24 01:12:56 GMT

I have modified the release in the SR to be 5.0 instead of 5.0u1, because this bug is not a 5.0u1 regression. ###@###.### 2004-12-13 21:23:17 GMT
20-10-2004