JDK-8000545 : TEST_BUG: java/rmi/activation/Activatable/forceLogSnapshot failed
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.rmi
  • Affected Version: 8,9
  • Priority: P4
  • Status: Resolved
  • Resolution: Duplicate
  • Submitted: 2012-10-08
  • Updated: 2016-11-21
  • Resolved: 2016-11-21
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other
tbd_minorResolved
Related Reports
Relates :  
Sub Tasks
JDK-8034999 :  
JDK-8035000 :  
JDK-8035001 :  
Description
This test failed in the b60-nightly-2012-10-04 nightly test run. Test environment is:

    test result: Error. Program `/Users/aurora/sandbox/jdk/bin/java' interrupted! (timed out?)
    javatestOS=SunOS 5.10 (x86)
    hostname=sc14160049

The relevant portion of the log file is as follows:

========================================

RMID: starting rmid on port #42830...
JAVAVM: command = [/Users/aurora/sandbox/jdk/jre/bin/java, -showversion, -Djava.security.policy=/Users/aurora/sandbox/testbase/test/java/rmi/activation/Activatable/forceLogSnapshot/rmid.security.policy, -Dsun.rmi.activation.snapshotInterval=10, -Djava.compiler=, -Dsun.rmi.server.activation.debugExec=true, -Dtest.src=/Users/aurora/sandbox/testbase/test/java/rmi/activation/Activatable/forceLogSnapshot, -Dtest.classes=/Users/aurora/sandbox/gresults/testoutput/jdk_rmi/JTwork/classes/java/rmi/activation/Activatable/forceLogSnapshot, -Djava.rmi.server.logLevel=v, sun.rmi.server.Activation, -log, /Users/aurora/sandbox/gresults/testoutput/jdk_rmi/JTwork/java/rmi/activation/Activatable/forceLogSnapshot/ForceLogSnapshot/./log, -port, 42830, -C-Dtest.src=/Users/aurora/sandbox/testbase/test/java/rmi/activation/Activatable/forceLogSnapshot, -C-Dtest.classes=/Users/aurora/sandbox/gresults/testoutput/jdk_rmi/JTwork/classes/java/rmi/activation/Activatable/forceLogSnapshot]

# java version "1.8.0-ea"
# Java(TM) SE Runtime Environment (build 1.8.0-ea-langtools-nightly-h340-20121004-b60-b00)
# Java HotSpot(TM) Server VM (build 25.0-b02, interpreted mode)
# 
JAVAVM: distant vm process running, after 100 milliseconds
ACTIVATION_LIBRARY: caught an exception trying to start rmid, last exception was: non-JRMP server at remote endpoint
java.rmi.ConnectIOException: non-JRMP server at remote endpoint
	at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:248)
	at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
	at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:341)
	at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source)
	at ActivationLibrary.rmidRunning(ActivationLibrary.java:124)
	at RMID.start(RMID.java:236)
	at RMID.start(RMID.java:196)
	at ForceLogSnapshot.main(ForceLogSnapshot.java:131)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:474)
	at com.sun.javatest.regtest.MainWrapper$MainThread.run(MainWrapper.java:94)
	at java.lang.Thread.run(Thread.java:722)

========================================

This stack trace is repeated 85 times, after which the error message "RMID: rmid still not started" is emitted, and the test times out and fails.

It's hard to say, but it seems like another server was present on that port, which caused the creation of RMID to fail. Instead of retrying on a different port, though, it retried repeatedly on the same port and failed each time until it timed out. This is part of the RMI testlibrary code so it should be easy to adjust the retry policy. ("Easy" in that no API or compatibility issues should arise.)
Comments
Some of the log is very log, even can not find the corresponding source code. I think this is the same issue with JDK-8168975
21-11-2016

The retry logic can probably be removed from ActivationLibrary.rmidRunning(). RMID.start() has its own retry logic, so it really wants rmidRunning() to try once and return a result. There is only one other use of rmidRunning(), which is in ActivationLibrary.DestroyThread, which is used when shutting down RMID from ActivationLibrary.rmidCleanup. The caller passes in an rmid instance, so we might just as well change callers to call rmid.destroy() instead. This knows whether the underlying Process is still running and waits for it to exit, so there's really no point in calling rmidRunning() when shutting it down.
15-02-2014

The retry logic in RMID.start() should also check regularly to see whether the actual process has terminated. If it has failed for some reason, currently it will retry uselessly for the full timeout period before giving up.
25-12-2013

Several things going on here, but no full explanation yet. The stack trace is repeated 85 times, yet the retry loop in ActivationLibrary.rmidRunning() loops 75 times. Why? The caller, RMID.start(), **also** has retry logic. So we have nested retries going on, which seems like a bad idea. The retry logic in both places intends to retry for a certain period of time, but it only accounts for the time spent sleeping, not actual elapsed time. In RMID.start() for instance there is the line waitTime -= rmidStartSleepTime; Even though rmidRunning() has taken a long time already, this isn't accounted for. The retry/timeout logic needs to recheck the current time. The intent is for rmidRunning() to retry every 100ms for a maximum of 75 times, which should take only about 7.5 sec. But there were actual 85 retries that took 2560 seconds, after which the test was timed out by jtreg! This indicates that each attempt is taking 30 seconds! The exception is a ConnectIOException thrown by TCPChannel.java line 248. This seems to indicate that something else was listening on the socket and didn't respond, so the socket read timed out. But this doesn't quite make sense, because the default timeout is 60 seconds, not 30 seconds. If the read timed out, this should result in a SocketTimeoutException, but there's no evidence of that here. (This would have been caught by the IOException block below.) Instead, it looks like we got a bad byte, and something else timed out for 30 seconds. Another problem is that the rmid subprocess probably failed to bind to this socket if there's something else there. Unfortunately its stdout and stderr streams are being consumed by the parent JVM (the test) and nothing ever echoes them, so we don't know what problems rmid ran into. We should figure out how to fix this. rmidRunning() should probably fail if it gets a ConnectIOException instead of retrying.
21-12-2012

I've attached the full JTR file.
21-12-2012

Affected test: java/rmi/activation/Activatable/forceLogSnapshot/ForceLogSnapshot.java
09-10-2012