JDK-8220295 : sun/tools/jps/TestJps.java still timing out
  • Type: Bug
  • Component: core-svc
  • Sub-Component: tools
  • Affected Version: 13
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2019-03-07
  • Updated: 2019-04-16
  • Resolved: 2019-03-25
JDK 13
13 b14 : Fixed
Description
JDK-8210106 attempted to fix a timeout issue with this test by increasing the test's timeout to 360. It was pointed out that the test can sometimes take over 2 minutes to run, so an increase in the timeout might resolve the issue. It has not, and now the test takes 24 or 60 minutes before it times out, depending on whether test.timeout.factor is set to 4 or 10. I'm not sure what is controlling test.timeout.factor, but I'm seeing both these values used in failed test runs.

As part of the fix for this test, I suggest we also roll back the JDK-8210106 change and set the timeout to something reasonable. timeout=360 usually means 60 minutes, but sometimes 24. The longest successful test runs I can see are on windows-x64, where they frequently take over 7 minutes, but always well under 8. So 10 minutes would be a good target timeout, which equates to timeout=144.
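The arithmetic behind these figures can be checked with a short sketch. It assumes the effective limit is simply the /timeout value in seconds multiplied by test.timeout.factor, as the comments below describe; the function name is hypothetical and is not part of jtreg.

```python
def effective_timeout_minutes(timeout_seconds, timeout_factor):
    """Wall-clock limit in minutes.

    Assumption: limit = /timeout value (seconds) * test.timeout.factor.
    """
    return timeout_seconds * timeout_factor / 60


# Values taken from this report: timeout=360 (current) and timeout=144
# (proposed), with the two timeout factors seen in failed runs.
for base in (360, 144):
    for factor in (4, 10):
        minutes = effective_timeout_minutes(base, factor)
        print(f"timeout={base}, factor={factor}: {minutes:g} min")
```

Under that assumption, timeout=360 gives 24 or 60 minutes, while timeout=144 gives about 9.6 minutes at a factor of 4 and 24 minutes at a factor of 10.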
Comments
Testing now with a larger list of directories to exclude from concurrent testing.

diff --git a/test/jdk/TEST.ROOT b/test/jdk/TEST.ROOT
--- a/test/jdk/TEST.ROOT
+++ b/test/jdk/TEST.ROOT
@@ -22,7 +22,11 @@
 javax/management sun/awt sun/java2d javax/xml/jaxp/testng/validation java/lang/ProcessHandle
 
 # Tests that cannot run concurrently
-exclusiveAccess.dirs=java/rmi/Naming java/util/prefs sun/management/jmxremote sun/tools/jstatd sun/security/mscapi java/util/stream java/util/Arrays/largeMemory java/util/BitSet/stream javax/rmi
+exclusiveAccess.dirs=java/rmi/Naming java/util/prefs sun/management/jmxremote \
+sun/tools/jstatd sun/tools/jcmd sun/tools/jhsdb sun/tools/jhsdb/heapconfig \
+sun/tools/jinfo sun/tools/jmap sun/tools/jps sun/tools/jstack sun/tools/jstat \
+com/sun/tools/attach sun/security/mscapi java/util/stream java/util/Arrays/largeMemory \
+java/util/BitSet/stream javax/rmi
 
 # Group definitions
 groups=TEST.groups
18-03-2019
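As a rough illustration of how the proposed exclusiveAccess.dirs value reads once its backslash continuations are joined, here is a minimal sketch. The helper names are hypothetical, the parser handles only a single property, and the exact-directory match is an assumption suggested by the patch listing both sun/tools/jhsdb and sun/tools/jhsdb/heapconfig; it is not a statement of how jtreg implements the property.

```python
import posixpath

# The proposed property value, copied from the patch above.
NEW_VALUE = r"""exclusiveAccess.dirs=java/rmi/Naming java/util/prefs sun/management/jmxremote \
sun/tools/jstatd sun/tools/jcmd sun/tools/jhsdb sun/tools/jhsdb/heapconfig \
sun/tools/jinfo sun/tools/jmap sun/tools/jps sun/tools/jstack sun/tools/jstat \
com/sun/tools/attach sun/security/mscapi java/util/stream java/util/Arrays/largeMemory \
java/util/BitSet/stream javax/rmi
"""


def parse_dirs(property_text):
    """Join backslash-continued lines of one property and split its value."""
    joined = ""
    for line in property_text.splitlines():
        line = line.lstrip()
        # A trailing backslash means the value continues on the next line.
        joined += line[:-1] if line.endswith("\\") else line
    _key, _sep, value = joined.partition("=")
    return value.split()


def is_exclusive(test_path, dirs):
    """Assumption: an entry matches only the directory that holds the test."""
    return posixpath.dirname(test_path) in dirs


dirs = parse_dirs(NEW_VALUE)
print(len(dirs), is_exclusive("sun/tools/jps/TestJps.java", dirs))
```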

Looking at the history of JDK-8134420, I see some past attempts to add sun/tools/jps to the exclusiveAccess.dirs list that did not produce the expected non-concurrent test run. The property seems only to prevent multiple tests from the same directory from running concurrently with each other.
18-03-2019

Looking at a couple of the logs from timeout runs, there are not as many infrastructure processes as I first believed. There are a number of other concurrent tests being run, e.g. jinfo, jmap, jstat, etc. Trying an experiment now with sun/tools added to exclusiveAccess.dirs.
08-03-2019

I'm not so sure what normal running conditions are. We have a number of Windows machines where it takes about 7m30s. Is this normal? Do we have to allow for that much runtime without the 4x? If so, what's the point of using 4x or 10x to account for the slower machines when these are the slower machines? I haven't looked into the failure at all, but the hang being due to trying to attach to an infra process or some other hung Java process is a good guess as to the cause.
08-03-2019

Tests should have timeouts set so that under normal running conditions the test completes. The timeout factor should be applied when the environment is known to require additional time. We should not be using 4 x timeout to set the expected running time. Do we have any information about the cases where the test is currently timing out? I believe we will find an infrastructure process causing this test to hang. It may be best to restructure this test to not run against all jps visible processes, and instead target a specific lingering app.
08-03-2019

Yes, /timeout is in seconds and was changed to 360 = 6 minutes. Then the timeout factor multiplies that by 10x or 4x, so we get 60 or 24 minutes. These timeouts are not due to variability. If you look at successful runs, of which there are nearly 5500, the longest take between 7 and 8 minutes. If you look at the timeouts, they all occur after 24 minutes or 60 minutes. If variability were the issue, I would expect to see a number of tests passing with much longer runtimes than just 7-8 minutes.
07-03-2019

/timeout is seconds
-waittime is minutes
timeout factor is set to 10 for known slower platforms or build targets, e.g. debug
This test's variability can be due to environmental circumstances, e.g. the test is applied to all jps-visible processes.
If the infrastructure Java processes could be started with -XX:-UsePerfData, they would no longer be visible to the jps tests.
07-03-2019