JDK-8205526 : tests timeout intermittently in "agentvm" mode
  • Type: Bug
  • Component: hotspot
  • Sub-Component: test
  • Affected Version: 11,12
  • Priority: P4
  • Status: Closed
  • Resolution: Not an Issue
  • OS: generic
  • CPU: generic
  • Submitted: 2018-06-22
  • Updated: 2018-12-15
  • Resolved: 2018-12-15
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other
tbdResolved
Related Reports
Duplicate :  
Duplicate :  
Duplicate :  
Duplicate :  
Relates :  
Relates :  
Relates :  
Relates :  
Description
During my analysis of jdk/jdk CI test failures, I'm seeing a
number of test timeouts that appear to be caused by some
failure in JTREG's "agentvm" mode.

This bug is intended as a gathering place for sightings of
test time outs in "agentvm" mode only! In particular, the JTREG
test needs to fail with a message like this one:

result: Error. Agent error: java.lang.Exception: Agent 3 timed out with a timeout of 1200 seconds; check console log for any additional details

    In the above message, "Agent 3" is an example.

Please DO NOT add sightings to this bug unless they are
hangs related to "agentvm" mode and they timeout in
about 20 minutes (1200 seconds).
Comments
This bug is not an issue by itself. It is just confusing reporting by jtreg CODETOOLS-7902346 It is needed to investigate each timeout separately.
15-12-2018

I opened CODETOOLS-7902346 to improve jtreg diagnostic message. I am going to close this issue so people might investigate timeouts and file test specific issues.
20-11-2018

I have reopened this bug: JDK-8204529 gc/TestAllocateHeapAtMultiple.java fail with Agent 7 timed out and assigned it to myself.
09-11-2018

I think it does.
08-11-2018

does it make a sense to file the separate issue for gc/TestAllocateHeapAtMultiple.java failure? It is a specific problem with one test.
08-11-2018

From slack (looking at one of Erik's test jobs): Interesting results for gc/TestAllocateHeapAtMultiple.java. First it seems the timeout is right on the edge of the required time as the test completed but also timed-out (really must file a jtreg bug on that problem!). Second the final VM execution in the test seems to take ~500x longer to run than the previous ones based on the timestamp in the log message: [5.129s] vs. [0.0nns] So UseNUMA needs some investigation on big sparc boxes.
08-11-2018

[~jjg] Seems that this error " Error. Agent error: java.lang.Exception: Agent .." is generated for any timeout of action in agentvm. mainAction runs action which sends the command to the agent and wait for results. If test main fails then this error is generated. Also timeouthandler is invoked and all data is dumped. See private Status doAction(...) in http://hg.openjdk.java.net/code-tools/jtreg/file/tip/src/share/classes/com/sun/javatest/regtest/exec/Agent.java#l241 The timeout means that test haven't got any response from 'main' action and going to kill agent using timeouthandler. So I would not says that behavior is incorrect except confusing message mentioned Agent. It should contain the name of the action which timed out instead of "Agent". So it is more clear what exactly timed out.
07-11-2018

as "default" jtreg timeout factor was changed to 4 from 10, we should see 'Agent N timed out with a timeout of 480 seconds'.
23-10-2018

Has the magic timeout value changed with the recent change to the default jtreg timeout factor?
22-10-2018

Added maintainer_pain label because most of these seem to be unrelated to any known product or test problem, but require effort to examine each such failure to decide whether it is "interesting".
22-10-2018

Spotted in the jdk12-jdk-378-tier1 job set: tools/javac/file/T8150475.java timed out twice on two different test machines. Here's a snippet of the test log: #section:compile ----------messages:(154/10689)---------- command: compile -sourcepath / /scratch/opt/mach5/mesos/work_dir/jib-master/install/jdk12-jdk.378/src.full/open/test/langtools/tools/javac/file/T8150475.java reason: User specified action: run compile -sourcepath / T8150475.java Mode: agentvm Agent id: 1 Timeout refired 1200 times Timeout information: Running jstack on process 27979 2018-09-10 13:39:04 Full thread dump OpenJDK 64-Bit Server VM (12-internal+0-jdk12-jdk.378 mixed mode): Threads class SMR info: _java_thread_list=0x00007fea34001c50, length=12, elements={ 0x00007fea7c012000, 0x00007fea7c349800, 0x00007fea7c34d000, 0x00007fea7c36e800, 0x00007fea7c370800, 0x00007fea7c373000, 0x00007fea7c375000, 0x00007fea7c3d2000, 0x00007fea7c48d000, 0x00007fea7c4a1000, 0x00007fea1416a000, 0x00007fea34001000 } "main" #1 prio=5 os_prio=0 cpu=7922.08ms elapsed=1527.46s tid=0x00007fea7c012000 nid=0x6d4f runnable [0x00007fea839c7000] java.lang.Thread.State: RUNNABLE at sun.nio.fs.UnixNativeDispatcher.stat1(java.base@12-internal/Native Method) at sun.nio.fs.UnixNativeDispatcher.stat(java.base@12-internal/UnixNativeDispatcher.java:308) at sun.nio.fs.UnixFileSystemProvider.isDirectory(java.base@12-internal/UnixFileSystemProvider.java:506) at java.nio.file.Files.isDirectory(java.base@12-internal/Files.java:2230) at com.sun.tools.javac.file.JavacFileManager$DirectoryContainer.list(jdk.compiler@12-internal/JavacFileManager.java:463) at com.sun.tools.javac.file.JavacFileManager.list(jdk.compiler@12-internal/JavacFileManager.java:714) at com.sun.tools.javac.code.ClassFinder.list(jdk.compiler@12-internal/ClassFinder.java:734) at com.sun.tools.javac.code.ClassFinder.scanUserPaths(jdk.compiler@12-internal/ClassFinder.java:684) at com.sun.tools.javac.code.ClassFinder.fillIn(jdk.compiler@12-internal/ClassFinder.java:551) at com.sun.tools.javac.code.ClassFinder.complete(jdk.compiler@12-internal/ClassFinder.java:299) at com.sun.tools.javac.code.ClassFinder$$Lambda$76/0x000000010010d440.complete(jdk.compiler@12-internal/Unknown Source) at com.sun.tools.javac.code.Symtab.lambda$addRootPackageFor$8(jdk.compiler@12-internal/Symtab.java:782) at com.sun.tools.javac.code.Symtab$$Lambda$78/0x000000010010cc40.complete(jdk.compiler@12-internal/Unknown Source) at com.sun.tools.javac.code.Symbol.complete(jdk.compiler@12-internal/Symbol.java:642) at com.sun.tools.javac.comp.Enter.visitTopLevel(jdk.compiler@12-internal/Enter.java:355) at com.sun.tools.javac.tree.JCTree$JCCompilationUnit.accept(jdk.compiler@12-internal/JCTree.java:535) at com.sun.tools.javac.comp.Enter.classEnter(jdk.compiler@12-internal/Enter.java:286) at com.sun.tools.javac.comp.Enter.classEnter(jdk.compiler@12-internal/Enter.java:301) at com.sun.tools.javac.comp.Enter.complete(jdk.compiler@12-internal/Enter.java:577) at com.sun.tools.javac.comp.Enter.main(jdk.compiler@12-internal/Enter.java:561) at com.sun.tools.javac.main.JavaCompiler.enterTrees(jdk.compiler@12-internal/JavaCompiler.java:1066) at com.sun.tools.javac.main.JavaCompiler.compile(jdk.compiler@12-internal/JavaCompiler.java:937) at com.sun.tools.javac.main.Main.compile(jdk.compiler@12-internal/Main.java:311) at com.sun.tools.javac.main.Main.compile(jdk.compiler@12-internal/Main.java:170) at com.sun.tools.javac.Main.compile(jdk.compiler@12-internal/Main.java:75) at jdk.internal.reflect.GeneratedMethodAccessor1.invoke(Unknown Source) at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@12-internal/DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(java.base@12-internal/Method.java:566) at com.sun.javatest.regtest.agent.RegressionCompileCommand.run(RegressionCompileCommand.java:190) at com.sun.javatest.regtest.agent.CompileActionHelper.runCompile(CompileActionHelper.java:92) at com.sun.javatest.regtest.agent.AgentServer.doCompile(AgentServer.java:220) at com.sun.javatest.regtest.agent.AgentServer.run(AgentServer.java:190) at com.sun.javatest.regtest.agent.AgentServer.main(AgentServer.java:65) <snip> and #section:compile ----------messages:(142/9783)---------- command: compile -sourcepath / /scratch/opt/mach5/mesos/work_dir/jib-master/install/jdk12-jdk.378/src.full/open/test/langtools/tools/javac/file/T8150475.java reason: User specified action: run compile -sourcepath / T8150475.java Mode: agentvm Agent id: 4 Timeout refired 1200 times Timeout information: Running jstack on process 32501 2018-09-10 13:37:15 Full thread dump Java HotSpot(TM) 64-Bit Server VM (12-internal+0-jdk12-jdk.378 mixed mode): Threads class SMR info: _java_thread_list=0x00007fc510001c50, length=11, elements={ 0x00007fc558012000, 0x00007fc558258000, 0x00007fc55825b800, 0x00007fc55827d000, 0x00007fc55827f000, 0x00007fc558281000, 0x00007fc558283000, 0x00007fc5582b8000, 0x00007fc55835c000, 0x00007fc5583d0000, 0x00007fc510001000 } "main" #1 prio=5 os_prio=0 cpu=10365.71ms elapsed=1597.58s tid=0x00007fc558012000 nid=0x7ef7 runnable [0x00007fc55f782000] java.lang.Thread.State: RUNNABLE at sun.nio.fs.UnixNativeDispatcher.stat1(java.base@12-internal/Native Method) at sun.nio.fs.UnixNativeDispatcher.stat(java.base@12-internal/UnixNativeDispatcher.java:308) at sun.nio.fs.UnixFileSystemProvider.isDirectory(java.base@12-internal/UnixFileSystemProvider.java:506) at java.nio.file.Files.isDirectory(java.base@12-internal/Files.java:2230) at com.sun.tools.javac.file.JavacFileManager$DirectoryContainer.list(jdk.compiler@12-internal/JavacFileManager.java:463) at com.sun.tools.javac.file.JavacFileManager.list(jdk.compiler@12-internal/JavacFileManager.java:714) at com.sun.tools.javac.code.ClassFinder.list(jdk.compiler@12-internal/ClassFinder.java:734) at com.sun.tools.javac.code.ClassFinder.scanUserPaths(jdk.compiler@12-internal/ClassFinder.java:684) at com.sun.tools.javac.code.ClassFinder.fillIn(jdk.compiler@12-internal/ClassFinder.java:551) at com.sun.tools.javac.code.ClassFinder.complete(jdk.compiler@12-internal/ClassFinder.java:299) at com.sun.tools.javac.code.ClassFinder$$Lambda$76/0x000000010010d440.complete(jdk.compiler@12-internal/Unknown Source) at com.sun.tools.javac.code.Symtab.lambda$addRootPackageFor$8(jdk.compiler@12-internal/Symtab.java:782) at com.sun.tools.javac.code.Symtab$$Lambda$78/0x000000010010cc40.complete(jdk.compiler@12-internal/Unknown Source) at com.sun.tools.javac.code.Symbol.complete(jdk.compiler@12-internal/Symbol.java:642) at com.sun.tools.javac.comp.Enter.visitTopLevel(jdk.compiler@12-internal/Enter.java:355) at com.sun.tools.javac.tree.JCTree$JCCompilationUnit.accept(jdk.compiler@12-internal/JCTree.java:535) at com.sun.tools.javac.comp.Enter.classEnter(jdk.compiler@12-internal/Enter.java:286) at com.sun.tools.javac.comp.Enter.classEnter(jdk.compiler@12-internal/Enter.java:301) at com.sun.tools.javac.comp.Enter.complete(jdk.compiler@12-internal/Enter.java:577) at com.sun.tools.javac.comp.Enter.main(jdk.compiler@12-internal/Enter.java:561) at com.sun.tools.javac.main.JavaCompiler.enterTrees(jdk.compiler@12-internal/JavaCompiler.java:1066) at com.sun.tools.javac.main.JavaCompiler.compile(jdk.compiler@12-internal/JavaCompiler.java:937) at com.sun.tools.javac.main.Main.compile(jdk.compiler@12-internal/Main.java:311) at com.sun.tools.javac.main.Main.compile(jdk.compiler@12-internal/Main.java:170) at com.sun.tools.javac.Main.compile(jdk.compiler@12-internal/Main.java:75) at jdk.internal.reflect.GeneratedMethodAccessor1.invoke(Unknown Source) at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@12-internal/DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(java.base@12-internal/Method.java:566) at com.sun.javatest.regtest.agent.RegressionCompileCommand.run(RegressionCompileCommand.java:190) at com.sun.javatest.regtest.agent.CompileActionHelper.runCompile(CompileActionHelper.java:92) at com.sun.javatest.regtest.agent.AgentServer.doCompile(AgentServer.java:220) at com.sun.javatest.regtest.agent.AgentServer.run(AgentServer.java:190) at com.sun.javatest.regtest.agent.AgentServer.main(AgentServer.java:65) <snip> Please note in both of these failures the timeout at 20 minutes occurred in the compilation phase (javac).
10-09-2018

Hi [~jwang] Joy, since 'this bug is intended as a gathering place for sightings of test time outs in "agentvm" mode only', I don't think there is a correct person for this issue. [~dcubed] might be a good candidate, but I highly doubt he wants this on his list (otherwise I'd expect him to assign it to himself at filling time)
14-08-2018

The most recent sighting in MDash is from jdk12-jdk-61 and it shows the test is still running in "agentvm" mode: ----------messages:(5/256)---------- command: compile /scratch/mesos/jib-master/install/jdk12-jdk.61/src.full/closed/test/hotspot/jtreg/compiler/c2/TestThreadInterruptedAndNotified.java reason: .class file out of date or does not exist Mode: agentvm Agent id: 4 elapsed time (seconds): 15.646 <snip> ----------messages:(8/285)---------- command: driver compiler.c2.TestThreadInterruptedAndNotified reason: User specified action: run driver compiler.c2.TestThreadInterruptedAndNotified Mode: agentvm Agent id: 7 Timeout refired 1200 times Timeout information: --- Timeout information end. elapsed time (seconds): 1263.418 <snip> result: Error. Agent error: java.lang.Exception: Agent 7 timed out with a timeout of 1200 seconds; check console log for any additional details test result: Error. Agent error: java.lang.Exception: Agent 7 timed out with a timeout of 1200 seconds; check console log for any additional details So while "driver" is mentioned above, it looks like the test is still running in "agentvm" mode to me. That failure was linked to JDK-8207226.
13-07-2018

The most recent sighting of: compiler/c2/TestThreadInterruptedAndNotified.java in this bug and in the MDash history is the same 2018.07.06 sighting added by Felix. That entry shows that the test was still running "agentvm" mode.
13-07-2018

Copied from JDK-8204529 Test: gc/TestAllocateHeapAtMultiple.java Error: result: Error. Agent error: java.lang.Exception: Agent 7 timed out with a timeout of 1200 seconds; check console log for any additional details
29-06-2018

[~mikael] - why did you tag this one as 'testbug'?
28-06-2018

I'm unclear what is being tracked here. As we usually/always/often(?) run in agentvm mode won't any timeout be reported in a way that includes "agentvm"??
24-06-2018

See the specific message example in the description note. If I run into any variations as I'm analyzing CI job-sets, I'll add more.
24-06-2018

[~jjg] - When JTREG opens its ServerSockets in "agentvm" mode: - does it use port 0 to get an ephemeral port? - is it using the hostname or host IP address? For opening your ServerSockets, can you use the loopback IP address instead?
22-06-2018

compiler/c2/TestThreadInterruptedAndNotified.java timeouts might be caused by a testbug, I have recently changed the test and removed it from the problem list.
22-06-2018

Is there any sense in any of these cases whether the test is slow, and the timeout is "genuine" or whether the test has in fact "completed" but jtreg is not noticing the fact?
22-06-2018

[~jjg] - Thought you might want to track this one. Nothing to do for now since I'm just gathering data.
22-06-2018