JDK-8194057 : serviceability/dcmd/framework/InvalidCommandTest.java times out
  • Type: Bug
  • Component: hotspot
  • Sub-Component: svc
  • Affected Version: 10
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • Submitted: 2017-12-22
  • Updated: 2018-06-26
  • Resolved: 2018-03-15
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 11
11Resolved
Related Reports
Duplicate :  
Relates :  
Description
Test
serviceability/dcmd/framework/InvalidCommandTest.java
times out on windows, macos, and linux.
Comments
Kim's failure is interesting in that since it was on linux, we have native stack traces from gdb, but the diagnostic tools were unable to attached to it and execute commands like jstack and jinfo. This is opposite of the windows failure which could show jstack output for the agent, but no native backtrace. I'm not sure if this is relevant, but the process running jcmd has the following backtrace for the last thread created (and all the threads before seem to be standard thread created on VM startup): Thread 79 (Thread 0x7f24dbc55700 (LWP 30106)): #0 0x00007f24daf7ea0d in read () from /lib64/libc.so.6 #1 0x00007f24daf0e9b0 in __GI__IO_file_underflow () from /lib64/libc.so.6 #2 0x00007f24daf0f93e in __GI__IO_default_uflow () from /lib64/libc.so.6 #3 0x00007f24daf0abce in getc () from /lib64/libc.so.6 #4 0x00007f24da2af528 in PerfMemory::attach(char const*, int, PerfMemory::PerfMemoryMode, char**, unsigned long*, Thread*) () from /scratch/opt/mach5/mesos/work_dir/jib-master/install/jdk11-hs.58/linux-x64-debug.jdk/jdk-11/fastdebug/lib/server/libjvm.so #5 0x00007f24da2a0fb2 in Perf_Attach () from /scratch/opt/mach5/mesos/work_dir/jib-master/install/jdk11-hs.58/linux-x64-debug.jdk/jdk-11/fastdebug/lib/server/libjvm.so #6 0x00007f24baf7088b in ?? () #7 0x0000000000000041 in ?? () #8 0x00007f23e64f77c8 in ?? () #9 0x0000000000000099 in ?? () #10 0x0000000000000000 in ?? () I don't know much about how PerfMemory works, and whether or not it is normal for it to be blocked on a read like this. Also, on the agent side (which is the one that executes the dcmd), I see the following. Thread 4 (Thread 0x7f96266c4700 (LWP 29706)): #0 0x00007f974bf774ad in accept () from /lib64/libpthread.so.0 #1 0x00007f9749d7dbd3 in LinuxAttachListener::dequeue() () from /scratch/opt/mach5/mesos/work_dir/jib-master/install/jdk11-hs.58/linux-x64-debug.jdk/jdk-11/fastdebug/lib/server/libjvm.so #2 0x00007f9749d7dd8a in AttachListener::dequeue() () from /scratch/opt/mach5/mesos/work_dir/jib-master/install/jdk11-hs.58/linux-x64-debug.jdk/jdk-11/fastdebug/lib/server/libjvm.so #3 0x00007f9749d7be30 in attach_listener_thread_entry(JavaThread*, Thread*) () from /scratch/opt/mach5/mesos/work_dir/jib-master/install/jdk11-hs.58/linux-x64-debug.jdk/jdk-11/fastdebug/lib/server/libjvm.so #4 0x00007f974ade21ea in JavaThread::thread_main_inner() () from /scratch/opt/mach5/mesos/work_dir/jib-master/install/jdk11-hs.58/linux-x64-debug.jdk/jdk-11/fastdebug/lib/server/libjvm.so #5 0x00007f974ade242a in JavaThread::run() () from /scratch/opt/mach5/mesos/work_dir/jib-master/install/jdk11-hs.58/linux-x64-debug.jdk/jdk-11/fastdebug/lib/server/libjvm.so #6 0x00007f974ab16caa in thread_native_entry(Thread*) () from /scratch/opt/mach5/mesos/work_dir/jib-master/install/jdk11-hs.58/linux-x64-debug.jdk/jdk-11/fastdebug/lib/server/libjvm.so #7 0x00007f974bf70dc5 in start_thread () from /lib64/libpthread.so.0 #8 0x00007f974b88e1ed in clone () from /lib64/libc.so.6 So it looks like the "Attach Listener" thread is waiting for the next attach request, which makes sense since the initial dcmd is done executing. This confirms the same findings on Windows above when looking at the java stack traces, which shows the "Attach Listener" to be in the expected state. The question still remains why on the jcmd side are we stuck at the following when we've already processed and printed the dcmd output (sent from the agent side), and agent side seems to be in the expected state waiting for the next dcmd: at sun.tools.attach.VirtualMachineImpl.read(jdk.attach@10-ea/Native Method) at sun.tools.attach.VirtualMachineImpl$SocketInputStream.read(jdk.attach@10-ea/VirtualMachineImpl.java:259) - locked <0x00000006c0200188> (a sun.tools.attach.VirtualMachineImpl$SocketInputStream) at sun.tools.attach.HotSpotVirtualMachine.readInt(jdk.attach@10-ea/HotSpotVirtualMachine.java:320) at sun.tools.attach.VirtualMachineImpl.execute(jdk.attach@10-ea/VirtualMachineImpl.java:191) at sun.tools.attach.HotSpotVirtualMachine.executeCommand(jdk.attach@10-ea/HotSpotVirtualMachine.java:301) at sun.tools.attach.HotSpotVirtualMachine.executeJCmd(jdk.attach@10-ea/HotSpotVirtualMachine.java:283) at sun.tools.jcmd.JCmd.executeCommandForPid(jdk.jcmd@10-ea/JCmd.java:124) at sun.tools.jcmd.JCmd.main(jdk.jcmd@10-ea/JCmd.java:98) It seems either the socket was never properly closed, or the jcmd process somehow missed it.
26-06-2018

Closing as a duplicate again. See my comment in JDK-8166642. Will file new CRs to get things clean up and the fix applied properly.
15-03-2018

Regarding JDK-8166642 and Windows, there were only two reported sightings in that CR, and both were Windows, thus the comment. However, many more sightings were posted here instead, and on all platforms, so I would not conclude that JDK-8166642 was actually Windows only.
15-03-2018

Recurred after fix for JDK-8166642, invalidating the duplicate resolution.
15-03-2018

The fix for JDK-8166642 solves this issue as well. Closing this as a duplicate of JDK-8166642.
14-03-2018

Just to summarize,the test spawns a jcmd process and is stuck waiting for it to complete: at java.lang.Object.wait(java.base@10-ea/Native Method) - waiting on <no object reference available> at java.lang.Object.wait(java.base@10-ea/Object.java:328) at java.lang.ProcessImpl.waitFor(java.base@10-ea/ProcessImpl.java:494) - waiting to re-lock in wait() <0x000000011f7514c8> (a java.lang.ProcessImpl) at jdk.test.lib.process.ProcessTools.getOutput(ProcessTools.java:97) at jdk.test.lib.process.OutputAnalyzer.<init>(OutputAnalyzer.java:47) at jdk.test.lib.process.ProcessTools.executeProcess(ProcessTools.java:397) at jdk.test.lib.dcmd.JcmdExecutor.executeImpl(JcmdExecutor.java:49) at jdk.test.lib.dcmd.CommandExecutor.execute(CommandExecutor.java:61) at jdk.test.lib.dcmd.CommandExecutor.execute(CommandExecutor.java:43) at InvalidCommandTest.run(InvalidCommandTest.java:45) at InvalidCommandTest.mainClass(InvalidCommandTest.java:56) The jcmd process is stuck here: at sun.tools.attach.VirtualMachineImpl.read(jdk.attach@10-ea/Native Method) at sun.tools.attach.VirtualMachineImpl$SocketInputStream.read(jdk.attach@10-ea/VirtualMachineImpl.java:259) - locked <0x00000006c0200188> (a sun.tools.attach.VirtualMachineImpl$SocketInputStream) at sun.tools.attach.HotSpotVirtualMachine.readInt(jdk.attach@10-ea/HotSpotVirtualMachine.java:320) at sun.tools.attach.VirtualMachineImpl.execute(jdk.attach@10-ea/VirtualMachineImpl.java:191) at sun.tools.attach.HotSpotVirtualMachine.executeCommand(jdk.attach@10-ea/HotSpotVirtualMachine.java:301) at sun.tools.attach.HotSpotVirtualMachine.executeJCmd(jdk.attach@10-ea/HotSpotVirtualMachine.java:283) at sun.tools.jcmd.JCmd.executeCommandForPid(jdk.jcmd@10-ea/JCmd.java:124) at sun.tools.jcmd.JCmd.main(jdk.jcmd@10-ea/JCmd.java:98) It looks like the jcmd process attaches back to the test process. The log contains: Executing command '[/scratch/mesos/jib-master/install/jdk/10/40/macosx-x64-debug.jdk/jdk-10/fastdebug/bin/jcmd, 59560, -f, /var/folders/0d/6mr3vc451w5_qn6n8cbcb36w0000mw/T/input11589079248907392426jcmd]' And 59560 is the PID of the test process. The "Attach Listener" thread in that process seems to be in the expected state (note we know from the output that the attach already happened and the invalid dcmd executed): "Attach Listener" #17 daemon prio=9 os_prio=31 tid=0x00007f957f016000 nid=0x8903 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE JavaThread state: _thread_blocked Thread: 0x00007f957f016000 [0x8903] State: _at_safepoint _has_called_back 0 _at_poll_safepoint 0 JavaThread state: _thread_blocked So why is the jcmd process still trying to read? It's like it doesn't know the socket was closed.
24-01-2018

I did the following to try to reproduce: $ mach5 remote-test --test-repeat 100 --test open/test/hotspot/jtreg/:hotspot_serviceability -b windows-x64 -a "-Xmixed -XX:+TieredCompilation -XX:+CreateCoredumpOnCrash -XX:+IgnoreUnrecognizedVMOptions -XX:+UseSerialGC" --build-id jdk-10+36 This should replicate the run that reproduced the failure with the same args and binary. It did not fail in 110 runs. The test is pretty basic in that it is just using the dcmd test framework to make sure an invalid command produces the proper output. This is basically it: OutputAnalyzer output = executor.execute("asdf"); output.shouldContain("Unknown diagnostic command"); It looks like the "asdf" dcmd is being executed, because as noted the following appears in stderr: java.lang.IllegalArgumentException: Unknown diagnostic command: asdf at jdk.test.lib.dcmd.JMXExecutor.executeImpl(JMXExecutor.java:105) at jdk.test.lib.dcmd.CommandExecutor.execute(CommandExecutor.java:61) at jdk.test.lib.dcmd.CommandExecutor.execute(CommandExecutor.java:43) at InvalidCommandTest.run(InvalidCommandTest.java:45) at InvalidCommandTest.jmx(InvalidCommandTest.java:66) This is normal and what the tests does when it passes also. However, it appears that at some point after this the test got stuck and timed out. It's unclear if it even returned from execute(), or if it got stuck later, perhaps during test shutdown and cleanup. I suggest closing as CNR for now, but lets keep an eye out for this type of dcmd timeout on windows.
03-01-2018

Link to log: http://java-dev.se.oracle.com:10061/api/v1/results/5f16128d-9736-446e-af6b-d568aa3cad8d/log to results http://java-dev.se.oracle.com:10067/mdash/jobs/lmesnik-hs-20171215-2313-7830/tasks/CqFi5NAZwc/results?query=status:overview Mach5 points to line: java.lang.IllegalArgumentException: Unknown diagnostic command: asdf but it seems to be expected output in the testcase which validate results. This exception just confuse mach5.
22-12-2017