JDK-8280770 : serviceability/sa/ClhsdbThreadContext.java sometimes fails with 'Thread "SteadyStateThread"' missing from stdout/stderr
  • Type: Bug
  • Component: hotspot
  • Sub-Component: svc-agent
  • Affected Version: 19
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2022-01-27
  • Updated: 2022-10-27
  • Resolved: 2022-02-01
JDK 19
19 b08 : Fixed
Description
Although the test fails because it can't find SteadyStateThread in the output, the real issue is that it failed to dump the threadcontext for the SteadyStateThread:

 + threadcontext -v 23
Couldn't find thread 23

The SteadyStateThread was determined to be thread 23 based on the earlier "threadcontext -a" output. It's unclear why it failed to find it. This is a new test (and feature).
Comments
This fix is integrated in jdk-19+8-427.
03-02-2022

Changeset: 5080e815
Author: Chris Plummer <cjplummer@openjdk.org>
Date: 2022-02-01 15:59:35 +0000
URL: https://git.openjdk.java.net/jdk/commit/5080e815b4385751734054b5f889c4d89cfcdeb4
01-02-2022

A pull request was submitted for review.
URL: https://git.openjdk.java.net/jdk/pull/7259
Date: 2022-01-28 07:34:21 +0000
28-01-2022

One other thing to note, which I can't explain, is that I only see this issue when using -Xcomp plus some other options. The two option sets that reproduce it are:

 -Xcomp -XX:+CreateCoredumpOnCrash -ea -esa -XX:CompileThreshold=100 -XX:+UnlockExperimentalVMOptions -server -XX:+TieredCompilation -XX:+DeoptimizeALot
 -Xcomp -XX:+CreateCoredumpOnCrash -ea -esa -XX:CompileThreshold=100 -XX:+UnlockExperimentalVMOptions -server -XX:+TieredCompilation -XX:+VerifyOops

Just using -Xcomp does not seem to reproduce the issue. Possibly different threads (that are not JavaThreads) get spun up or exit based on these option settings, and that causes the renumbering of some threads.
28-01-2022

My guess is that every time you re-attach, the ID can change. The ID being used is what is known by MS as the "engine thread id". There is also the "system thread id", which is what SA stashes away as the sysId when first getting the list of threads. sysId is the same as OSThread::_thread_id. SA then maps the sysId to an "id", which is what MS calls the "engine thread id", by calling the MS GetThreadIdBySystemId API:

 class WindbgAMD64Thread implements ThreadProxy {
   private WindbgDebugger debugger;
   private long sysId;     // SystemID for Windows thread, stored in OSThread::_thread_id
   private boolean gotID;
   private long id;        // ThreadID for Windows thread, returned by GetThreadIdBySystemId

This is the id that gets displayed when dumping threads, and the one that is specified when using "threadcontext <id>". The following web page discusses the "engine thread id":

 https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/controlling-threads-and-processes

Although it's not clearly stated, it seems that the "engine thread id" might not be the same each time the application is stopped by the debugger.

It's hard to write clhsdb test code that attaches, executes a clhsdb command, looks at the output, and then issues another clhsdb command based on what is seen in the output. The way all the tests currently work is that they attach, run a clhsdb command, detach, and return the output. This is all done by making a single call into ClhsdbLauncher. So when multiple clhsdb commands are executed, the state of the target process changes in between.

My suggestion is to just not bother with the "threadcontext <tid>" testing when run on Windows. I'm not seeing this issue on any other platform.
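
The suggestion above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the change that was integrated: the class ThreadContextCommands and the method commandsFor are hypothetical names, and the platform check uses the standard "os.name" system property rather than any test-library helper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ThreadContextCommands {
    // Hypothetical helper (not part of the actual test): builds the list of
    // clhsdb commands to run, leaving out the per-thread lookup on Windows,
    // where the id seen in earlier output may no longer be valid.
    static List<String> commandsFor(String osName, String steadyStateTid) {
        List<String> cmds = new ArrayList<>();
        cmds.add("threadcontext -a");
        cmds.add("threadcontext -a -v");
        boolean windows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        if (!windows && steadyStateTid != null) {
            cmds.add("threadcontext -v " + steadyStateTid);
        }
        return cmds;
    }

    public static void main(String[] args) {
        // "23" is the id from the failing run described in this report.
        System.out.println(commandsFor(System.getProperty("os.name"), "23"));
    }
}
```

With this shape, the "-a" and "-a -v" variants are still exercised on every platform, and only the id-based lookup is dropped where the id is suspected to be unstable.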
27-01-2022

Below is first the list of IDs printed by "threadcontext -a", followed by the list that "threadcontext <id>" iterated over (I added the spaces to make it easier to see the difference):

 4           11 12 13 14 15 16 17 18 19 20 22 23
 1  8  9 10  11 12 13 14 15 16 17    19 20

So there have been changes. 8, 9, and 10 are added and 18, 22, and 23 have been removed. I then modified the test to dump the name and ID of every thread when the error happens. This is what I got:

 Thread "main" id=1 Address=0x000001dc2c238690
 Thread "Reference Handler" id=8 Address=0x000001dc4b48a0f0
 Thread "Finalizer" id=9 Address=0x000001dc4b48e460
 Thread "Signal Dispatcher" id=10 Address=0x000001dc4b4c7600
 Thread "Attach Listener" id=11 Address=0x000001dc4b4c8fc0
 Thread "Service Thread" id=12 Address=0x000001dc4b4ca280
 Thread "Monitor Deflation Thread" id=13 Address=0x000001dc4b4ceb00
 Thread "C2 CompilerThread0" id=14 Address=0x000001dc4b4d28d0
 Thread "C1 CompilerThread0" id=15 Address=0x000001dc4b4d9d70
 Thread "Sweeper thread" id=16 Address=0x000001dc4b4e4be0
 Thread "Notification Thread" id=17 Address=0x000001dc508fdd20
 Thread "Common-Cleaner" id=19 Address=0x000001dc51c0dfa0
 Thread "SteadyStateThread" id=20 Address=0x000001dc51749420

When I compare this list with what got printed by "threadcontext -a", I can see that the list is the same, but the IDs have changed. It's the same set of threads and in the same order, but with different IDs.
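
The comparison described above can be automated with a small parser. The following is a minimal sketch, assuming the listing format shown in this comment (Thread "name" id=N Address=0x...); the class ThreadListingDiff and its methods are hypothetical, and the "before" sample reuses the addresses from the later listing purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThreadListingDiff {
    // Matches lines of the form: Thread "name" id=N Address=0x...
    private static final Pattern LINE =
        Pattern.compile("Thread \"([^\"]+)\" id=(\\d+) Address=0x[0-9a-fA-F]+");

    // Returns {name, id} pairs in the order they appear in the output.
    static List<String[]> parse(String output) {
        List<String[]> entries = new ArrayList<>();
        Matcher m = LINE.matcher(output);
        while (m.find()) {
            entries.add(new String[] { m.group(1), m.group(2) });
        }
        return entries;
    }

    // True when both listings name the same threads in the same order
    // but at least one thread is reported under a different id.
    static boolean sameThreadsDifferentIds(List<String[]> a, List<String[]> b) {
        if (a.size() != b.size()) {
            return false;
        }
        boolean idChanged = false;
        for (int i = 0; i < a.size(); i++) {
            if (!a.get(i)[0].equals(b.get(i)[0])) {
                return false;
            }
            if (!a.get(i)[1].equals(b.get(i)[1])) {
                idChanged = true;
            }
        }
        return idChanged;
    }

    public static void main(String[] args) {
        // Illustrative two-thread excerpts; the ids follow the renumbering
        // reported in this bug (22 -> 19, 23 -> 20).
        String before =
            "Thread \"Common-Cleaner\" id=22 Address=0x000001dc51c0dfa0\n" +
            "Thread \"SteadyStateThread\" id=23 Address=0x000001dc51749420\n";
        String after =
            "Thread \"Common-Cleaner\" id=19 Address=0x000001dc51c0dfa0\n" +
            "Thread \"SteadyStateThread\" id=20 Address=0x000001dc51749420\n";
        System.out.println(sameThreadsDifferentIds(parse(before), parse(after)));
    }
}
```

Matching on the thread name rather than the id is what makes the renumbering visible here, since the name is stable across attaches while the id apparently is not.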
27-01-2022

The same SA code is used for threadcontext whether you specify -a or a tid:

 String id = t.nextToken();
 Threads threads = VM.getVM().getThreads();
 boolean all = id.equals("-a");
 for (int i = 0; i < threads.getNumberOfThreads(); i++) {
     JavaThread thread = threads.getJavaThreadAt(i);
     ByteArrayOutputStream bos = new ByteArrayOutputStream();
     thread.printThreadIDOn(new PrintStream(bos));
     if (all || bos.toString().equals(id)) {
         out.format("Thread \"%s\" id=%s Address=%s\n", thread.getThreadName(), bos.toString(), thread.getAddress());
         thread.printThreadContextOn(out, verbose);
         out.println(" ");
         if (!all) return;
     }
 }
 if (!all) {
     out.println("Couldn't find thread \"" + id + "\"");
 }

The test executes threadcontext 3 times:

 threadcontext -a
 threadcontext -a -v
 threadcontext <tid>

<tid> is the tid of the SteadyStateThread, which is gleaned from the output of the "threadcontext -a -v" command. Since the target VM is allowed to run between each command, it's possible that the same set of threads might not be printed each time, but the SteadyStateThread should not be exiting.

I modified the above code to always print out the current thread it is looking at. In a passing case for the "threadcontext <tid>" command, you will see something like:

 hsdb> + threadcontext -v 23
 4 11 12 13 14 15 16 17 18 19 20 22 23
 Thread "SteadyStateThread" id=23 Address=0x0000019c1d845ec0

In one failing case I looked at I saw:

 hsdb> + threadcontext -v 23
 1 8 9 10 11 12 13 14 15 16 17 19 20
 Couldn't find thread "23"

Note that thread 22 is also missing. It is the Common-Cleaner thread and was included in the previous "threadcontext -a -v" output. I don't think either of these threads has actually gone away, so it looks like for some reason either VM.getVM().getThreads() is not returning all the threads, or threads.getNumberOfThreads() is not including the full count of threads.
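
To make the failure mode of the lookup loop concrete, here is a simplified stand-in that does not use any SA classes. It is a sketch under assumptions: ThreadLookupSketch and lookup are hypothetical names, and each thread is modeled as a {name, id} pair. It only shows that an exact string match on the id fails once the ids reported for the same threads differ from the ones seen earlier.

```java
import java.util.List;

public class ThreadLookupSketch {
    // Simplified model of the id-matching loop: each element is {name, id},
    // where id is whatever the debugger reports at the time of the lookup.
    static String lookup(List<String[]> threads, String id) {
        for (String[] t : threads) {
            if (t[1].equals(id)) {
                return "Thread \"" + t[0] + "\" id=" + t[1];
            }
        }
        return "Couldn't find thread \"" + id + "\"";
    }

    public static void main(String[] args) {
        // Ids as seen in the passing and failing runs quoted in this report.
        List<String[]> passing = List.of(
            new String[] { "Common-Cleaner", "22" },
            new String[] { "SteadyStateThread", "23" });
        List<String[]> failing = List.of(
            new String[] { "Common-Cleaner", "19" },
            new String[] { "SteadyStateThread", "20" });

        System.out.println(lookup(passing, "23"));
        System.out.println(lookup(failing, "23"));
    }
}
```

In this model the thread is still present in the second list; only the key used to find it has gone stale.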
27-01-2022