JDK-8132785 : java/lang/management/ThreadMXBean/ThreadLists.java fails intermittently
  • Type: Bug
  • Component: core-svc
  • Sub-Component: java.lang.management
  • Affected Versions: 9, 11, 13, 18, 19
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2015-07-31
  • Updated: 2021-12-20
  • Resolved: 2021-12-13
Fix Versions:
  • JDK 18: b28 (Fixed)
  • JDK 19: Fixed
Related Reports
Relates :  
Sub Tasks
JDK-8278521 :  
Description
java/lang/management/ThreadMXBean/ThreadLists.java fails intermittently:

----------System.out:(4/127)----------
ThreadGroup: 8 active thread(s)
Thread: 8 stack trace(s) returned
ThreadMXBean: 7 live threads(s)
ThreadMXBean: 7 thread Id(s)
----------System.err:(12/626)----------
java.lang.RuntimeException: inconsistent results
	at ThreadLists.main(ThreadLists.java:67)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:502)
	at com.sun.javatest.regtest.agent.MainActionHelper$SameVMRunnable.run(MainActionHelper.java:218)
	at java.lang.Thread.run(Thread.java:745)

JavaTest Message: Test threw exception: java.lang.RuntimeException
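For context, the test's failing check compares independent views of the JVM's live thread set: the root `ThreadGroup`, `Thread.getAllStackTraces()`, and `ThreadMXBean`. A minimal standalone sketch of that kind of comparison (not the actual test source; the class name and output format here are illustrative) looks like:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Map;

public class ThreadListsSketch {
    public static void main(String[] args) {
        // Walk up to the root ThreadGroup
        ThreadGroup top = Thread.currentThread().getThreadGroup();
        for (ThreadGroup p = top.getParent(); p != null; p = p.getParent()) {
            top = p;
        }

        // Independent views of the live thread set; none of these are
        // atomic with respect to each other, so threads starting or
        // terminating between the calls can make the counts disagree.
        int groupCount = top.activeCount();                       // estimate only
        Map<Thread, StackTraceElement[]> stacks = Thread.getAllStackTraces();
        ThreadMXBean mbean = ManagementFactory.getThreadMXBean();
        int mbeanCount = mbean.getThreadCount();
        long[] ids = mbean.getAllThreadIds();

        System.out.println("ThreadGroup: " + groupCount + " active thread(s)");
        System.out.println("Thread: " + stacks.size() + " stack trace(s) returned");
        System.out.println("ThreadMXBean: " + mbeanCount + " live thread(s)");
        System.out.println("ThreadMXBean: " + ids.length + " thread Id(s)");
    }
}
```

The intermittent failure is exactly the race these comments describe: a thread appearing or disappearing between any two of the snapshots above.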
Comments
Changeset: c93b24d8
Author: David Holmes <dholmes@openjdk.org>
Date: 2021-12-13 21:37:51 +0000
URL: https://git.openjdk.java.net/jdk18/commit/c93b24d85289167639e9ec4b79bd85403687161b
13-12-2021

Failures stopped happening for a while but I finally got this one:

Starting Test
Initial set of enumerated threads:
- Thread: Reference Handler
- Thread: Finalizer
- Thread: Signal Dispatcher
- Thread: Notification Thread
- Thread: process reaper
- Thread: main
- Thread: pool-1-thread-1
- Thread: JFR Periodic Tasks
- Thread: MyThread
- Thread: AgentVMThread
- Thread: Common-Cleaner
ThreadGroup: 11 active thread(s)
Thread: 12 stack trace(s) returned
ThreadMXBean: 12 live threads(s)
ThreadMXBean: 12 thread Id(s)
Final set of enumerated threads:
- Thread: Reference Handler
- Thread: Finalizer
- Thread: Signal Dispatcher
- Thread: Notification Thread
- Thread: process reaper
- Thread: main
- Thread: pool-1-thread-1
- Thread: JFR Periodic Tasks
- Thread: MyThread
- Thread: AgentVMThread
- Thread: Common-Cleaner

The bad news is that the final set of threads enumerated is the same as the first, which means either the mystery thread is not visible via the TG hierarchy, or it was created but then terminated before we could sample it. The good news is that we now see 12 threads in total, with the extra thread being "MyThread" - and that is not a thread used by this test, so we have direct evidence that other tests are affecting the execution of this one. So reiterating the simple fix: run in othervm mode.
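The simple fix reiterated here - running the test in othervm mode so its thread set is not polluted by earlier tests sharing the agent VM - amounts to a jtreg action tag along these lines (illustrative only; the `@summary` wording is assumed, and the linked changeset contains the actual change):

```java
/*
 * @test
 * @summary Verify that the thread counts reported by ThreadGroup,
 *          Thread.getAllStackTraces and ThreadMXBean are consistent
 * @run main/othervm ThreadLists
 */
```

With `main/othervm` the test runs in a freshly launched JVM, so the only live threads are those the JVM itself starts plus the test's own.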
13-12-2021

I suspect it is more about what other tests have run, but it unclear - the typical problem this test has had in the past is threads terminating in the background. It is unusual to see cases where unexpected thread creation is occurring. I'm currently running an expanded check to print out the threads when we fail to see if I can trap it that way. Regardless the fix seems simple enough.
10-12-2021

I can't explain the JFR thread, but do you think the jtreg test infrastructure might itself be playing some role here, since it too creates some threads? The jtreg agent VM framework seems to use a specific thread group [1]. Maybe printing the thread group name and the already collected stack traces (in the "stackTraces" variable in that test case) might give some hint on where these threads are coming from. I also wonder whether printing "top.activeCount()" after these checks in the test would show an increase in the active count, or whether it would still be 10.

[1] https://github.com/openjdk/jtreg/blob/81651f4bb3f00f0f353372f75357f93bd539fd5e/src/share/classes/com/sun/javatest/regtest/agent/MainActionHelper.java#L189
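The diagnostic suggested here - printing each thread's name, its thread group, and the collected stack traces - can be sketched independently of the test (class name and output format are illustrative, not from the actual test):

```java
import java.util.Map;

public class ThreadDumpSketch {
    public static void main(String[] args) {
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            // getThreadGroup() returns null once a thread has terminated,
            // which is itself a hint when chasing disappearing threads
            ThreadGroup g = t.getThreadGroup();
            System.out.println("- Thread: " + t.getName()
                    + " (group: " + (g == null ? "<terminated>" : g.getName()) + ")");
            for (StackTraceElement frame : e.getValue()) {
                System.out.println("    at " + frame);
            }
        }
    }
}
```

Printing the group name would distinguish jtreg agent-VM threads (which live in the agent's own thread group) from threads created by the test or the JVM itself.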
10-12-2021

A failing case showed up:

Thread: Reference Handler
Thread: Finalizer
Thread: Signal Dispatcher
Thread: Notification Thread
Thread: process reaper
Thread: main
Thread: pool-1-thread-1
Thread: JFR Periodic Tasks
Thread: AgentVMThread
Thread: Common-Cleaner
ThreadGroup: 10 active thread(s)
Thread: 11 stack trace(s) returned
ThreadMXBean: 11 live threads(s)
ThreadMXBean: 11 thread Id(s)

No clue to the new 11th thread, but now we have the JFR Periodic Tasks thread present as well! This test doesn't use JFR, so I have to assume that is an artifact of running other tests in the same agent VM.
10-12-2021

No failures reproducing so far, but I did spot a further change:

Thread: Reference Handler
Thread: Finalizer
Thread: Signal Dispatcher
Thread: Notification Thread
Thread: process reaper
Thread: main
Thread: pool-1-thread-1
Thread: AgentVMThread
Thread: Common-Cleaner

so now we have nine threads due to the "process reaper". I think the simplest/safest thing to do with this test, and potentially similar ones, is to run them as othervm so that the thread set will always be consistent.
10-12-2021

When I run the test locally I see 7 threads:

Thread: Reference Handler
Thread: Finalizer
Thread: Signal Dispatcher
Thread: Notification Thread
Thread: main
Thread: MainThread
Thread: Common-Cleaner

but when I run in mach5 I see 8 threads:

Thread: Reference Handler
Thread: Finalizer
Thread: Signal Dispatcher
Thread: Notification Thread
Thread: main
Thread: pool-1-thread-1
Thread: AgentVMThread
Thread: Common-Cleaner

This difference is due to running jtreg in othervm mode by default locally, but using agentvm mode when run in mach5. I ran just this test in the tier3 configuration multiple times and saw zero failures - but I also only ever saw 8 threads, not the 10/11 that cause the test failure. So it appears that, because of agentvm mode, this test can be affected by whatever tests have run before it (though you would expect threads to disappear rather than get started as the test progresses). I will re-run the augmented test in the normal tier3 mode to try to gather more data.
10-12-2021

I've assigned to myself to investigate. Updating the test is tricky because it appears a new thread is created after we calculate activeCount but before the other checks (or else activeCount is mis-counting?). But if we enumerate all the threads after that check we may now see the extra one, and may be able to deduce which one it is likely to be.
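Enumerating the full thread set after the counting checks, as proposed above, can be sketched as follows. Note that `ThreadGroup.enumerate` is inherently racy and `activeCount()` is only an estimate, so the array is deliberately oversized (class name and padding are illustrative choices, not from the test):

```java
public class EnumerateSketch {
    public static void main(String[] args) {
        // Walk up to the root ThreadGroup
        ThreadGroup top = Thread.currentThread().getThreadGroup();
        for (ThreadGroup p = top.getParent(); p != null; p = p.getParent()) {
            top = p;
        }

        // activeCount() may under-report if threads start concurrently,
        // so oversize the array and rely on enumerate's return value
        Thread[] threads = new Thread[top.activeCount() * 2 + 10];
        int n = top.enumerate(threads, true);   // true = recurse into subgroups
        System.out.println("Enumerated " + n + " thread(s):");
        for (int i = 0; i < n; i++) {
            System.out.println("- Thread: " + threads[i].getName());
        }
    }
}
```

This is the kind of enumeration that produced the "Initial set of enumerated threads" / "Final set of enumerated threads" listings in the later comments; comparing the two sets narrows down which thread appeared or vanished mid-test.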
09-12-2021

It would probably be useful to add a sub-task to update the test to actually display information about the unexpected threads...
09-12-2021

Bumped to P3 due to CI noise.
09-12-2021

This is now a regular recurring failure in our CI testing, but I am not at all sure what changed. I can run this locally without a problem, but the interesting thing is that locally I see:

Starting Test
ThreadGroup: 7 active thread(s)
Thread: 7 stack trace(s) returned
ThreadMXBean: 7 live threads(s)
ThreadMXBean: 7 thread Id(s)

whereas the test always fails with:

Starting Test
ThreadGroup: 10 active thread(s)
Thread: 11 stack trace(s) returned
ThreadMXBean: 11 live threads(s)
ThreadMXBean: 11 thread Id(s)

So what are those four additional threads? I would have to expect there is some lazy thread initialization going on somewhere ... perhaps in relation to the ForkJoin common pool (as I can't think of anywhere else we may be creating threads). But why this fails in the CI but not locally is something of a mystery to me.
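The lazy thread initialization suspected above is easy to demonstrate in principle: touching a lazily-initialized pool changes the live thread count mid-run. This sketch uses the ForkJoin common pool mentioned in the comment purely as an example; whether that pool was actually the culprit here is speculation, not established:

```java
import java.util.concurrent.ForkJoinPool;

public class LazyThreadsSketch {
    public static void main(String[] args) {
        int before = Thread.getAllStackTraces().size();

        // First use of the common pool may lazily spawn worker threads
        // (on a single-CPU machine, parallelism 0, the task can instead
        // run in the caller thread and no workers appear)
        ForkJoinPool.commonPool().submit(() -> {}).join();

        int after = Thread.getAllStackTraces().size();
        System.out.println("Threads before: " + before + ", after: " + after);
    }
}
```

Any test that snapshots the thread set twice with such a lazy-initialization trigger in between can observe exactly the kind of count mismatch this bug reports.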
09-12-2021

This test seems to have started failing again in recent days, but only once in the CI. Not clear if something has changed in test environment that might impact this.
09-12-2021

ILW = M (noise in test results), L (same binary, 1 in 809 runs), H (no) = P4
04-08-2015