JDK-8268773 : Improvements related to: Failed to start thread - pthread_create failed (EAGAIN)
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 18
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2021-06-15
  • Updated: 2022-07-18
  • Resolved: 2021-07-16
  • JDK 11: 11.0.16 (Fixed)
  • JDK 17: 17.0.4 (Fixed)
  • JDK 18: 18 b07 (Fixed)
Description
Extracted from https://bugs.openjdk.java.net/browse/JDK-8268605

The current warning message:
[11.028s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 2048k, guardsize: 0k, detached.

is quite unhelpful and does not give an idea which thread failed to be created.
It would be a useful addition to show if this is a GC thread, a compiler thread, an application thread, etc.
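
To make concrete what the current line does and does not carry, here is a minimal Java sketch that parses it. The class name `PthreadCreateWarning` and its field names are hypothetical, and the regular expression assumes exactly the format quoted above. The recoverable fields are the error name, the stack size, the guard size, and the detach state; nothing identifies the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical log-scraping helper; names are illustrative, not part of the JDK. */
final class PthreadCreateWarning {
    // Assumption: the line ends exactly like the warning quoted in this report.
    private static final Pattern FORMAT = Pattern.compile(
        "pthread_create failed \\((\\w+)\\) for attributes: "
        + "stacksize: (\\d+)k, guardsize: (\\d+)k, (\\w+)\\.");

    final String errorName;   // e.g. EAGAIN
    final int stackSizeKb;    // requested stack size in KB
    final int guardSizeKb;    // requested guard size in KB
    final String detachState; // e.g. detached

    private PthreadCreateWarning(String errorName, int stackSizeKb,
                                 int guardSizeKb, String detachState) {
        this.errorName = errorName;
        this.stackSizeKb = stackSizeKb;
        this.guardSizeKb = guardSizeKb;
        this.detachState = detachState;
    }

    /** Returns the parsed fields, or null if the line is not in the assumed format. */
    static PthreadCreateWarning parse(String line) {
        Matcher m = FORMAT.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new PthreadCreateWarning(
            m.group(1),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3)),
            m.group(4));
    }

    public static void main(String[] args) {
        String line = "[11.028s][warning][os,thread] Failed to start thread - "
            + "pthread_create failed (EAGAIN) for attributes: "
            + "stacksize: 2048k, guardsize: 0k, detached.";
        PthreadCreateWarning w = parse(line);
        // Note that no field says which kind of thread failed to start.
        System.out.println(w.errorName + " stack=" + w.stackSizeKb + "k guard="
            + w.guardSizeKb + "k " + w.detachState);
    }
}
```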

> One more thing, do you think it would be reasonable to retry (e.g. 1-2 times) pthread_create() if it returns EAGAIN?

That is not unreasonable, but may not help depending on exactly why the failure is occurring. If we have hit the ulimit process/thread limit for example, then it won't self-correct unless a thread/process has terminated since the first call. So in a tight loop it may just fail continually.
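
The trade-off can be sketched at the application level. This is not the HotSpot change and does not call pthread_create(); it is a hypothetical Java helper (`ThreadStartRetry` and `Starter` are made-up names) that retries a thread-start action a bounded number of times, sleeping between attempts so that another thread or process has a chance to exit. It assumes the failure surfaces as an OutOfMemoryError, as in the reproducer attached to this issue, and that the action creates a fresh Thread on each attempt.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper; not part of the JDK or of the fix for this issue. */
final class ThreadStartRetry {

    /** An action that tries to start a thread; it should create a new Thread per call. */
    interface Starter {
        void start();
    }

    /**
     * Runs starter.start() up to maxAttempts times with exponential backoff.
     * Returns the number of attempts used, or rethrows the last
     * OutOfMemoryError if every attempt fails.
     */
    static int startWithRetry(Starter starter, int maxAttempts, long initialBackoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long backoff = initialBackoffMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                starter.start();
                return attempt;
            } catch (OutOfMemoryError e) {
                if (attempt >= maxAttempts) {
                    throw e; // the limit has not freed up; give up
                }
                try {
                    // Retrying immediately would likely fail again, so wait first.
                    Thread.sleep(backoff);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
                backoff *= 2;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated starter: fails twice, then succeeds.
        AtomicInteger calls = new AtomicInteger();
        int attempts = startWithRetry(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new OutOfMemoryError("simulated: unable to create native thread");
            }
        }, 5, 1);
        System.out.println("started after " + attempts + " attempts");
    }
}
```

If the limit never frees up, the helper still fails after maxAttempts tries, which matches the caveat above: a retry only helps when a thread or process has terminated in the meantime.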
Comments
jdk11u-fix-request On behalf of Basil Crow: Backporting openjdk/jdk@e35005d shows a very significant reduction of the number of occurrences of JENKINS-65873 (0.04% instead of 0.2% over 5000-6000 builds). The patch applies after resolving a trivial merge conflict in thread.hpp and dropping the change to Thread::is_JavaThread_protected in thread.cpp (because Thread::is_JavaThread_protected does not exist on jdk11u-dev).
07-05-2022

A pull request was submitted for review. URL: https://git.openjdk.java.net/jdk11u-dev/pull/1074 Date: 2022-05-07 07:43:08 +0000
07-05-2022

jdk17u-fix-request on behalf of Basil Crow: Backporting https://github.com/openjdk/jdk/commit/e35005d5ce383ddd108096a3079b17cb0bcf76f1 shows a very significant reduction of the number of occurrences of [JENKINS-65873](https://issues.jenkins.io/browse/JENKINS-65873) (0.04% instead of 0.2% over 5000-6000 builds). The patch applies almost cleanly (trivial merge conflict in `thread.hpp`). https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2022-May/014222.html
07-05-2022

A pull request was submitted for review. URL: https://git.openjdk.java.net/jdk17u-dev/pull/390 Date: 2022-05-07 07:07:06 +0000
07-05-2022

Changeset: e35005d5 Author: David Holmes <dholmes@openjdk.org> Date: 2021-07-16 02:49:40 +0000 URL: https://git.openjdk.java.net/jdk/commit/e35005d5ce383ddd108096a3079b17cb0bcf76f1
16-07-2021

After discussion with [~iklam] I'm going to look at adding additional logging at a higher-level where we have more information available for application Java threads.
15-07-2021

public class ThreadExhaustion {
    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    new Thread(() -> {
                        try {
                            Thread.currentThread().join();
                        } catch (InterruptedException e) {
                        }
                    }).start();
                }
            } catch (OutOfMemoryError oome) {
                Thread.currentThread().getThreadGroup().interrupt();
            }
        });
        t.start();
        t.join();
    }
}
02-07-2021

Worst case it's "unknown thread" (not worse than current) but best case it's actually useful information, so sounds good to me.
> when I ran a simple thread exhaustion test
Could you share that test? My current reproducer is unfortunately much more involved, so it would be useful to be able to reproduce it more easily/reliably, and e.g., try out the various Xlog options to see their effect there.
01-07-2021

We have no way to know if the correct name of a thread is available - we simply print thread->name() - which means in many cases it will just print "unknown thread", including for some VM threads. We don't need to pass the name to os::create_thread as we already pass the Thread from where we get the name. I must confess though that when I ran a simple thread exhaustion test, expecting to see my calls to t.start() failing due to this resource problem, the updated warning showed me that it was actually a GC thread that hit the failure. So perhaps there is some value after all. I will put out the PR for this to see what others think.
01-07-2021

Could the name be shown if available? I understand it might be tricky for java.lang.Thread, but those would anyway throw if it fails. But for internal threads it seems quite useful to show their name, and there it sounds like it should always be possible to pass a name as an argument to os::create_thread or so.
30-06-2021

[~bdaloze] it turns out that printing the thread name is also not generally useful, because the main threads of interest do not have their names set until after this code has been executed. The pthread_create is called from os::create_thread which is called e.g. from the JavaThread constructor. So any JavaThreads (including CompilerThreads) will not have a name. Only singleton-instance thread types (VMThread, WatcherThread) and those that explicitly set the name in their constructor (most GC threads) will have a name - and failure to create those threads is either a fatal condition that will abort the VM, or else of little general interest. Given that neither of the "improvements" here is actually likely to be of use, I'm inclined to just close this RFE as "will not fix".
30-06-2021