Hi,
When the creation of an OS thread fails, it is usually very hard to determine why. Depending on the OS, the process may have hit one of several limits (e.g. /proc/sys/kernel/threads-max, RLIMIT_NPROC for the user, a container limit, or one of a few other limits, including memory limits).
Currently we log a warning with the os+thread tags that includes the errno and the requested stack and guard sizes. The output looks like this:
Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
The proposal is to also log the relevant limits in this case. To keep the additional code small, we would reuse the methods that already print these limits to an hs_err* file.
On Linux, which has the most relevant limits and by far the longest output, the result could look like this:
[19.430s][warning][os,thread] Nr of threads approx. running in the VM: 9555
[19.430s][warning][os,thread] rlimit: STACK 8192k, CORE 0k, NPROC 10000, NOFILE 4096, AS infinity, DATA infinity, FSIZE infinity
[19.431s][warning][os,thread] Memory: 4k page, physical 16401984k(8167360k free), swap 2097148k(2097148k free)
[19.431s][warning][os,thread]
[19.431s][warning][os,thread] /proc/sys/kernel/threads-max (system-wide limit on the number of threads):
[19.431s][warning][os,thread] 127667
[19.431s][warning][os,thread]
[19.431s][warning][os,thread]
[19.431s][warning][os,thread] /proc/sys/vm/max_map_count (maximum number of memory map areas a process may have):
[19.431s][warning][os,thread] 65530
[19.431s][warning][os,thread]
[19.431s][warning][os,thread]
[19.431s][warning][os,thread] /proc/sys/kernel/pid_max (system-wide limit on number of process identifiers):
[19.431s][warning][os,thread] 131072
[19.431s][warning][os,thread]
[19.431s][warning][os,thread]
[19.431s][warning][os,thread] container (cgroup) information:
[19.431s][warning][os,thread] container_type: cgroupv1
[19.431s][warning][os,thread] cpu_cpuset_cpus: 0-7
[19.431s][warning][os,thread] cpu_memory_nodes: 0
[19.431s][warning][os,thread] active_processor_count: 8
[19.431s][warning][os,thread] cpu_quota: -1
[19.431s][warning][os,thread] cpu_period: 100000
[19.431s][warning][os,thread] cpu_shares: -1
[19.431s][warning][os,thread] memory_limit_in_bytes: -1
[19.431s][warning][os,thread] memory_and_swap_limit_in_bytes: -2
[19.431s][warning][os,thread] memory_soft_limit_in_bytes: -1
[19.431s][warning][os,thread] memory_usage_in_bytes: 6902476800
[19.431s][warning][os,thread] memory_max_usage_in_bytes: 0
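For comparison, some of the same Linux values can be read from outside the VM. The following is only a rough Python sketch for illustration, not the proposed HotSpot change: the helper names (fmt_limit, read_first_line, thread_count, collect_limits) are made up for this example, it reports the soft rlimit values, and it inspects the process running the script rather than a JVM.

```python
import resource
from pathlib import Path

# The /proc files that appear in the sample output above (Linux only).
PROC_FILES = (
    "/proc/sys/kernel/threads-max",
    "/proc/sys/vm/max_map_count",
    "/proc/sys/kernel/pid_max",
)


def fmt_limit(value, in_kib=False):
    """Format an rlimit value; byte-based limits are shown in KiB."""
    if value == resource.RLIM_INFINITY:
        return "infinity"
    return f"{value // 1024}k" if in_kib else str(value)


def read_first_line(path):
    """Return the first line of a file, or None if it cannot be read."""
    try:
        return Path(path).read_text().splitlines()[0].strip()
    except (OSError, IndexError):
        return None


def thread_count():
    """Thread count of this process from /proc/self/status, or None."""
    try:
        for line in Path("/proc/self/status").read_text().splitlines():
            if line.startswith("Threads:"):
                return int(line.split()[1])
    except (OSError, ValueError, IndexError):
        pass
    return None


def collect_limits():
    """Gather thread-related limits into a dict (soft rlimits only)."""
    report = {"threads": thread_count()}
    for name, in_kib in (("STACK", True), ("NPROC", False),
                         ("NOFILE", False), ("AS", True)):
        res = getattr(resource, "RLIMIT_" + name, None)
        if res is not None:  # not every platform defines every limit
            soft, _hard = resource.getrlimit(res)
            report["rlimit " + name] = fmt_limit(soft, in_kib)
    for path in PROC_FILES:
        report[path] = read_first_line(path)
    return report


if __name__ == "__main__":
    for key, value in collect_limits().items():
        print(f"{key}: {value}")
```

To look at a running JVM instead of the script itself, one would have to read /proc/<pid>/status and /proc/<pid>/limits for that process; this is not shown here.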
Best regards,
Ralf