JDK-8274661 : Use system default for stack sizes
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 17
  • Priority: P4
  • Status: Closed
  • Resolution: Not an Issue
  • Submitted: 2021-10-01
  • Updated: 2021-10-04
  • Resolved: 2021-10-04
Description
We have three stack-size-related runtime flags:

ThreadStackSize
VMThreadStackSize
CompilerThreadStackSize

that have a "magic" default value (in KB) of 2048 on aarch64 platforms, 1024 on x86_64, and 512 on 32-bit x86.
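
As a rough illustration (not part of the original report), the effective values of these three flags can be inspected on a running JVM. The sketch below assumes a HotSpot-based JVM that exposes com.sun.management.HotSpotDiagnosticMXBean; the class name StackSizeFlags and the helper describe are made up for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, for illustration only.
class StackSizeFlags {
    // The three flags named in the description.
    static final String[] FLAGS = {
        "ThreadStackSize", "VMThreadStackSize", "CompilerThreadStackSize"
    };

    // Returns a one-line summary of the flag, or a note if it cannot be read.
    static String describe(String flag) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return flag + ": diagnostic bean not available on this VM";
            }
            VMOption option = bean.getVMOption(flag);
            // getOrigin() distinguishes a built-in default from a command-line setting.
            return flag + " = " + option.getValue() + " (origin: " + option.getOrigin() + ")";
        } catch (IllegalArgumentException e) {
            // Thrown when the VM does not know the flag (or the bean interface).
            return flag + ": not available on this VM";
        }
    }

    public static void main(String[] args) {
        for (String flag : FLAGS) {
            System.out.println(describe(flag));
        }
    }
}
```

Running it with and without an explicit setting such as -XX:ThreadStackSize=512 should show whether a value comes from the built-in default or from the command line.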

The values were chosen on practical grounds and might have to be updated for new operating systems, platforms, architectures, features, etc.

The current value of 2048 on Linux aarch64 comes from a large page requirement (still trying to nail down more details here): https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2013-July/000092.html

macOS aarch64 doesn't (currently) support large pages, so it would probably be OK with a lower value, but since it was a direct port of Linux aarch64, it inherited those values.

More background:

In https://bugs.openjdk.java.net/browse/JDK-8140520 we had a problem with TestOptionsWithRanges.java, which would crash (on Solaris) with a value of "1". A value of "1" does not really make sense here, but since "0" was allowed as a special value meaning "use system default", the test didn't know any better, tried the next larger number, and crashed. To work around the issue, and since we didn't like "0" as a special value, we bumped up the value to something we thought was reasonable.

According to http://xmlandmore.blogspot.com/2014/09/jdk-8-thread-stack-size-tuning.html, thread stacks are allocated RAM, not just reserved, and they also need to be zeroed out, which takes time, so we should keep them as small as possible.

If large pages are turned on, then we should bump up the sizes on those platforms that support large pages.

Otherwise we should let the OS dictate the system default values.

Hardcoding them, as they are right now, means more maintenance and higher resource usage if they are not tuned (and they do not seem to be tuned).

We should revert to using system defaults, and we should find a mechanism for specifying that, since we are not happy with using "0" as the default value.
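
The proposal above (bump the size only when large pages are on, otherwise defer to the OS, and avoid "0" as the marker) could be sketched roughly as follows. This is an illustration under stated assumptions, not HotSpot code: the class StackSizePolicy, the method chooseStackSizeKb, and the use of an empty OptionalLong to mean "use system default" are all hypothetical, and 2048 is simply the current aarch64 value quoted in the description.

```java
import java.util.OptionalLong;

// Hypothetical policy sketch; none of these names exist in HotSpot.
class StackSizePolicy {
    // Assumed large-page stack size in KB, borrowed from the current aarch64 default.
    static final long LARGE_PAGE_STACK_KB = 2048;

    // Returns an explicit size only when large pages are supported and enabled.
    // An empty result means "let the OS pick", so no magic 0 is needed.
    static OptionalLong chooseStackSizeKb(boolean platformSupportsLargePages,
                                          boolean largePagesEnabled) {
        if (platformSupportsLargePages && largePagesEnabled) {
            return OptionalLong.of(LARGE_PAGE_STACK_KB);
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        System.out.println("large pages on:  " + chooseStackSizeKb(true, true));
        System.out.println("large pages off: " + chooseStackSizeKb(true, false));
        System.out.println("unsupported:     " + chooseStackSizeKb(false, true));
    }
}
```

Representing "system default" as the absence of a value, instead of a number inside the numeric range, is one way to avoid the range-testing problem described under "More background".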


Comments
I think that I agree. Less maintenance is more important in this case than performance. Developers who wish to optimize by using smaller stack sizes are able to do so with the provided runtime flags. The current values are the safest for correct VM operation, which is the highest priority. Is there margin for improvement here? Sure, but we can take that up the next time we need to adjust the values, given the resources available (it would need startup performance measurements).
04-10-2021

What I disliked about zero as "use the default" was that it gave us a continuous range from zero to some maximum, which wrongly implied that any non-zero value was valid, and that was total nonsense. The valid range of those flags should be the disjoint set { 0, platform-specific-min .. max }, but the range checking logic doesn't handle that, IIRC. Whether or not we should use the platform defaults depends on what they are. In the past the defaults were deemed unsuitable and also variable, meaning we would not know exactly what would get used on any given system. I'm not really seeing any bug here that needs fixing. Either we use fixed, known values that may be sub-optimal on some platforms, or we use system defaults that may be sub-optimal and are also indeterminate a priori. So the current scheme scores -1 and the proposed scheme -2, which argues for keeping things as-is.
04-10-2021
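
To make the disjoint-range point in the comment above concrete, here is a minimal sketch of a check that accepts 0 (use the system default) or a value between a platform-specific minimum and a maximum, and nothing in between. It is not the HotSpot range-checking code; the class StackSizeRange, the method isValid, and the bounds used in main are assumptions for illustration.

```java
// Hypothetical validity check for a stack size flag; not HotSpot code.
class StackSizeRange {
    // Special value meaning "use the system default".
    static final long USE_SYSTEM_DEFAULT = 0;

    // Valid values form the disjoint set {0} plus [platformMin, platformMax].
    static boolean isValid(long valueKb, long platformMinKb, long platformMaxKb) {
        if (valueKb == USE_SYSTEM_DEFAULT) {
            return true;
        }
        return valueKb >= platformMinKb && valueKb <= platformMaxKb;
    }

    public static void main(String[] args) {
        long minKb = 64;          // assumed platform minimum, illustration only
        long maxKb = 1024 * 1024; // assumed maximum, illustration only
        long[] candidates = {0, 1, 63, 64, 1024, maxKb, maxKb + 1};
        for (long candidate : candidates) {
            System.out.println(candidate + " KB valid? " + isValid(candidate, minKb, maxKb));
        }
    }
}
```

With a check of this shape, a test that steps from 0 to the "next larger number" would have 1 rejected up front instead of reaching thread creation.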