JDK-8189497 : Improve docker container detection and resource configuration usage
  • Type: CSR
  • Component: hotspot
  • Sub-Component: runtime
  • Priority: P3
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 10
  • Submitted: 2017-10-17
  • Updated: 2017-11-17
  • Resolved: 2017-11-16
Description
Summary
-------

When available, on Linux default to using the cgroup proc file system to determine CPU and memory configuration information in HotSpot; provide a flag to revert to the previous behavior of using traditional system calls. Add a cross-platform flag to set the active processor count and other supporting flags.

Problem
-------

Container support in Java needs improvement.

Solution
--------

This change adds the following JVM options: -XX:ActiveProcessorCount, -XX:-UseContainerSupport and -Xlog:os+container. It also deprecates the experimental -XX:+UseCGroupMemoryLimitForHeap option, since the new cgroup detection logic subsumes its functionality.

The -XX:-UseContainerSupport option will disable the new functionality that extracts cpu and memory information from the cgroup file system and revert the behavior back to using Linux system calls for this information.

The -XX:ActiveProcessorCount option allows a user to override the number of processors the VM will use when creating threads for various subsystems. This option is available on all currently supported operating systems.

Finally, this change adds logging (-Xlog:os+container) to help determine which decisions are being made based on the contents of the cgroup file system files.
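To make the memory side of this concrete, here is a minimal Python sketch of how a cgroup memory limit might be reconciled with the host's physical memory. It is an illustration only, not the HotSpot implementation: the function name `effective_memory_limit` and the rule "treat a missing, non-positive, or larger-than-host limit as no limit" are assumptions made for this example.

```python
def effective_memory_limit(cgroup_limit_bytes, host_memory_bytes):
    """Return the memory size a runtime could plan around.

    Assumed policy (hypothetical, for illustration): a cgroup limit that is
    missing, non-positive, or larger than physical memory imposes no
    constraint, so the host value is used instead.
    """
    if cgroup_limit_bytes is None or cgroup_limit_bytes <= 0:
        return host_memory_bytes
    return min(cgroup_limit_bytes, host_memory_bytes)


GIB = 2 ** 30
MIB = 2 ** 20

# A very large value such as 9223372036854775807 is effectively "unlimited".
unlimited = effective_memory_limit(9223372036854775807, 8 * GIB)
# A container restricted to 512 MiB on an 8 GiB host.
restricted = effective_memory_limit(512 * MIB, 8 * GIB)
```

With this policy, `unlimited` falls back to the 8 GiB host value and `restricted` is the 512 MiB cgroup limit.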

Specification
-------------

Add the following HotSpot VM command line options.

-XX:-UseContainerSupport (Linux only)

This flag will allow the new container support to be disabled in order to revert to the previous behavior.

-Xlog:os+container=logLevel

The container tag is being added to -Xlog:os in order to allow the VM to report container 
information.

-XX:ActiveProcessorCount=xx (Supported on all platforms)

This new flag allows a developer to manually specify how many processors a JVM running in a container (or on a host) will use.

Deprecate the following experimental option in JDK 10 and remove it in JDK 11.  
This feature is obsoleted by the new Container support.

-XX:+UseCGroupMemoryLimitForHeap (Linux only)

The full set of changes can be viewed here:

http://cr.openjdk.java.net/~bobv/8146115/webrev.03
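As a usage illustration, the options in this specification can be combined on a single command line. The Python sketch below assembles such an argument list; the helper `jvm_args` and its parameter names are hypothetical and exist only for this example, while the flag spellings are the ones listed in this CSR.

```python
def jvm_args(active_processors=None, container_support=True, container_log_level=None):
    """Build a `java` argument list from the options named in this CSR.

    This is a hypothetical convenience helper, not part of the JDK.
    """
    args = ["java"]
    if not container_support:
        # Linux only: disable the cgroup-based detection.
        args.append("-XX:-UseContainerSupport")
    if active_processors is not None:
        # Supported on all platforms: override the processor count.
        args.append(f"-XX:ActiveProcessorCount={active_processors}")
    if container_log_level is not None:
        # Report container information through unified logging.
        args.append(f"-Xlog:os+container={container_log_level}")
    return args


command = jvm_args(active_processors=2, container_log_level="trace") + ["-version"]
print(" ".join(command))
# prints: java -XX:ActiveProcessorCount=2 -Xlog:os+container=trace -version
```

The resulting list can be handed to `subprocess.run` on a machine with a JDK 10 or later `java` on the path.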


Comments
In response to the previous comment, I enhanced the summary and solution sections. I thought that the specification section was supposed to document concise user visible changes which are further detailed in the solution section. As a result, I only listed the options that are changing and a brief description. Please let me know if I need to copy some of the contents of the solution into the specification section.
17-11-2017

Joe: there is no command-line help for -XX options. There is a single short-line summary that can be seen using the develop flag -XX:+PrintFlagsWithComments. Of course there will be a release note as well.
16-11-2017

David, that is fine in the solution section. The specification section, which presumably provides the text the user would see as help on the command line, currently says "-XX:-UseContainerSupport (Linux only) This flag will allow the new container support to be disabled in order to revert to the previous behavior." which I find uninformative as to what the flag actually does.
16-11-2017

Joe: the solution section already states "The -XX:-UseContainerSupport option will disable the new functionality that extracts cpu and memory information from the cgroup file system and revert the behavior back to using Linux system calls for this information."
16-11-2017

Provided more concise summary. Repeating the comment from 2017-10-30 "From the information provided, it is not clear what -XX:-UseContainerSupport actually does, meaning, what semantics does it have? Please add this information before the request is finalized." Moving to approved on the condition that more informative description of UseContainerSupport be used, something like "if true, determine memory and CPU information from cgroups; otherwise, traditional system calls."
16-11-2017

Setting Interface kind and Scope fields.
31-10-2017

Please change the Summary and Solution sections to be more informative and stand-alone. For example, for the summary something like "Add a cross-platform flag to set the number of processors for the JVM to use and add Linux-specific flags to enable container support..." From the information provided, it is not clear what -XX:-UseContainerSupport actually does, meaning, what semantics does it have? Please add this information before the request is finalized. Moving to provisional state.
31-10-2017

Requesting troubleshooting guide as well as release notes since we are proposing changing the default setting which will potentially change heap allocation and GC/Compiler thread counts for existing applications. For users with accurate cgroup settings this should be an improvement. The new flag allows reverting to the old behavior, other flags allow more selective tuning, such as ActiveProcessorCount. Thank you for testing with a variety of systems with and without containers and with a variety of accurate and inaccurate group settings and with a variety of GCs and other GC flag settings.
27-10-2017

Okay - I withdraw my concerns. Let's proceed. Thanks.
27-10-2017

I have attempted to implement a very defensive solution that will revert to the previous value if any of the files or subsystems are not found where they are expected. I have found Linux systems that have varying mount points, different names for the cpu and cpuacct subsystems, and systems that don't have quota or period files at all but support shares. As you've seen with your testing, I've covered all of the cases you've seen. Given the increased usage of docker in the enterprise, I think this feature should be enabled by default so Java can configure itself properly. Also, given the increased number of Java releases we will have each year, we'll have more opportunities to react to feedback and improve this support. If we don't enable this support, it will only get limited use.
25-10-2017
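The defensive fallback described in the comment above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the code in the webrev: the helpers `read_int` and `cpu_limit` are hypothetical, the sketch derives a count only from the quota and period values, and it ignores cpu shares entirely.

```python
from pathlib import Path


def read_int(path):
    """Return the integer stored in a file, or None if it cannot be read or parsed."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def cpu_limit(host_cpus, quota_us, period_us):
    """Cap host_cpus by ceil(quota / period) when both values are usable.

    Assumed policy: a missing file (None) or a non-positive value such as
    the -1 "no limit" quota leaves the previous value, host_cpus, unchanged.
    """
    if quota_us is None or period_us is None or quota_us <= 0 or period_us <= 0:
        return host_cpus
    quota_cpus = -(-quota_us // period_us)  # integer ceiling division
    return max(1, min(host_cpus, quota_cpus))
```

For example, `cpu_limit(4, read_int(quota_path), read_int(period_path))` stays at 4 when either file is absent, which mirrors the "revert to the previous value" behaviour described in the comment.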

The concern is that we don't know how cgroups may be set up, whether they are set up properly, or how they will affect execution of the VM out-of-the-box. I ran through JPRT with logging enabled and found a range of different "default" settings for our systems - none of which are actively configured to use cgroups:

- no cgroup filesystem at all

- no cpu info

    [0.001s][trace][os,container] OSContainer::init: Initializing Container Support
    [0.002s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory//memory.limit_in_bytes
    [0.002s][trace][os,container] Memory Limit is: 9223372036854775807
    [0.002s][trace][os,container] Memory Limit is: Unlimited
    [0.002s][trace][os ] active_processor_count: using static path - configured processors: 4
    [0.002s][trace][os ] active_processor_count: sched_getaffinity processor count: 4
    [0.002s][debug][os,container] Error reading /cpu.shares
    [0.002s][debug][os,container] Error reading /cpu.cfs_quota_us
    [0.002s][debug][os,container] Error reading /cpu.cfs_period_us
    [0.002s][trace][os,container] OSContainer::active_processor_count: 4
    [0.002s][trace][os ] active_processor_count: determined by OSContainer: 4

- cpu share value only

    [0.002s][trace][os,container] OSContainer::init: Initializing Container Support
    [0.003s][trace][os,container] Path to /memory.limit_in_bytes is /cgroup/memory//memory.limit_in_bytes
    [0.003s][trace][os,container] Memory Limit is: 9223372036854775807
    [0.004s][trace][os,container] Memory Limit is: Unlimited
    [0.004s][trace][os ] active_processor_count: using static path - configured processors: 4
    [0.004s][trace][os ] active_processor_count: sched_getaffinity processor count: 4
    [0.004s][trace][os,container] Path to /cpu.shares is /cgroup/cpu//cpu.shares
    [0.005s][trace][os,container] CPU Shares is: 1024
    [0.005s][trace][os,container] Path to /cpu.cfs_quota_us is /cgroup/cpu//cpu.cfs_quota_us
    [0.005s][debug][os,container] file not found /cgroup/cpu//cpu.cfs_quota_us
    [0.005s][debug][os,container] Error reading /cpu.cfs_quota_us
    [0.005s][trace][os,container] Path to /cpu.cfs_period_us is /cgroup/cpu//cpu.cfs_period_us
    [0.006s][debug][os,container] file not found /cgroup/cpu//cpu.cfs_period_us
    [0.006s][debug][os,container] Error reading /cpu.cfs_period_us
    [0.006s][trace][os,container] OSContainer::active_processor_count: 4
    [0.006s][trace][os ] active_processor_count: determined by OSContainer: 4

- all cpu values with a "no op" setting

    [0.001s][trace][os,container] OSContainer::init: Initializing Container Support
    [0.001s][trace][os,container] Path to /memory.limit_in_bytes is /cgroup/memory//memory.limit_in_bytes
    [0.001s][trace][os,container] Memory Limit is: 9223372036854775807
    [0.001s][trace][os,container] Memory Limit is: Unlimited
    [0.001s][trace][os ] active_processor_count: using static path - configured processors: 4
    [0.001s][trace][os ] active_processor_count: sched_getaffinity processor count: 4
    [0.001s][trace][os,container] Path to /cpu.shares is /cgroup/cpu//cpu.shares
    [0.001s][trace][os,container] CPU Shares is: 1024
    [0.001s][trace][os,container] Path to /cpu.cfs_quota_us is /cgroup/cpu//cpu.cfs_quota_us
    [0.001s][trace][os,container] CPU Quota is: -1
    [0.001s][trace][os,container] Path to /cpu.cfs_period_us is /cgroup/cpu//cpu.cfs_period_us
    [0.001s][trace][os,container] CPU Period is: 100000
    [0.001s][trace][os,container] OSContainer::active_processor_count: 4
    [0.001s][trace][os ] active_processor_count: determined by OSContainer: 4

The end result was fine in all of the above, but highlights how differently things can be configured on systems. I simply don't trust that things are always going to be set up in a way that will work right out-of-the-box. Opt-in is always the safest and most conservative initial approach. For example you have to opt-in to UseNUMA.
24-10-2017
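Log statements like the ones quoted in the comment above follow a regular `[uptime][level][tags] message` shape, so they can be split mechanically when comparing systems. The Python sketch below is one possible way to do that; the regular expression and the helper name `parse_log_line` are assumptions for this example and cover only the decorations that appear in the quoted output.

```python
import re

# Matches lines such as "[0.002s][debug][os,container] Error reading /cpu.shares".
# Tag lists may carry padding, as in "[os ]", so whitespace is stripped later.
LOG_LINE = re.compile(
    r"^\[(?P<uptime>\d+(?:\.\d+)?)s\]"
    r"\[(?P<level>\w+)\s*\]"
    r"\[(?P<tags>[^\]]*)\]\s*"
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return (uptime_seconds, level, tags, message), or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    tags = tuple(tag.strip() for tag in match.group("tags").split(","))
    return float(match.group("uptime")), match.group("level"), tags, match.group("message")


def container_messages(lines):
    """Keep only the messages whose tag set includes 'container'."""
    parsed = (parse_log_line(line) for line in lines)
    return [entry[3] for entry in parsed if entry is not None and "container" in entry[2]]
```

Filtering on the `container` tag separates the `OSContainer` decisions from the generic `os` messages about `sched_getaffinity`.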

Can you please provide more specific concerns? I am picking up the cgroup configuration from the active /proc/self/mountinfo and /proc/self/cgroup files. If cgroups is enabled, these files will provide up-to-date mount point information, which gives me accurate locations of the subsystem files. In addition to docker, there are several utilities that can create and alter groups (numactl, cgcreate, cgset, cgexec, etc.) but they all update files available through my lookup mechanism. Take a look at "man cgconfig.conf" to see how an administrator configures the default host cgroup file system. This conf file is of course only useful if cgroups has been initiated at boot up.
24-10-2017
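The mount point lookup mentioned in the comment above can be illustrated with a short Python sketch that scans text in the /proc/self/mountinfo format for cgroup (v1) mounts. It is a simplified illustration, not the HotSpot parser: the function `cgroup_mounts`, the `CONTROLLERS` set, and the sample lines are assumptions for this example, and the follow-up lookup in /proc/self/cgroup is not covered.

```python
# Controllers of interest for CPU and memory detection (assumed set for this sketch).
CONTROLLERS = {"memory", "cpu", "cpuacct", "cpuset"}


def cgroup_mounts(mountinfo_text):
    """Map controller names to mount points found in mountinfo-formatted text.

    Each line has the form
      <id> <parent> <major:minor> <root> <mount point> <options> [optional...] - <fstype> <source> <super options>
    and a cgroup v1 mount names its controllers in the super options.
    """
    mounts = {}
    for line in mountinfo_text.splitlines():
        left, separator, right = line.partition(" - ")
        if not separator:
            continue  # malformed line: skip rather than fail
        fields, fs_fields = left.split(), right.split()
        if len(fields) < 5 or len(fs_fields) < 3 or fs_fields[0] != "cgroup":
            continue
        for option in fs_fields[2].split(","):
            if option in CONTROLLERS:
                mounts[option] = fields[4]
    return mounts


SAMPLE = """\
21 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw
25 20 0:21 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:8 - cgroup cgroup rw,memory
26 20 0:22 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:9 - cgroup cgroup rw,cpu,cpuacct
"""

mounts = cgroup_mounts(SAMPLE)
```

Because the mount point is read from the file rather than assumed, the same code handles layouts such as /cgroup/memory and /sys/fs/cgroup/memory, both of which appear in this discussion.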

I still have concerns with this being enabled by default. It's unclear to me how cgroups may be set up at a system level.
22-10-2017