JDK-8191369 : NMT: Enhance thread stack tracking
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 10
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux,windows
  • Submitted: 2017-11-15
  • Updated: 2020-07-17
  • Resolved: 2018-03-01
JDK 11: 11 b04 (Fixed)
Description
From the very beginning, NMT has tracked a thread stack at thread creation and counted its maximum stack size as both reserved and committed. This served its purpose when NMT was initially developed for detecting memory leaks and similar problems.

With the growing popularity of containers in recent years, people have started using NMT to study JVM memory usage and tune it to fit within a container's memory constraints.

Thread stack tracking has now become a sore point because it overstates memory usage. In some cases it causes NMT to report committed memory > RSS.

This enhancement addresses the issue by querying the actual committed stack sizes at NMT query time.

It addresses the issue on Linux first, as that is the platform most often used to host Docker containers.

For now, the current behavior is preserved on other platforms. It can be implemented for them as well if it proves useful.
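
The comments on this issue refer to a mincore-based mechanism for finding out how much of a stack is actually backed by memory. The following is a minimal sketch of that idea on Linux, not the HotSpot implementation: `resident_bytes` and `demo_stack_resident` are hypothetical names, and an anonymous mmap region stands in for a real thread stack.

```c
#define _DEFAULT_SOURCE /* exposes mincore() and MAP_ANONYMOUS in glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a HotSpot function): number of bytes in
 * [base, base + size) that mincore() reports as resident, or (size_t)-1
 * on error. base must be page aligned and size must be non-zero.
 */
static size_t resident_bytes(void *base, size_t size) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    assert(size > 0 && ((uintptr_t)base % page) == 0);

    size_t pages = (size + page - 1) / page;
    unsigned char *vec = malloc(pages); /* one status byte per page */
    if (vec == NULL) return (size_t)-1;

    size_t resident = (size_t)-1;
    if (mincore(base, size, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < pages; i++) {
            if (vec[i] & 1) resident += page; /* low bit set: page is resident */
        }
    }
    free(vec);
    return resident;
}

/*
 * Stand-in for a thread stack: maps total_pages anonymous pages, writes to
 * the highest touched_pages of them (stacks usually grow downward), and
 * returns the resident byte count, or (size_t)-1 on error.
 */
static size_t demo_stack_resident(size_t total_pages, size_t touched_pages) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = total_pages * page;
    assert(total_pages > 0 && touched_pages <= total_pages);

    char *stack = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (stack == MAP_FAILED) return (size_t)-1;

    /* Writing one byte per page forces the kernel to back that page. */
    for (size_t i = total_pages - touched_pages; i < total_pages; i++) {
        ((volatile char *)stack)[i * page] = 1;
    }

    size_t resident = resident_bytes(stack, len);
    munmap(stack, len);
    return resident;
}
```

Under these assumptions, a call such as `demo_stack_resident(64, 3)` would be expected to report roughly three pages rather than the full 64-page mapping, which is the gap between "maximum stack size" and actual usage that the description refers to.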


 
Comments
Hi Volker, It definitely belongs in a separate RFE, which can be implemented in a similar way. Could you please file a new RFE? Thanks!
09-01-2018

This is actually a problem with ALL native memory tracked by NMT and allocated with mmap(). NMT reports RESERVED and COMMITTED memory. RESERVED corresponds to the virtual size (or "Size" as reported by pmap or in /proc/PID/smaps). However, COMMITTED does NOT correspond to the resident memory reported as "Rss" by pmap (or RSS in top). COMMITTED in the sense of HotSpot/NMT is mmapped memory with a protection other than PROT_NONE. But this does not mean that the corresponding pages are really mapped to physical pages (i.e. appear in the RSS set of tools like pmap/top/ps).

For HotSpot on Linux, uncommitting means remapping that region with PROT_NONE. This allows the kernel to remove the physical pages of the previous mapping (and effectively decreases the RSS of the process). There are actually better ways to do this. We could use either madvise(MADV_DONTNEED) or madvise(MADV_FREE), the latter being available since kernel 4.5. But that's another story...

For this issue I'd like to propose either extending it to cover all native memory tracked by NMT or opening a new issue for that purpose. I think NMT should report three values:
1. the virtual memory it has mapped (i.e. 'Size' in pmap output),
2. the memory it has committed (like now, a subset of 1),
3. the part of the memory which is resident in physical RAM (i.e. 'Rss' in pmap output, a subset of 2).
09-01-2018
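
The distinction drawn in the comment above between reserved, committed, and resident memory can be observed directly on Linux. The following is a minimal sketch, not HotSpot code: `trace_residency`, `resident_pages`, and `struct residency_trace` are hypothetical names, and mincore() is used here only as a convenient way to count resident pages. It reserves a region with PROT_NONE, "commits" it by switching to PROT_READ | PROT_WRITE, touches every page, and then releases the physical pages with madvise(MADV_DONTNEED), one of the alternatives the comment mentions.

```c
#define _DEFAULT_SOURCE /* exposes mincore(), madvise(), MAP_ANONYMOUS in glibc */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Resident pages in [base, base + pages * page_size) per mincore(), or -1. */
static long resident_pages(void *base, size_t pages, size_t page_size) {
    unsigned char *vec = malloc(pages); /* one status byte per page */
    if (vec == NULL) return -1;
    long n = -1;
    if (mincore(base, pages * page_size, vec) == 0) {
        n = 0;
        for (size_t i = 0; i < pages; i++) n += vec[i] & 1;
    }
    free(vec);
    return n;
}

/* Hypothetical record: resident page count observed after each step. */
struct residency_trace {
    long reserved;  /* after mmap(PROT_NONE) */
    long committed; /* after mprotect(PROT_READ | PROT_WRITE), untouched */
    long touched;   /* after writing to every page */
    long released;  /* after madvise(MADV_DONTNEED) */
};

/* Returns 0 on success and fills *out, or -1 if a system call fails. */
static int trace_residency(size_t pages, struct residency_trace *out) {
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = pages * page_size;
    int rc = -1;
    assert(pages > 0 && out != NULL);

    /* Reserve: address space only, no access permitted. */
    char *p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    out->reserved = resident_pages(p, pages, page_size);

    /* "Commit" in the sense described above: protection other than PROT_NONE. */
    if (mprotect(p, len, PROT_READ | PROT_WRITE) != 0) goto done;
    out->committed = resident_pages(p, pages, page_size);

    /* Touch every page so the kernel has to back it with physical memory. */
    for (size_t i = 0; i < pages; i++) ((volatile char *)p)[i * page_size] = 1;
    out->touched = resident_pages(p, pages, page_size);

    /* Drop the physical pages but keep the mapping and its protection. */
    if (madvise(p, len, MADV_DONTNEED) != 0) goto done;
    out->released = resident_pages(p, pages, page_size);
    rc = 0;

done:
    munmap(p, len);
    return rc;
}
```

In this sketch the region counts as committed from the mprotect() call onward, yet pages only become resident once they are written, which is the mismatch between COMMITTED and Rss described above.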

Getting the total summary of committed memory wrong because of this seems to be the biggest issue. I can make the totals wrong in arbitrary ways, for example by specifying -Xss300m on Linux. RSS and NMT's committed total memory will be *very* different.
16-11-2017

AIX seems to have mincore specified the same way as Solaris, so it also seems usable there. https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.basetrf1/mincore.htm
16-11-2017

For OS X/BSD it seems mincore is not usable. For Windows it looks like VirtualQuery may be usable in place of mincore, assuming MEM_RESERVE reports unmapped memory the way we need: https://msdn.microsoft.com/en-us/library/windows/desktop/aa366902(v=vs.85).aspx https://msdn.microsoft.com/en-us/library/windows/desktop/aa366775(v=vs.85).aspx
16-11-2017

The same mincore-based mechanism is usable on Solaris as well: https://docs.oracle.com/cd/E23823_01/html/816-5167/mincore-2.html
16-11-2017