Bug ID: JDK-8255978 [windows] os::release_memory may not release the full range

Type: Bug
Component: hotspot
Sub-Component: runtime
Affected Version: 8,11,15,16

Priority: P2
Status: Closed
Resolution: Fixed
OS: windows

Submitted: 2020-11-06
Updated: 2024-12-20
Resolved: 2020-11-19

JDK 16
16 b26 Fixed

On Windows, os::release_memory(p, size) may not release the whole region if the region contains multiple mappings. This can cause memory bloat, runaway leaks, or errors that look like failed mappings at specific wish addresses.

Background: 

On Windows, memory mappings are established with VirtualAlloc() [1] and released with VirtualFree() [2]. In contrast to POSIX munmap(), VirtualFree() can only release a single full range established with VirtualAlloc(). It cannot release multiple ranges, or parts of a range.

The Windows implementation of os::release_memory(p, size) [3] calls VirtualFree(p, NULL, MEM_RELEASE) - it ignores the size parameter and releases whatever mapping happens to start at p:

```
bool os::pd_release_memory(char* addr, size_t bytes) {
  return VirtualFree(addr, 0, MEM_RELEASE) != 0;
}
```

... which assumes that the given range size corresponds to the size of a mapping starting at p.

This may be incorrect:

1) For NUMA-friendly allocation, we allocate memory in stripes, with each stripe allocated individually.
2) For +UseLargePagesIndividualAllocation we do the same.
3) Apart from that, the given region size may just be wrong. Since we never check this, we may never have noticed. I am currently running tests to find out if we have other mismatched releases.

For cases (1) and (2), we would just release the first stripe in that striped range, leaving the rest of the mappings intact. This is not immediately noticeable, since VirtualFree() returns success. And even if it did not, we usually ignore the return code of os::release_memory().

The problem is aggravated since, on Windows, we often employ an "optimistically-release-and-remap" approach: since mappings are indivisible, if one wants to change their size, split them or similar, one has to follow this sequence:

a) release old allocation
b) place one or more new allocations into the now vacated address range

This is not guaranteed to work, since between (a) and (b) someone may have grabbed the address space. We live with that since there is no way to do this differently.

When used on a range which contains multiple mappings, this technique is almost guaranteed to fail. In that case, (a) would only release the first mapping in the range. (b) would almost certainly fail since most of the original range would still be mapped.
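
To illustrate why a striped range breaks this sequence, here is a small toy model. It is not HotSpot code and does not call the Win32 API: ToyAddressSpace, naive_release and reserve_striped are hypothetical names made up for this sketch. The only behaviour it assumes from the description above is that a release frees exactly the one allocation starting at the given address and ignores the size argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Toy model of an address space with Windows-like release semantics.
// All names are hypothetical; nothing here calls the real Win32 API.
class ToyAddressSpace {
 public:
  // Models a reservation: records one allocation at [base, base + size).
  // Fails if the range overlaps an existing allocation.
  bool alloc(uintptr_t base, size_t size) {
    if (size == 0 || overlaps(base, size)) return false;
    allocs_[base] = size;
    return true;
  }

  // Models VirtualFree(base, 0, MEM_RELEASE): releases the whole allocation
  // that starts exactly at base, whatever its size. Fails otherwise.
  bool free_at(uintptr_t base) { return allocs_.erase(base) == 1; }

  // Models the release described above: 'bytes' is ignored.
  bool naive_release(uintptr_t addr, size_t bytes) {
    (void)bytes;
    return free_at(addr);
  }

  // Total number of bytes still allocated.
  size_t mapped_bytes() const {
    size_t sum = 0;
    for (const auto& a : allocs_) sum += a.second;
    return sum;
  }

 private:
  bool overlaps(uintptr_t base, size_t size) const {
    for (const auto& a : allocs_) {
      if (base < a.first + a.second && a.first < base + size) return true;
    }
    return false;
  }

  std::map<uintptr_t, size_t> allocs_;  // allocation base -> size
};

// Reserves 'stripes' adjacent allocations of 'stripe_size' bytes each,
// the way a striped (e.g. NUMA-interleaved) reservation would.
inline bool reserve_striped(ToyAddressSpace& as, uintptr_t base,
                            size_t stripe_size, size_t stripes) {
  assert(stripe_size > 0 && stripes > 0);
  for (size_t i = 0; i < stripes; i++) {
    if (!as.alloc(base + i * stripe_size, stripe_size)) return false;
  }
  return true;
}
```

With four stripes of 0x1000 bytes reserved at 0x10000, naive_release(0x10000, 0x4000) returns true, yet mapped_bytes() still reports 0x3000, and a following alloc(0x10000, 0x4000) fails because the remaining three stripes still occupy the range. That is step (a) reporting success and step (b) failing.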

Examples of this technique in os_windows.cpp:
- os::split_reserved_memory() (see also [4])
- map_or_reserve_memory_aligned()
- os::replace_existing_mapping_with_file_mapping()

This can manifest as a small memory leak or as the inability to attach to a given wish address. It could also result in a vicious loop ([5], [6]), leading to ballooning and native OOMs.

--

The solution would be to change os::release_memory() to use VirtualQuery to query the mappings in that range and release them individually. We should do this only for cases where we know multi-map reservations can exist, e.g. NUMA or LP. Otherwise we should assert (guarantee?) that the range given to os::release_memory() has an exact match at the OS level.
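
A rough sketch of such a walk, written against a toy model rather than the real API: ToyOs::query stands in for VirtualQuery and ToyOs::free_at for VirtualFree(p, 0, MEM_RELEASE). Both are hypothetical, as is release_range. The verify-first pass is my assumption about how to avoid a half-released range; an actual fix may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical stand-in for the OS; this is not the real Win32 API.
struct ToyOs {
  std::map<uintptr_t, size_t> allocs;  // allocation base -> size

  // Models VirtualQuery(): finds the allocation containing addr, if any.
  bool query(uintptr_t addr, uintptr_t* base, size_t* size) const {
    auto it = allocs.upper_bound(addr);  // first allocation starting after addr
    if (it == allocs.begin()) return false;
    --it;                                // last allocation starting at or before addr
    if (addr >= it->first + it->second) return false;  // addr lies in a gap
    *base = it->first;
    *size = it->second;
    return true;
  }

  // Models VirtualFree(base, 0, MEM_RELEASE): works on an exact base only.
  bool free_at(uintptr_t base) { return allocs.erase(base) == 1; }
};

// Releases every allocation in [addr, addr + bytes). Returns false and
// releases nothing unless the range is exactly tiled by whole allocations.
bool release_range(ToyOs& os, uintptr_t addr, size_t bytes) {
  if (bytes == 0) return false;
  const uintptr_t end = addr + bytes;

  // Pass 1: check that the range starts and ends on allocation boundaries
  // and contains no holes.
  for (uintptr_t p = addr; p < end;) {
    uintptr_t base = 0;
    size_t size = 0;
    if (!os.query(p, &base, &size)) return false;  // hole in the range
    if (base != p) return false;                   // range starts mid-allocation
    if (size > end - p) return false;              // allocation outlives the range
    p += size;
  }

  // Pass 2: release the allocations one by one.
  for (uintptr_t p = addr; p < end;) {
    uintptr_t base = 0;
    size_t size = 0;
    bool found = os.query(p, &base, &size);
    assert(found && base == p);  // guaranteed by pass 1
    (void)found;
    os.free_at(base);
    p += size;
  }
  return true;
}
```

For a range made of allocations at 0x10000 (0x1000 bytes), 0x11000 (0x1000 bytes) and 0x12000 (0x2000 bytes), release_range(os, 0x10000, 0x4000) releases all three. A call that starts or ends inside one of them returns false and leaves everything mapped, which is roughly where the assert (or guarantee) mentioned above would go for the single-mapping case.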

--

AFAICS this is an old issue, dating back to at least JDK 8.

--
[1] https://docs.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-virtualalloc
[2] https://docs.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-virtualfree
[3] https://github.com/openjdk/jdk/blob/5dfb42fc68099278cbc98e25fb8a91fd957c12e2/src/hotspot/os/windows/os_windows.cpp#L3394
[4] https://bugs.openjdk.java.net/browse/JDK-8253649
[5] https://github.com/openjdk/jdk/blob/5dfb42fc68099278cbc98e25fb8a91fd957c12e2/src/hotspot/os/windows/os_windows.cpp#L3150
[6] https://bugs.openjdk.java.net/browse/JDK-8255954

It seems this change added the release_one_mapping_multi_commits_vm test to test_os.cpp. That test has been failing sporadically for some time, especially on Linux aarch64 (see below) and also on Linux ppc64le.

stderr:
java.lang.AssertionError: gtest execution failed; exit code = 2. the failed tests: [os::release_one_mapping_multi_commits_vm]
    at GTestWrapper.main(GTestWrapper.java:98)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:333)
    at java.base/java.lang.Thread.run(Thread.java:1570)
26-10-2023
Changeset: f626ed6a
Author: Thomas Stuefe <stuefe@openjdk.org>
Date: 2020-11-19 11:51:09 +0000
URL: https://github.com/openjdk/jdk/commit/f626ed6a
19-11-2020
I raise this to P2 since Oracle P2'd the duplicate of this (https://bugs.openjdk.java.net/browse/JDK-8255954)
13-11-2020
Needs a second reviewer
13-11-2020
ILW = MMM = P3
10-11-2020

Duplicate :	JDK-8255954 - [windows] UseNUMAInterleaving causes VM to balloon and hang
Relates :	JDK-8256287 - [windows] add loop fuse to map_or_reserve_memory_aligned
Relates :	JDK-8257041 - [aix] os::release_memory may not release the full range
Relates :	JDK-8253649 - Potential bug in os::split_reserved_memory on windows
Relates :	JDK-8280940 - gtest os.release_multi_mappings_vm is racy
Relates :	JDK-8255917 - runtime/cds/SharedBaseAddress.java failed "assert(reserved_rgn != 0LL) failed: No reserved region"