Bug ID: JDK-8227254 Optionally preallocate buffer to make error handling more stable on native OOM

Type: Enhancement
Component: hotspot
Sub-Component: runtime
Affected Version: 14

Priority: P4
Status: Resolved
Resolution: Won't Fix

Submitted: 2019-07-04
Updated: 2022-06-24
Resolved: 2022-06-24

A number of error reporting steps require memory. When we are out of memory, those steps may fail.

A prominent example is NMT: NMT detailed reports would be super helpful in analyzing native OOMs (see also: JDK-8227031). However, to create that report NMT needs memory from C-Heap. If C-Heap is exhausted, the report fails or crashes.

Another example is stack trace printing, which on some platforms invokes ElfDecoder to print symbol names, and that needs C-Heap as well.

In a perfect world we would harden everything that runs inside error handling, e.g. to work with pre-allocated buffers, or to not allocate memory at all. But that is difficult (not impossible) and increases complexity, which is undesirable.

A pragmatic solution would be:

- allocate a "ballast" buffer from C-Heap on VM startup
- on native OOM, "drop" that ballast by free()ing the buffer. This returns the memory back to the C-lib allocator, reduces the memory pressure and hopefully enables the few error reporting steps which need C-Heap to finish successfully.
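
The two steps above can be sketched roughly as follows. This is a minimal illustration, not HotSpot code: the names ballast_reserve and ballast_drop are hypothetical, the sketch is not thread-safe, and touching the buffer with memset assumes that the platform would otherwise commit the pages lazily.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical names throughout; this is an illustration, not HotSpot code.
// Not thread-safe: a real version would have to guard against concurrent
// error handlers dropping the ballast at the same time.
static void* g_ballast = nullptr;
static std::size_t g_ballast_size = 0;

// Call once at VM startup. Returns true if the ballast is in place.
bool ballast_reserve(std::size_t size) {
  assert(size > 0);
  if (g_ballast != nullptr) {
    return true;  // already reserved, keep the existing buffer
  }
  void* p = std::malloc(size);
  if (p == nullptr) {
    return false;
  }
  // Assumption: write to the buffer so the pages are really backed by
  // memory instead of only being reserved address space.
  std::memset(p, 0xAB, size);
  g_ballast = p;
  g_ballast_size = size;
  return true;
}

// Call at the start of native-OOM error handling.
// Returns the number of bytes handed back to the C-lib allocator.
std::size_t ballast_drop() {
  if (g_ballast == nullptr) {
    return 0;  // nothing reserved, or already dropped
  }
  std::size_t released = g_ballast_size;
  std::free(g_ballast);
  g_ballast = nullptr;
  g_ballast_size = 0;
  return released;
}
```

An error handler would call ballast_drop() once before it starts the reporting steps that need C-Heap. Whether the released memory actually ends up serving those allocations is up to the C-lib allocator.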

Of course there is absolutely no guarantee that this works - code running concurrently may gobble the memory up the instant we release it, for instance.
 
However, this is pragmatic and dead simple to implement, and in practice it works surprisingly well. A hack like this has often helped us to get an NMT detail report in an OOM situation where otherwise we would have gotten nothing.

I have played around with different solutions in the past but did not find one I was content with. The problem is that allocating a ballast buffer is not as clear-cut a solution as I thought. Whether or not the buffer is useful depends on the implementation of the libc allocator (whether it is able to re-use the free(3)'d ballast buffer for subsequent allocations of the error handler). The more boring but ultimately cleaner solution is to stay with our scratch buffer technique, and to make sure all routines that run during error reporting use that scratch buffer for temporary allocations.

24-06-2022
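
For illustration, the scratch buffer idea from the comment above could look roughly like the sketch below: a fixed region set aside up front, from which error reporting code carves temporary allocations without going to the C-Heap. The class ScratchBuffer, its methods, and the 64 KB size are hypothetical and are not the existing HotSpot scratch buffer code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical bump allocator over a fixed, preallocated region.
// It never calls malloc, so it keeps working when the C-Heap is exhausted.
class ScratchBuffer {
 public:
  ScratchBuffer(char* base, std::size_t capacity)
      : _base(base), _capacity(capacity), _used(0) {}

  // Returns an aligned block of 'size' bytes, or nullptr if it does not fit.
  // 'align' must be a power of two.
  void* alloc(std::size_t size,
              std::size_t align = alignof(std::max_align_t)) {
    assert(align != 0 && (align & (align - 1)) == 0);
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(_base) + _used;
    std::uintptr_t aligned =
        (start + align - 1) & ~static_cast<std::uintptr_t>(align - 1);
    std::size_t padding = static_cast<std::size_t>(aligned - start);
    std::size_t remaining = _capacity - _used;
    if (padding > remaining || size > remaining - padding) {
      return nullptr;  // out of scratch space; caller must cope
    }
    _used += padding + size;
    return reinterpret_cast<void*>(aligned);
  }

  // Discards all temporary allocations at once, e.g. between report steps.
  void reset() { _used = 0; }

  std::size_t used() const { return _used; }

 private:
  char* _base;
  std::size_t _capacity;
  std::size_t _used;
};

// Static storage: set aside at load time, independent of the C-Heap.
alignas(std::max_align_t) static char g_scratch_storage[64 * 1024];
static ScratchBuffer g_scratch(g_scratch_storage, sizeof(g_scratch_storage));
```

Unlike the ballast approach, this does not depend on how the libc allocator re-uses freed memory, at the cost of having to route every temporary allocation in the error reporting path through the scratch buffer.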

Relates :	JDK-8227031 - Print NMT statistics on fatal errors
Relates :	JDK-8227072 - NMT detailed report should work in native OOM situations.