JDK-8027434 : "-XX:OnOutOfMemoryError" uses fork instead of vfork
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 7u40,8
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • OS: linux
  • Submitted: 2013-10-11
  • Updated: 2019-11-15
  • Resolved: 2018-10-12
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

JDK 11: 11.0.2 (Fixed)
JDK 12: 12 b16 (Fixed)
JDK 8: 8u202 (Fixed)
Other: openjdk8u212 (Fixed)
Description
FULL PRODUCT VERSION :


FULL OS VERSION :
Does not matter.
(Seen on Linux tombot.sirrix.de 3.2.0-53-generic #81-Ubuntu SMP Thu Aug 22 21:01:03 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux)

EXTRA RELEVANT SYSTEM CONFIGURATION :
The VM is configured to use 80% of the system's memory using -Xmx and -Xms. Kernel options: vm.overcommit_ratio = 100, vm.overcommit_memory = 2.
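The arithmetic behind this configuration can be sketched as follows. This is a simplified model, not kernel code: it assumes that in strict mode (vm.overcommit_memory = 2) the commit limit is swap plus RAM times overcommit_ratio / 100 (ignoring huge pages), and that fork() must be able to charge a second copy of the parent's private writable memory. The function names and the 16 GiB machine in the usage note are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified commit limit for vm.overcommit_memory = 2 (huge pages ignored). */
uint64_t commit_limit_mib(uint64_t ram_mib, uint64_t swap_mib,
                          unsigned overcommit_ratio) {
    return swap_mib + ram_mib * overcommit_ratio / 100;
}

/*
 * Returns true if a fork() of a process with parent_private_mib of private
 * writable memory would still fit under the limit, given that committed_mib
 * is already charged system-wide (assumption: the child is charged in full).
 */
bool fork_fits(uint64_t limit_mib, uint64_t committed_mib,
               uint64_t parent_private_mib) {
    return committed_mib + parent_private_mib <= limit_mib;
}
```

For a hypothetical machine with 16384 MiB of RAM, no swap, and a ratio of 100, the limit is 16384 MiB. A JVM holding 80% of that (about 13107 MiB) cannot be duplicated by fork() under this model, while a process of a few MiB can.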

A DESCRIPTION OF THE PROBLEM :
When calling fork/exec, the process is first duplicated by fork, and only then is the new copy replaced using exec. This causes problems when the JVM uses most of the memory and only a small program is to be started.

The problem was solved for Process.start in 2009 (see link below), but was apparently not solved for execution of the "-XX:OnOutOfMemoryError" handler.

-XX:OnOutOfMemoryError=reboot did not work, even though the log showed that it would be started. Looking at the OpenJDK 7u40 sources, I found that this code still uses fork/exec instead of vfork, which was the solution for the Process.start case.

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7034935
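The vfork approach described above can be sketched roughly as follows. This is a minimal illustration, not the HotSpot or JDK implementation: run_with_vfork is a hypothetical helper, and the sketch assumes a Linux system with /bin/sh. Because the vfork child borrows the parent's address space until it calls exec, it does nothing except execve and _exit.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* asks glibc to declare vfork() in strict modes */
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run "sh -c cmd" and return its exit status, or -1. */
int run_with_vfork(const char *cmd) {
    char *argv[4];
    argv[0] = (char *)"sh";
    argv[1] = (char *)"-c";
    argv[2] = (char *)cmd;
    argv[3] = NULL;

    /* No copy of the parent's address space is made here. */
    pid_t pid = vfork();
    if (pid < 0) {
        return -1; /* could not create a child at all */
    }
    if (pid == 0) {
        /* Child: only exec or _exit are safe after vfork. */
        execve("/bin/sh", argv, environ);
        _exit(127); /* reached only if exec failed */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call such as run_with_vfork("exit 3") should return the shell's exit status. The parent stays suspended until the child has called exec or exited, which is acceptable for an error handler that waits for the command anyway.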

THE PROBLEM WAS REPRODUCIBLE WITH -Xint FLAG: Did not try

THE PROBLEM WAS REPRODUCIBLE WITH -server FLAG: Yes

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Configure the VM to use most of the system memory.
Disable overcommit in Linux.
Start the JRE using -XX:OnOutOfMemoryError=reboot
Allocate memory in a loop until an OutOfMemoryError occurs.


EXPECTED VERSUS ACTUAL BEHAVIOR :
Expected: Reboot of the server

Actual: The log shows a message about -XX:OnOutOfMemoryError=reboot and starting reboot, but it does not reboot.
REPRODUCIBILITY :
This bug can be reproduced always.
Comments
This is an Oracle BPR backport into 8u202 that will be in Oracle 8u212. It also shipped in 11.0.2. The patch is less risky than it looks: it affects only Linux, and only the OOM case.
22-02-2019

I can think of no reason why this should be applied to jdk8u now. This looks like a nontrivial change that is not risk-free for a fairly unusual use case.
22-02-2019

11u Fix Request: This is a clean backport for 11. Mach5 hs-tier1-4 tests were run and are clean.
24-10-2018

The JDK ProcessImpl fork-and-exec functionality is both more complicated than what is needed in the VM, and also simpler, so the pros and cons discussed there do not necessarily match the pros and cons for os::fork_and_exec in the VM. The ProcessImpl functionality is more complicated in that it has to deal with a number of issues relating to inherited file descriptors and close-on-exec settings. The VM does not need to care about this. But the ProcessImpl functionality is also simpler in that it is only ever executed in a normal execution context and so does not have to worry about whether the mechanism is async-signal-safe. The VM needs something that is signal-safe (or at least safe enough to be useful most of the time; the VMError reporting mechanism is not itself signal-safe). I wonder whether a shared POSIX implementation of os::fork_and_exec could use posix_spawn with a fallback to fork()+execve() when executing in a signal-handling context? The signal safety of vfork seems unclear. The simple switch from fork to vfork has been experimented with and seems to work for the non-signal OnOutOfMemoryError case. We need to see what happens in the OnError case, though we could also select the mechanism based on the signal context.
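The idea floated in this comment, posix_spawn with a fallback to fork()+execve() in a signal-handling context, might look something like the sketch below. It is only an illustration of the selection logic, not the change that was eventually made in HotSpot: spawn_shell_command and the in_signal_handler parameter are hypothetical, and the caller is assumed to know which context it is in.

```c
#include <errno.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Wait for pid and return its exit status, or -1 on abnormal termination. */
static int wait_for_exit(pid_t pid) {
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/*
 * Hypothetical helper: run "sh -c cmd". In a normal context it uses
 * posix_spawn; when the caller reports a signal-handling context it
 * falls back to fork() + execve().
 */
int spawn_shell_command(const char *cmd, int in_signal_handler) {
    char *argv[] = { (char *)"sh", (char *)"-c", (char *)cmd, NULL };
    pid_t pid;

    if (in_signal_handler) {
        pid = fork();
        if (pid < 0) {
            return -1;
        }
        if (pid == 0) {
            execve("/bin/sh", argv, environ);
            _exit(127); /* reached only if exec failed */
        }
    } else {
        /* posix_spawn returns an error number directly, not -1/errno. */
        if (posix_spawn(&pid, "/bin/sh", NULL, NULL, argv, environ) != 0) {
            return -1;
        }
    }
    return wait_for_exit(pid);
}
```

Both paths end in the same wait loop, so the caller sees the same result regardless of which mechanism created the child.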
17-08-2018

Shafi: your test is not exhausting the native heap; it is using up the allowed number of processes/threads that can be created (ulimit -u), so naturally the attempt to exec a new process fails. The vfork solution deals with actual memory use.
11-04-2017

Hi,
As the native heap is exhausted or not sufficient, we are not able to create a child process with vfork/clone/posix_spawn, so this bug cannot be fixed. I tried the above change with the test case below and I am getting the same error.

    // Test.java
    class Test {
        public static void main(String[] args) throws InterruptedException {
            Thread t = new Thread(() -> {
                while (true) {
                    new Thread(() -> {
                        try {
                            Thread.currentThread().join();
                        } catch (InterruptedException e) {
                        }
                    }).start();
                }
            });
            t.start();
            Thread.currentThread().join();
        }
    }

    java -XX:OnOutOfMemoryError="kill -9 %p" Test
    [2.426s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
    #
    # java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
    # -XX:OnOutOfMemoryError="kill -9 %p"
    # Executing /bin/sh -c "kill -9 2735"...
    os::fork_and_exec failed: Resource unavailable, try again (EAGAIN=11)
    Exception in thread "Thread-0" java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
        at java.lang.Thread.start0(java.base/Native Method)
        at java.lang.Thread.start(java.base/Thread.java:812)
        at Test.lambda$main$1(Test.java:13)
        at java.lang.Thread.run(java.base/Thread.java:843)

Regards, Shafi
11-04-2017

Reopening this based on email from external party: It's not *optimizing memory usage*, it's making -XX:OnOutOfMemoryError= actually *work* in real-world scenarios instead of *fail*! The current implementation of -XX:OnOutOfMemoryError= makes it *useless*, as the fork will not succeed unless the system has as much extra memory available as the used-up heap space. Example: We have lots of app servers that have 16 GB RAM and some swap (swap is evil anyway). The JBoss JVM is sized to use 12 GB heap space. When the heap space is used up, there is simply not enough memory available to fork! Effectively, -XX:OnOutOfMemoryError is *useless* at that point; the command ("kill -0 %p") does not get executed. I've actually wasted over a dozen hours finding that out. If you keep that bug closed, please remove -XX:OnOutOfMemoryError= completely, as its implementation is simply broken.
05-05-2014

This occurrence of fork is during error handling; optimizing memory usage at this point is not a goal. The case for OnOutOfMemoryError is when you run out of Java heap, and generally in that case there is enough native memory to process the request.
03-02-2014

Release team: Approved for deferral.
12-11-2013

After reading the discussion and code in jdk/src/solaris/native/java/lang/UNIXProcess_md.c, it looks like switching from fork() to vfork() on Linux is too risky for the current stage of the JDK8/HSX-25 release. After HSX-25 we could look at porting the JDK equivalent of os::fork_and_exec() to Linux (and possibly MacOS X).
03-11-2013

This is the file where the defect is: hotspot/src/share/vm/utilities/vmError.cpp

    void VM_ReportJavaOutOfMemory::doit() {
      // Don't allocate large buffer on stack
      static char buffer[O_BUFLEN];

      tty->print_cr("#");
      tty->print_cr("# java.lang.OutOfMemoryError: %s", _err->message());
      tty->print_cr("#   -XX:OnOutOfMemoryError=\"%s\"", OnOutOfMemoryError);

      // make heap parsability
      Universe::heap()->ensure_parsability(false);  // no need to retire TLABs

      char* cmd;
      const char* ptr = OnOutOfMemoryError;
      while ((cmd = next_OnError_command(buffer, sizeof(buffer), &ptr)) != NULL){
        tty->print("#   Executing ");
    #if defined(LINUX)
        tty->print  ("/bin/sh -c ");
    #elif defined(SOLARIS)
        tty->print  ("/usr/bin/sh -c ");
    #endif
        tty->print_cr("\"%s\"...", cmd);

        os::fork_and_exec(cmd);   // <<<<< Location of defect
      }
    }
03-11-2013

Here is a relevant snippet of code from HotSpot on Solaris: hotspot/src/os/solaris/vm/os_solaris.cpp:

    // Run the specified command in a separate process. Return its exit value,
    // or -1 on failure (e.g. can't fork a new process).
    // Unlike system(), this function can be called from signal handler. It
    // doesn't block SIGINT et al.
    int os::fork_and_exec(char* cmd) {
      char * argv[4];
      argv[0] = (char *)"sh";
      argv[1] = (char *)"-c";
      argv[2] = cmd;
      argv[3] = NULL;

      // fork is async-safe, fork1 is not so can't use in signal handler
      pid_t pid;
      Thread* t = ThreadLocalStorage::get_thread_slow();
      if (t != NULL && t->is_inside_signal_handler()) {
        pid = fork();
      } else {
        pid = fork1();
      }

On Solaris 'vfork()' is deprecated and not mentioned in the Solaris HotSpot code at all. It looks to me like Solaris uses fork1() when it is safe to do so and fork() otherwise.
03-11-2013

See the discussion and code in jdk/src/solaris/native/java/lang/UNIXProcess_md.c for all the gory details surrounding implementing fork-and-exec functionality.
01-11-2013

Please check Solaris and BSD for MacOS X as well.
30-10-2013

This should be fixed in both JDK7 and JDK8 if possible
29-10-2013