JDK-8213192 : (process) Change the Process launch mechanism default on Linux to be posix_spawn
Type:Bug
Component:core-libs
Sub-Component:java.lang
Affected Version:13
Priority:P3
Status:Resolved
Resolution:Fixed
Submitted:2018-10-31
Updated:2019-02-13
Resolved:2019-02-13
Change the Process launch mechanism default on Linux to be posix_spawn.
The option was introduced in JDK 12 and with some testing can be made the default.
Comments
Hgupdater seems to have missed the push:
http://mail.openjdk.java.net/pipermail/jdk-changes/2019-February/004207.html
This happened because the wrong bug ID was used in the commit message; it should have been 8213192.
13-02-2019
:) Thanks Martin
06-02-2019
Thomas' analysis is excellent!
06-02-2019
Notes about libc compatibility, with respect to the two problems we care about:
- fork() overcommit problems
- vfork() dangers (trashing the parent's stack, signals killing both child and parent, etc.)
1) Before glibc 2.4 (released 2006), posix_spawn() used just fork()/exec().
This would be bad for us, since we would run into the known issues with fork() (memory overcommit etc.).
2) Between glibc 2.4 and glibc 2.23, posix_spawn() uses either fork() or vfork(), following logic that is still (erroneously) described in the man page today:
<quote>
The child process is created using vfork(2) instead of fork(2) when
either of the following is true:
* the spawn-flags element of the attributes object pointed to by
attrp contains the GNU-specific flag POSIX_SPAWN_USEVFORK; or
* file_actions is NULL and the spawn-flags element of the attributes
object pointed to by attrp does not contain
POSIX_SPAWN_SETSIGMASK, POSIX_SPAWN_SETSIGDEF,
POSIX_SPAWN_SETSCHEDPARAM, POSIX_SPAWN_SETSCHEDULER,
POSIX_SPAWN_SETPGROUP, or POSIX_SPAWN_RESETIDS.
</quote>
The code (sysdeps/posix/spawni.c) looks like this:
  /* Do this once. */
  short int flags = attrp == NULL ? 0 : attrp->__flags;

  /* Generate the new process. */
  if ((flags & POSIX_SPAWN_USEVFORK) != 0
      /* If no major work is done, allow using vfork. Note that we
         might perform the path searching. But this would be done by
         a call to execvp(), too, and such a call must be OK according
         to POSIX. */
      || ((flags & (POSIX_SPAWN_SETSIGMASK | POSIX_SPAWN_SETSIGDEF
                    | POSIX_SPAWN_SETSCHEDPARAM | POSIX_SPAWN_SETSCHEDULER
                    | POSIX_SPAWN_SETPGROUP | POSIX_SPAWN_RESETIDS)) == 0
          && file_actions == NULL))
    new_pid = __vfork ();
  else
    new_pid = __fork ();
For the JDK implementation this means posix_spawn() always uses vfork(), since we pass in neither attributes nor file actions. We do not need them, because we run the jspawnhelper, which does all the pre-exec() work for us.
This means:
- we will not have the fork() memory problems, since posix_spawn() internally uses vfork()
- there is still a risk associated with vfork() but it is tiny since posix_spawn() will immediately exec() the jspawnhelper - and once that first exec() is through, we are safe. This is similar to the "exec-twice-technique" described in http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-September/055333.html .
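To illustrate the call shape discussed in (2), here is a minimal sketch of a posix_spawn() call with both file_actions and attrp set to NULL, which is the condition under which the quoted glibc code takes the vfork() path. The helper name spawn_and_wait is hypothetical and this is not the JDK's code; in the JDK the spawned program would be the jspawnhelper, whereas the usage note below uses /bin/sh as a stand-in.

```c
#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper (not JDK code): spawn `path` with no file actions
 * and no attributes, wait for it, and return its exit status.
 * Returns -1 if spawning or waiting fails, or if the child did not
 * exit normally. */
int spawn_and_wait(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL);

    pid_t pid;
    /* file_actions == NULL and attrp == NULL: no pre-exec work requested.
     * posix_spawn() returns an error number directly on failure. */
    int err = posix_spawn(&pid, path, NULL, NULL, argv, environ);
    if (err != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Assuming /bin/sh exists, calling spawn_and_wait("/bin/sh", argv) with argv = {"sh", "-c", "exit 7", NULL} should return 7. Which mechanism is used underneath depends on the libc version, as described in these notes.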
3) glibc >= 2.24 uses a new Linux-specific variant in
sysdeps/unix/sysv/linux/spawni.c:
  new_pid = CLONE (__spawni_child, STACK (stack, stack_size), stack_size,
                   CLONE_VM | CLONE_VFORK | SIGCHLD, &args);
This is even better than (2):
CLONE_VM means the child runs in the parent's memory image, as with (2).
CLONE_VFORK means the parent waits until the child execs, as with (2).
But the error possibilities here are further reduced, since:
- a separate stack is passed for the child to run on. This means the child does not run on the stack of the forking thread in the parent.
- posix_spawn() takes care to temporarily block all incoming signals until the exec() is through.
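The technique in (3) can be sketched roughly as follows, using the glibc clone() wrapper on Linux. This is a simplified illustration under stated assumptions, not the glibc implementation: the names clone_spawn, child_fn and struct child_args are hypothetical, the 64 KiB stack size is an arbitrary choice, the sketch assumes a single-threaded caller (it uses sigprocmask()), and glibc does more work in the child than is shown here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical argument bundle handed to the child. */
struct child_args {
    const char *path;
    char *const *argv;
    char *const *envp;
    sigset_t old_mask;   /* caller's signal mask, restored before exec */
};

static int child_fn(void *p)
{
    struct child_args *a = p;
    /* The signal mask is per task, so this does not affect the parent. */
    sigprocmask(SIG_SETMASK, &a->old_mask, NULL);
    execve(a->path, a->argv, a->envp);
    _exit(127);          /* only reached if execve() failed */
}

/* Returns the child's exit status, or -1 on error. */
int clone_spawn(const char *path, char *const argv[], char *const envp[])
{
    assert(path != NULL && argv != NULL && envp != NULL);

    const size_t stack_size = 64 * 1024;   /* arbitrary for this sketch */
    void *stack = mmap(NULL, stack_size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED)
        return -1;

    struct child_args args = { path, argv, envp };

    /* Block all signals until the child has exec'd or exited. */
    sigset_t all;
    sigfillset(&all);
    sigprocmask(SIG_BLOCK, &all, &args.old_mask);

    /* The stack grows downwards, so pass the top of the mapping.
     * CLONE_VFORK suspends the parent until the child execs or exits. */
    pid_t pid = clone(child_fn, (char *)stack + stack_size,
                      CLONE_VM | CLONE_VFORK | SIGCHLD, &args);

    sigprocmask(SIG_SETMASK, &args.old_mask, NULL);
    munmap(stack, stack_size);   /* child no longer uses it at this point */

    if (pid < 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the child gets its own stack and signals are blocked around the clone() call, the two vfork() dangers listed at the top of these notes (stack trashing and signals hitting both processes) are narrowed in this sketch, which is the point made for (3).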
---
In short,
(1) would be a bad regression in memory intensive scenarios. But since this only affects very old glibcs, I think we are safe. Even the most conservative distros would by now have glibc versions >=2.4.
(2) is not perfect but still better than our existing vfork() solution, since the time window in which errors can occur is greatly reduced.
(3) seems quite nice.
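As a small aid for checking which of the three cases applies on a given machine, the version ranges above can be written down as a lookup. This is only a sketch: the enum and function names are hypothetical, the thresholds are taken directly from these notes (2.4 and 2.24), and the runtime query relies on glibc's gnu_get_libc_version(), so it reports 0 when built against a non-glibc libc.

```c
#include <assert.h>
#include <stdio.h>
#ifdef __GLIBC__
#include <gnu/libc-version.h>
#endif

/* Hypothetical names for the three cases described in the notes. */
enum spawn_impl {
    SPAWN_FORK_EXEC      = 1,  /* glibc < 2.4: plain fork()/exec() */
    SPAWN_FORK_OR_VFORK  = 2,  /* glibc 2.4 .. 2.23: fork() or vfork() */
    SPAWN_CLONE_VM_VFORK = 3   /* glibc >= 2.24: clone(CLONE_VM | CLONE_VFORK) */
};

enum spawn_impl classify_glibc(int major, int minor)
{
    assert(major >= 0 && minor >= 0);
    if (major < 2 || (major == 2 && minor < 4))
        return SPAWN_FORK_EXEC;
    if (major == 2 && minor < 24)
        return SPAWN_FORK_OR_VFORK;
    return SPAWN_CLONE_VM_VFORK;
}

/* Returns 1, 2 or 3 for the running glibc, or 0 if the version cannot
 * be determined (for example when not built against glibc). */
int running_glibc_case(void)
{
#ifdef __GLIBC__
    int major = 0, minor = 0;
    if (sscanf(gnu_get_libc_version(), "%d.%d", &major, &minor) != 2)
        return 0;
    return (int)classify_glibc(major, minor);
#else
    return 0;
#endif
}
```

For example, classify_glibc(2, 23) yields case (2) and classify_glibc(2, 24) yields case (3).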
######
In addition, I took a look at musl libc, since we do not want regressions in Portola either. It seems musl has always used the clone(... CLONE_VM | CLONE_VFORK ...) technique. So in theory we are safe there too.