JDK-5049299 : (process) Use posix_spawn, not fork, on S10 to avoid swap exhaustion
  • Type: Enhancement
  • Component: core-libs
  • Sub-Component: java.lang
  • Affected Version: 1.4.2_04,5.0,5.0u6,5.0u17,6u6,6u10
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: generic,linux,solaris_9,solaris_10
  • CPU: generic,x86,sparc
  • Submitted: 2004-05-18
  • Updated: 2017-05-16
  • Resolved: 2013-08-13
Fixed in: 7u55, 8 (b105)
If you run a "small" program (e.g., a Perl script) from a "big" Java process on
a machine with "moderate" free swap space (but not as much as the big Java
process), then Runtime.exec() fails.
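
A minimal sketch of the call pattern described above, for anyone trying to reproduce this. The class name SmallChild and the command it runs are hypothetical, not taken from the original report. On an affected system the failure is expected to surface as an IOException from start() (or from Runtime.exec()), so the sketch only reports that exception rather than hiding it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: launches a small child process from the (possibly large) JVM.
final class SmallChild {

    /** Runs the command and returns its first line of output, or a failure note. */
    static String firstLine(String... command) throws InterruptedException {
        try {
            Process p = new ProcessBuilder(command)
                    .redirectErrorStream(true)   // merge stderr into stdout
                    .start();                    // this is where fork/exec happens
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
                String line = r.readLine();
                while (r.readLine() != null) {
                    // drain remaining output so the child can exit
                }
                p.waitFor();
                return line;
            }
        } catch (IOException e) {
            // A failed fork (e.g. no swap left to reserve) is reported here.
            return "exec failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(firstLine("sh", "-c", "echo hello from child"));
    }
}
```

To approximate the reported conditions, the JVM would need a heap large relative to the machine's free swap before firstLine() is called; the sketch itself does not allocate that memory.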

Open Code Review URL - http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-November/022909.html

We can take it in the April 2014 CPU, which is 7u55. In any case, we need the fix in the 7u-cpu WS in order to get nightly results.

I'm not sure what this question means. Are you asking whether we should integrate it into 7u60? If so, I'd say yes. There is a lot of demand for this fix in 7.


SQE is OK to take this fix in 7u60. Rob, please confirm that "nothing changed" (behaviour-wise; there are code changes in the diff) for both Linux and Solaris.

This has been in jdk8 for >2 months without any reports of issues. The changes proposed for jdk7u-dev do not change the default on Solaris or Linux, only on OSX. As OSX is more of a developer platform, the risk should not be high.
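
For readers who want to check which launch mechanism a given JVM was asked to use: to the best of my knowledge the jdk8 change reads a system property named jdk.lang.Process.launchMechanism (values FORK, VFORK, POSIX_SPAWN) at startup. That property name is not stated anywhere in this report, so treat it as an assumption and verify it against the webrev. The class name LaunchMechanism below is hypothetical.

```java
// Hypothetical helper: reports the launch mechanism requested on the command line.
// Assumption: the property name below matches the one read by the JDK implementation.
final class LaunchMechanism {

    static String configured() {
        String value = System.getProperty("jdk.lang.Process.launchMechanism");
        // When the property is unset, the JDK picks its per-platform default.
        return value == null ? "(platform default)" : value;
    }

    public static void main(String[] args) {
        System.out.println("launch mechanism: " + configured());
    }
}
```

Usage, under the same assumption: java -Djdk.lang.Process.launchMechanism=POSIX_SPAWN LaunchMechanism. The property would have to be given when the JVM starts; setting it later from application code is unlikely to have any effect.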

The fix touches very basic functionality and is hence potentially risky. The bug has been open for a very long time, and SQE does not see a particular reason why this needs to be fixed in 7u60. Is it an escalation?

As per the comment above, this is being worked on for jdk8 and is currently at the review stage. On Solaris / Mac OS X we're currently using the standard fork/exec mechanics to launch new processes from a Java VM. Unfortunately, as noted above, some customers are having trouble when attempting to use these calls from Java processes which occupy a lot of memory. Since fork duplicates the address space of the parent process, we can very easily run out of space. The obvious solution is to configure adequate swap space. Unfortunately, in situations where an app makes frequent, repeated use of Runtime.exec() et al., the use of swap space is a performance bottleneck. The suggested solution is to support posix_spawn on the affected platforms.

http://mail.openjdk.java.net/pipermail/core-libs-dev/2012-November/012417.html webrevs at: http://cr.openjdk.java.net/~robm/5049299/

EVALUATION For information, in the ops-center product we are now using a drop-in replacement for Runtime.exec() which uses the posix_spawn semantics when available (it is available on Linux and on S10, not on S9 and S8), falling back to Runtime.exec's current behavior when not available. The replacement has been slightly enhanced beyond standard Runtime.exec() semantics for ops-center needs, to include things like detaching from a parent SMF contract. These would need to be stripped for a generic implementation, but its core could be used as the start of a potential fix.

EVALUATION There are a couple of issues with posix_spawn().
1) It doesn't support doing a chdir() along with the other file descriptor operations that it performs after being invoked, but before the target gets exec'd.
2) The style of the API does not suit general-purpose multi-threaded environments like Java. In particular, the ability to perform actions on file descriptors inherited by the child does not work well if other threads in the VM are potentially opening and closing files in parallel with the call to posix_spawn().
So, here is the plan. We will use posix_spawn() in a minimal fashion, simply to efficiently spawn a new helper binary (processhelper). This small (12k) binary cleans up the file descriptors inherited from the parent, chdir()s to the new working directory, and then execs the actual target executable. The new binary will not be noticed by users/applications at runtime, since the end result is the same as before, and the processhelper itself will only run for a very short time.
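
The working-directory requirement in issue 1) is visible from the Java side through ProcessBuilder.directory(). The following is a small sketch of a check that any launch mechanism, helper-based or not, would still have to pass. The class name WorkingDirCheck is hypothetical, and the check assumes a POSIX sh is on the PATH.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical check: the child must start in the directory requested by the caller.
final class WorkingDirCheck {

    /** Starts a shell in the given directory and returns the directory it reports. */
    static String childWorkingDir(File dir) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("sh", "-c", "pwd -P")  // -P: physical path, no symlinks
                .directory(dir)               // the chdir() the launcher must perform
                .redirectErrorStream(true)
                .start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line = r.readLine();
            p.waitFor();
            return line;
        }
    }

    public static void main(String[] args) throws Exception {
        File tmp = new File(System.getProperty("java.io.tmpdir")).getCanonicalFile();
        System.out.println("requested: " + tmp.getPath());
        System.out.println("child saw: " + childWorkingDir(tmp));
    }
}
```

The directory is canonicalized before the comparison because the child reports its physical path, which can differ from a symlinked temporary directory.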

WORK AROUND 4) Implement the "small" program in pure Java in order to avoid Runtime.exec() 5) Consider using a scripting engine, see also https://scripting.dev.java.net/

EVALUATION I agree that the use of posix_spawn on S10 should be investigated. Historically, changes to this kind of code have been extraordinarily risky due to unforeseen race conditions, so this sort of change should be introduced near the beginning of a release. Therefore I am targeting this at dolphin. Hopefully, it will get addressed early in that release.

WORK AROUND
1) Use mkfile followed by swap -a to add more swap space.
2) Do the Runtime.exec "early" in the application's execution, before the process has grown large (i.e. before the transient swap requirement between Runtime.exec's fork and exec calls becomes big), and cache the resulting Process object. Then replace the "later" Runtime.exec calls that kicked off perl with println or the like, to direct the aforementioned process to exec perl with the same command line and relay back the perl command's standard output and error traffic.
3) Like (2), but spawn the "exec daemon" separately from Java to avoid any use of Runtime.exec, and instead communicate with Java via a pipe or socket to initiate running the perl scripts. exit status. ###@###.### 2004-05-19
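
Workaround (2) can be sketched roughly as follows. This is an illustration only: the class name EarlyShell, the sentinel string, and the use of sh as the long-lived helper are all assumptions, not something the report prescribes. The helper shell is started once while the JVM is still small, and later commands are written to its standard input, so the fork happens in the small shell process rather than in the large JVM.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical "exec early" helper: one shell is started up front and reused.
final class EarlyShell implements Closeable {
    // Sentinel line marking the end of one command's output (an arbitrary choice).
    private static final String END = "__END_OF_COMMAND__";

    private final Process shell;
    private final BufferedWriter toShell;
    private final BufferedReader fromShell;

    EarlyShell() throws IOException {
        // The only fork from the JVM; do this before the heap has grown.
        shell = new ProcessBuilder("sh").redirectErrorStream(true).start();
        toShell = new BufferedWriter(
                new OutputStreamWriter(shell.getOutputStream(), StandardCharsets.UTF_8));
        fromShell = new BufferedReader(
                new InputStreamReader(shell.getInputStream(), StandardCharsets.UTF_8));
    }

    /** Runs one command line in the helper shell and returns its combined output. */
    synchronized String run(String commandLine) throws IOException {
        toShell.write(commandLine + "\n");
        toShell.write("echo " + END + " $?\n");   // sentinel plus the exit status
        toShell.flush();
        StringBuilder out = new StringBuilder();
        String line;
        while ((line = fromShell.readLine()) != null) {
            if (line.startsWith(END)) {
                return out.toString();
            }
            out.append(line).append('\n');
        }
        throw new EOFException("helper shell exited unexpectedly");
    }

    @Override
    public void close() throws IOException {
        toShell.close();      // EOF on stdin makes the shell exit
        fromShell.close();
        shell.destroy();
    }
}
```

Limitations of the sketch: output that does not end in a newline would hide the sentinel, commands that read standard input would consume later command lines, and the exit status echoed after the sentinel is discarded here although a caller could parse it.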

EVALUATION Solaris reserves swap space conservatively, so when an X-megabyte process forks the kernel attempts to reserve an additional X MB of swap space just in case the child actually does touch all those pages, thereby making private copies, and then later needs to swap them out. (Linux doesn't do this, so this bug will not be reproducible on a Linux system.) Within the constraints of the existing semantics of Runtime.exec there does not appear to be any way to avoid this in current Solaris releases. vfork(2) is not thread-safe and popen(3C) only provides access to one of the child's standard streams rather than all three of them. S10 does support the new posix_spawn call; we should look into using that when running on S10. See the comments section for additional information. -- ###@###.### 2004/5/19