Bug ID: JDK-6533291 Work around 32-bit Solaris stdio limit of 256 open files

Details
Type: Bug
Submit Date: 2007-03-12
Status: Closed
Updated Date: 2011-05-18
Project Name: JDK
Resolved Date: 2011-05-18
Component: core-libs
OS: solaris_10
Sub-Component: java.lang
CPU: x86
Priority: P2
Resolution: Fixed
Affected Versions: 6
Fixed Versions:


Description
FULL PRODUCT VERSION :
JRE 1.6.0-b105 (FCS)

ADDITIONAL OS VERSION INFORMATION :
Solaris 10 - 64-bit, SuSE 9, Debian

A DESCRIPTION OF THE PROBLEM :
We have a Java application with a large number of jar files which are included in the classpath. With JRE-1.6.0-b105, when a running application loads classes from any of these jar files, /proc/<pid>/fd shows that there is an open file descriptor for every jar file in the classpath. This is not the case with JRE-1.5.0_xx. We are having issues on Solaris 10 with "too many open files" when Java launches a subprocess running the SunStudio 11 cc compiler.

We are trying to upgrade to JRE 6 as a subsystem of MATLAB, but we can't get through our acceptance tests on Solaris with this problem.

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run "java -classpath <path to 184 jar files with ":" separators> <class>"
where <class> can be found in one of the jar files in the classpath.  Note that there is an open file descriptor for each jar file in /proc/<java-pid>/fd on either Solaris or Linux systems when running with jre-1.6.0-b105.
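
The descriptor count described in the steps above can also be checked programmatically. The following is a minimal sketch and not part of the original report: `count_open_fds` is a hypothetical helper that counts the numbered entries in /proc/<pid>/fd. When it is called with "self", the count includes the descriptor that opendir itself holds during the scan.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>

/*
 * Hypothetical helper: count the open file descriptors of a process by
 * listing /proc/<pid>/fd.  Pass a numeric pid as a string, or "self".
 * Returns -1 if the directory cannot be read.
 */
int count_open_fds(const char *pid)
{
    char path[64];
    DIR *dir;
    struct dirent *entry;
    int count = 0;

    assert(pid != NULL);
    snprintf(path, sizeof(path), "/proc/%s/fd", pid);
    dir = opendir(path);
    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        /* Descriptor entries are named by number; skip "." and "..". */
        if (isdigit((unsigned char) entry->d_name[0]))
            count++;
    }
    closedir(dir);
    return count;
}
```

Comparing the count for the JVM's pid under jre-1.5.0_07 and jre-1.6.0-b105 should show the difference reported here, assuming the same classpath is used for both runs.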

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Behavior similar to jre-1.5.0_07, where only key files (e.g. rt.jar) are held open by the jre.
ACTUAL -
The Java program actually runs correctly, but the large number of open file descriptors seems to have a side effect when launching a "cc" sub-process from java.

ERROR MESSAGES/STACK TRACES THAT OCCUR :
On Solaris, when the Java program launches a "cc" subprocess, it gets several "Too many open files" messages when processing the #include statements.  On Solaris "cc" is a 32-bit application, even on a 64-bit system.

REPRODUCIBILITY :
This bug can be reproduced always.

CUSTOMER SUBMITTED WORKAROUND :
We haven't found one yet.  There is a possibility that this new behavior in JRE-1.6.0 is intentional as a performance enhancement, in which case it would be desirable to have an option (e.g. a -XX: option) to disable this behavior.

                                    

Comments
EVALUATION

I strongly suspect this bug is a side effect of the fix for 6280693, which we put back around b45 of JDK 6. Before that fix, jar files were mapped into the address space; the fix changed the behavior to read the jar files instead. Refer to that bug for more details about why we did this.

I don't know whether we should delegate this bug to our Sun Studio team, since the error ultimately comes from their compiler. Studio may have a limit on the number of open files for some reason, and it would be good to know what it is. If the Sun Studio team has a legitimate reason for not handling more than some number of open files, providing an option to restore the old behavior may be the ultimate way to go.
                                     
2007-03-12
WORK AROUND

There are obvious workarounds to the problem of too many jar files:
- coalesce some jar files into larger jar files
- unjar some of the jar files into directories, perhaps even of the same name.
  It's a little strange to have a directory named foo.jar, but I think 
  it's likely to work.
- promptly close any Java object (such as FileOutputStream) consuming a file descriptor.
- switch to a 64-bit JVM on Solaris (not possible when interfacing to legacy 
  32-bit libraries)
- It appears that in JDK6+, we call enable_extended_FILE_stdio() to avoid this problem
  on Solaris versions where this is available.  In practice, this means customers
  should be able to avoid this issue by upgrading to 
  JDK6 running on Solaris 11 or Solaris 10u4.
                                     
2007-06-19
EVALUATION

There are actually two separate bugs being addressed here.

Failure mode 1 - user has native libraries that are part of the JVM process
that call fopen, and fail because the JVM has used up all file descriptors less than 256.

Failure mode 2 - user has native code that forks and execs, but fails to properly
close all file descriptors not needed by the child process.
(Understandable because doing this reliably is quite difficult;
the JDK engineers needed several tries to get it right)
The child process then calls fopen, again failing because 
all file descriptors less than 256 are used.
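
As a rough illustration of what failure mode 2 asks of native code, the sketch below closes every descriptor at or above a threshold, for use in the child between fork and exec. It is not the JDK's implementation: `close_fds_from` is a hypothetical helper, it assumes /proc/self/fd is available, and opendir is not async-signal-safe, which is one reason this is hard to do reliably in a multithreaded process.

```c
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: close every open descriptor >= lowfd by walking
 * /proc/self/fd.  Intended for the child, after fork and before exec.
 * Returns 0 on success, -1 if the directory cannot be opened.
 */
int close_fds_from(int lowfd)
{
    DIR *dir;
    struct dirent *entry;
    int dir_fd;

    assert(lowfd >= 0);
    dir = opendir("/proc/self/fd");
    if (dir == NULL)
        return -1;
    dir_fd = dirfd(dir);    /* must stay open until the scan is done */

    while ((entry = readdir(dir)) != NULL) {
        char *end;
        long fd = strtol(entry->d_name, &end, 10);

        if (end == entry->d_name || *end != '\0')
            continue;       /* skip "." and ".." */
        if (fd >= lowfd && fd != dir_fd)
            close((int) fd);
    }
    closedir(dir);
    return 0;
}
```

A child would typically call close_fds_from(3) immediately before exec, so that only stdin, stdout, and stderr are inherited.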

Here is the fix to be integrated into jdk7, which contains an explanation:


    /*
     * 32-bit Solaris systems suffer from:
     *
     * - an historical default soft limit of 256 per-process file
     *   descriptors that is too low for many Java programs.
     *
     * - a design flaw where file descriptors created using stdio
     *   fopen must be less than 256, _even_ when the first limit above
     *   has been raised.  This can cause calls to fopen (but not calls to
     *   open, for example) to fail mysteriously, perhaps in 3rd party
     *   native code (although the JDK itself uses fopen).  One can hardly
     *   criticize them for using this most standard of all functions.
     *
     * We attempt to make everything work anyways by:
     *
     * - raising the soft limit on per-process file descriptors beyond
     *   256 (done by hotspot)
     *
     * - As of Solaris 10u4, we can request that Solaris raise the 256
     *   stdio fopen limit by calling function enable_extended_FILE_stdio,
     *   (also done by hotspot).  We check for its availability.
     *
     * - If we are stuck on an old (pre 10u4) Solaris system, we can
     *   workaround the bug by remapping non-stdio file descriptors below
     *   256 to ones beyond 256, which is done below.
     *
     * See:
     * 1085341: 32-bit stdio routines should support file descriptors >255
     * 6533291: Work around 32-bit Solaris stdio limit of 256 open files
     * 6431278: Netbeans crash on 32 bit Solaris: need to call
     *          enable_extended_FILE_stdio() in VM initialisation
     * Giri Mandalika's blog
     * http://technopark02.blogspot.com/2005_05_01_archive.html
     */
#if defined(__solaris__) && defined(_ILP32)
    {
	static int needToWorkAroundBug1085341 = -1;
	if (needToWorkAroundBug1085341) {
	    if (needToWorkAroundBug1085341 == -1)
		needToWorkAroundBug1085341 =
		    (dlsym(RTLD_DEFAULT, "enable_extended_FILE_stdio") == NULL);
	    if (needToWorkAroundBug1085341 && fd < 256) {
		int newfd = fcntl(fd, F_DUPFD, 256);
		if (newfd != -1) {
		    close(fd);
		    fd = newfd;
		}
	    }
	}
    }
#endif /* 32-bit Solaris */

    /*
     * All file descriptors that are opened in the JVM and not
     * specifically destined for a subprocess should have the
     * close-on-exec flag set.  If we don't set it, then careless 3rd
     * party native code might fork and exec without closing all
     * appropriate file descriptors (e.g. as we do in closeDescriptors in
     * UNIXProcess.c), and this in turn might:
     *
     * - cause end-of-file to fail to be detected on some file
     *   descriptors, resulting in mysterious hangs, or
     *
     * - might cause an fopen in the subprocess to fail on a system
     *   suffering from bug 1085341.
     *
     * (Yes, the default setting of the close-on-exec flag is a Unix
     * design flaw)
     *
     * See:
     * 1085341: 32-bit stdio routines should support file descriptors >255
     * 4843136: (process) pipe file descriptor from Runtime.exec not being closed
     * 6339493: (process) Runtime.exec does not close all file descriptors on Solaris 9
     */
#ifdef FD_CLOEXEC
    {
	int flags = fcntl(fd, F_GETFD);
	if (flags != -1)
	    fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
    }
#endif
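
The fragments above come from inside a JDK function and do not compile on their own. For readers who want to experiment, here is a self-contained sketch of three measures the comments describe: raising the soft descriptor limit, moving a descriptor above a threshold with F_DUPFD, and setting the close-on-exec flag. The function names are hypothetical and are not taken from the JDK or hotspot sources.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

/* Raise the soft RLIMIT_NOFILE limit to the hard limit.  0 on success. */
int raise_nofile_soft_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_NOFILE, &rl);
}

/*
 * Move fd to a descriptor >= minfd, freeing the low slot for stdio.
 * Returns the new descriptor, or the original one if it is already
 * >= minfd or could not be moved.
 */
int relocate_fd(int fd, int minfd)
{
    int newfd;

    assert(minfd >= 0);
    if (fd >= minfd)
        return fd;
    newfd = fcntl(fd, F_DUPFD, minfd);
    if (newfd == -1)
        return fd;
    close(fd);
    return newfd;
}

/* Set the close-on-exec flag, preserving the other descriptor flags. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}
```

In the spirit of the fix, native code could call relocate_fd(fd, 256) followed by set_cloexec(fd) on each descriptor it obtains from open, leaving the low descriptors free for fopen.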

Even with this fix, the submitter's code is incorrect.

- They should strive to close all file descriptors not needed by the child process.

- They should try to incorporate a call to enable_extended_FILE_stdio() 
  (where available) in the child process (where they control the source code)

  Here's a code snippet:

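  /* Requires <dlfcn.h>.  The pointer type matches
     int enable_extended_FILE_stdio(int low_fd, int signal_action). */
  typedef int (*enable_extended_FILE_stdio_t)(int, int);

  /* Look the function up at run time; it is absent before Solaris 10u4. */
  enable_extended_FILE_stdio_t enabler = 
    (enable_extended_FILE_stdio_t) dlsym(RTLD_DEFAULT, 
                                         "enable_extended_FILE_stdio");
  if (enabler) {
    enabler(-1, -1);  /* -1, -1 requests the default settings */
  }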
                                     
2007-06-22


