JDK-4466587 : JVM causes segmentation fault on Mandrake 8.0, SuSE 7.2
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 1.3.1,1.4.0
  • Priority: P4
  • Status: Closed
  • Resolution: Fixed
  • OS: linux
  • CPU: x86
  • Submitted: 2001-06-06
  • Updated: 2012-10-08
  • Resolved: 2001-07-10
Other
1.4.0 beta2 : Fixed
Related Reports
Duplicate :  
Duplicate :  
Relates :  
Relates :  
Description

Name: bsC130419			Date: 06/05/2001


java version "1.4.0-beta"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-beta-b65)
Java HotSpot(TM) Client VM (build 1.4.0-beta-b65, mixed mode)

This is a resubmission of bug #124860.

All the following tests were done on two linux boxes:

1) Red Hat Linux 7.0 (Guinness) using kernel 2.2.16-22 running on a P-III 733MHz.
 GNOME 1.2 is the desktop environment.  This was installed about four months ago
and has been very stable since.

2) Linux Mandrake 8.0 (Traktopel) using kernel 2.4.3-20mdk #1 running on an
Athlon-C 1333MHz. GNOME 1.4 is the desktop environment.  This install is
relatively fresh (the only non-standard packages loaded are J2SDK1.4.0, pine, and
Folding@Home).

I have not tried to replicate the bug on any other distribution or system.  The
bug does not appear with JDK 1.2.2 or the IBM JDK 1.3.0; I do not know about
other JDKs since these were the only two on hand at the time.

The following code snippet incorrectly produces a segmentation fault instead of
a StackOverflowError:

public class SegFaultTest
{

    public static void main(String[] args)
    {
        new SegFaultTest();
    }

    public SegFaultTest()
    {
        new SegFaultTest();
    }

}

It was compiled and run with:

javac SegFaultTest.java
java SegFaultTest

And produced:
Segmentation fault
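
For comparison, here is a minimal sketch of the behaviour expected from a
correctly working VM: unbounded recursion ends in a catchable
StackOverflowError rather than a native crash.  The class and method names
below are hypothetical and are not part of the original submission.

```java
// Hypothetical sketch; names are illustrative and not from the bug report.
class StackOverflowExpected
{

    // Number of frames entered before the VM reported the overflow.
    private static int depth = 0;

    // Recurse with no base case, counting each frame entered.
    private static void recurse()
    {
        depth++;
        recurse();
    }

    // Returns the depth reached when StackOverflowError was raised.
    static int depthAtOverflow()
    {
        depth = 0;
        try
        {
            recurse();
        }
        catch (StackOverflowError e)
        {
            // A correctly behaving VM lands here instead of segfaulting.
        }
        return depth;
    }

    public static void main(String[] args)
    {
        System.out.println("StackOverflowError caught at depth "
            + depthAtOverflow());
    }

}
```

The depth printed depends on the VM and on the stack size in effect, so no
particular value should be expected.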

As noted previously, this also occurs under other conditions, such as a
NullPointerException.  It seems to occur when there are deeply recursive calls
during construction.  I have managed to construct another code snippet that
demonstrates this situation:

public class SegFaultTestNpe
{

    public static void main(String[] args)
    {
        new SegFaultTestNpe(Integer.parseInt(args[0]));
    }

    public SegFaultTestNpe(int depth)
    {
        if (depth == 0)
        {
            ((String)null).length();
        }
        else
        {
            new SegFaultTestNpe(depth - 1);
        }
    }

}

It was compiled and run with:

javac SegFaultTestNpe.java
java SegFaultTestNpe <depth>

With the value for <depth> between 0 and 423 inclusive, it generates the
expected NullPointerException and stack trace.  With a value of 424 however, a
segmentation fault is generated.

Also, this is not restricted to constructors, as previously thought.
It also occurs with exceptions thrown during normal recursive code, such as:

public class SegFaultTestNpeNc
{

    public static void main(String[] args)
    {
        SegFaultTestNpeNc sf = new SegFaultTestNpeNc();
        sf.recurse(Integer.parseInt(args[0]));
    }

    public SegFaultTestNpeNc()
    {
    }

    public void recurse(int depth)
    {
        if (depth == 0)
        {
            ((String)null).length();
        }
        else
        {
            recurse(depth - 1);
        }
    }

}

I can probably construct further examples based around the same idea.  When you
actually write correct recursive algorithms, they provide the correct result.
The problem only occurs when there is an exception thrown deep in a recursive
call.  The most worrying thing about this bug is that there are situations in
which an exception is legitimately thrown deep in recursion and should be
handled.  Simply replace the line "((String)null).length();" in the above
snippet with "throw new IOException();" (making the corresponding changes to
the method declarations as well) and the same error will occur.
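
The IOException variant described above might look like the following sketch.
The class name SegFaultTestIoe is hypothetical, and catching the exception in
main (rather than declaring it there) is a choice made for this illustration.

```java
import java.io.IOException;

// Hypothetical variant of SegFaultTestNpeNc that throws a checked exception.
class SegFaultTestIoe
{

    public static void main(String[] args)
    {
        SegFaultTestIoe sf = new SegFaultTestIoe();
        try
        {
            sf.recurse(Integer.parseInt(args[0]));
        }
        catch (IOException e)
        {
            // On a correctly working VM the exception propagates up
            // through every recursive frame and is handled here.
            System.out.println("Caught " + e);
        }
    }

    // Recurse to the requested depth, then throw from the deepest frame.
    public void recurse(int depth) throws IOException
    {
        if (depth == 0)
        {
            throw new IOException();
        }
        else
        {
            recurse(depth - 1);
        }
    }

}
```

It is run in the same way as the previous test, with the depth as the only
argument.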

Even if this bug is peculiar to the two systems listed, it is probably worth
tracking down exactly what is causing it, since it is evidently not limited to
a single Linux install on one type of machine with one specific kernel
version.  The only idea that I can offer is that because the
stack trace itself is so long, it may be longer than some internal, undocumented
limit and this is causing an overrun in native code somewhere.
(Review ID: 125179) 
======================================================================

Comments
CONVERTED DATA

BugTraq+ Release Management Values
COMMIT TO FIX: merlin-beta2
FIXED IN: merlin-beta2
INTEGRATED IN: merlin-beta2
14-06-2004

EVALUATION

The problem is not reproducible on Redhat 6.1 or Redhat 7.1. I haven't tried it on Redhat 6.2, but it should be OK. The testcase only crashes on Redhat 7.0. The stack layout on RH 7.0 is slightly different from that on RH 6.x or 7.1. That might be the cause.
hui.huang@Eng 2001-06-12

The thread_self() implementation in glibc 2.2.x cannot handle a thread stack larger than 6M correctly if glibc is not compiled with the flag "--enable-kernel=2.4.0", as is the case for Redhat 7.0, SuSE 7.2 and Debian Linux. Normally pthread will enforce a 2M maximum stack size when it creates a new thread, but the initial thread is created by the Linux kernel and its size is determined by "ulimit -s". The pthread library has no control over the initial thread stack size. If it's larger than 6M (most platforms use 8M or "unlimited" as the default size for the initial thread), glibc/pthread will crash once the current stack size exceeds 6M, or when a signal that requires an alternate signal stack (e.g. SIGSEGV) is sent to the thread. I'm not sure if the problem will be fixed in glibc, since the problematic code is probably obsolete. The fix in the VM is to limit the maximum stack size for the initial thread to 2M.
hui.huang@Eng 2001-06-26

The reason that this problem does not reproduce on Redhat 6.x is that pthread in glibc-2.1.x will call setrlimit() during initialization and effectively limit the initial thread stack size to under 2M. glibc-2.2.x does not call setrlimit(), probably because it can handle a large stack in "floating stack" mode (i686 version). But for "fixed stack" mode (i386 version of glibc-2.2.x), this is a bug. It is especially harmful to the VM, because the VM will put the alternate signal stack at the lower end of the thread stack. If the initial thread stack size is larger than 6M, the pthread library will think the alternate signal stack of the initial thread belongs to a different thread. This will cause the wrong thread pointer to be retrieved from thread local storage and a crash in the pthread library.

Fixed in Merlin beta-refresh by limiting the max stack size for the initial thread to 2M. Changed bug synopsis to reflect the nature of the crash.
hui.huang@Eng 2001-06-27

It is highly recommended to limit the max thread stack size to under 2M. There is probably legacy code in the pthread library or user code that still assumes the old 2M fixed thread stack.
hui.huang@Eng 2001-07-11

Verified on Mandrake 8.0 that this is fixed in 1.4 beta3 and that 1.3.1 did crash on Mandrake 8. Need to update release notes.
###@###.### 2001-09-27
11-07-2001

WORK AROUND Reduce the default stack size before starting the VM. In a bash shell, run "ulimit -s 2048"; in tcsh, use "limit stacksize 2048". hui.huang@Eng 2001-06-26
26-06-2001
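
The recommendation above to keep thread stacks under 2M can also be
illustrated from Java code.  The sketch below runs a recursive call in a
secondary thread created with an explicit stack size, using the
Thread(ThreadGroup, Runnable, String, long) constructor.  This is an
illustration only: the report does not say that this avoids the crash, the
class and method names are hypothetical, and the stack size argument is
only a hint that a VM is free to ignore.

```java
// Hypothetical sketch; not a workaround confirmed by this report.
class BoundedStackRunner
{

    // Plain recursion that returns the number of frames it descended.
    static int recurse(int depth)
    {
        return depth == 0 ? 0 : 1 + recurse(depth - 1);
    }

    // Runs recurse(depth) in a new thread that requests stackBytes of stack.
    static int runWithStack(final int depth, long stackBytes)
        throws InterruptedException
    {
        final int[] result = new int[1];
        Runnable task = new Runnable()
        {
            public void run()
            {
                result[0] = recurse(depth);
            }
        };
        // The last argument is the requested stack size in bytes.
        Thread t = new Thread(null, task, "bounded-stack", stackBytes);
        t.start();
        t.join(); // join() makes the write to result[0] visible here
        return result[0];
    }

    public static void main(String[] args) throws InterruptedException
    {
        // Request a 1M stack, below the 2M limit recommended above.
        System.out.println(runWithStack(400, 1024L * 1024L));
    }

}
```

The recursion depth of 400 and the 1M stack size are arbitrary values chosen
for the example.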