JDK-4812466 : VM crash when stack size set to unlimited on solx86 platform.
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 1.4.2
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: solaris_8
  • CPU: x86
  • Submitted: 2003-02-04
  • Updated: 2003-03-03
  • Resolved: 2003-02-14
Other
1.4.2 b17 Fixed
Description
The VM on the solx86 platform crashes immediately when the stack size is set to unlimited.

A customer reported this problem when submitting a simple Java application
through SunONE GridEngine. By default, SunONE GridEngine sets the stack size to
unlimited on execution hosts, so a customer who simply accepts the default
value (as most customers do) will always see this crash.

To reproduce this bug:
1. Log in to a solx86 machine
2. Set the stack size to unlimited
   $  ulimit -H -s  unlimited ; ulimit -s  unlimited
3. Run java -version

From Andy Schwierskott <###@###.###>
To Grid Engine Support Group <###@###.###>
Subject CPRE Re: Sun Grid Engine 5.3 and java app sigabrt (fwd)

Hi,

just FYI, the reason for the crashed Java app was the "infinity" resource
limit setting of the queue.

I'm wondering how someone can write software which crashes when the stack
size is set to infinity? All kinds of software and any OS version seem to be
vulnerable: Java, simulation apps on Linux, an older version of the IBM C
compiler on AIX...

This is a good example where you really don't know if such a problem is
related to the operating system or not. Probably the customer will tell you
"it runs on XYZ, but crashes on ABC" and you easily might focus your support
efforts on the wrong area;-)

Andy


---------- Forwarded message ----------
Date: Wed, 29 Jan 2003 18:02:58 -0700 (MST)
From: Geoff Shipman <###@###.###>
To: ###@###.###
Subject: Re: Sun Grid Engine 5.3 and java app sigabrt

Andy,


Thanks for the update on the alias; I have corrected that in my mailtool.  I
received word from cu that setting the ulimit values to what the shell had
outside of SGE worked for him.  They were happy and gave the OK to close the
case.

Thanks

}From: Andy Schwierskott <###@###.###>
}X-X-Sender: as114086@sr-ergb01-01
}To: Geoff Shipman <###@###.###>
}cc: ###@###.###
}Subject: Re: Sun Grid Engine 5.3 and java app sigabrt
}MIME-Version: 1.0
}
}Geoff,
}
}(please use the alias "###@###.###" for our internal support alias)
}
}I have a guess which turned out to be true in many similar cases: By default
}the SGE queue config sets all Unix resource limits to "unlimited". Some
}applications seem not to be able to handle an "infinite" stack size.
}
}Try to either configure "standard" limits in the queue config or add
}"ulimit" calls for hard and soft limits in the script before the application
}is called (just use these values you get with "ulimit -a" and "ulimit -a
}-H" in the user shell).
}
}Andy

}
}
}
}
}> Hello all,
}>
}> I have a customer running Sun Grid Engine 5.3 on Solaris 8 X86 systems that
}> has Linux systems that submit jobs to the grid engine.
}>
}> CU has encountered a SIGABRT when issuing a hello world type of java app to
}> the grid engine.  This java app works fine outside of sun grid engine.
}>
}> Is this a known bug or am I missing something please let me know.
}>
}> I am attaching the working and broken trusses from cu as well as the script
}> he uses to submit the job and the java code plus the classes used.   I do not
}> have a grid environment setup so I am unable to duplicate.
}>
}> Please reply to me directly as I am not on this alias.
}>
}> Thanks
}>
}> Here is the Java sample code ...
}>
}> package test;
}> public class Tester {
}>     public static void main(String[] args) {
}>         System.out.println("hello world!");
}>     }
}> }

}>
}> Here is the shell script that submits the job ...
}>
}> #!/usr/bin/bash
}>
}> LD_LIBRARY_PATH=/usr/local/j2re1.4.0_03/lib:/usr/local/j2re1.4.0_03/i386:/metro1/opt/dba/sge/lib:/metro1/opt/dba/sge/lib/solaris86
}> CLASSPATH="/metro1/opt/dba/meerkat/classes"
}> export CLASSPATH
}>
}> set
}> #ulimit -a
}> truss -faeo /tmp/java.out.truss -wall /usr/local/j2re1.4.0_03/bin/java test.Tester
}>
}>
}> Geoff Shipman
}> Technical Support Engineer ( OS )
}> OS Team
}

Comments
CONVERTED DATA
BugTraq+ Release Management Values
COMMIT TO FIX: mantis-beta
FIXED IN: mantis-beta
INTEGRATED IN: mantis-b17 mantis-beta tiger-beta
VERIFIED IN: mantis-beta
14-06-2004

SUGGESTED FIX The fix is to cut down such a silly stack size value returned by Solaris to at most the integral value of the stack base address. ###@###.### 2003-02-04
04-02-2003

EVALUATION On solaris x86, when the stack size was set to unlimited, we triggered a guarantee in get_thread_via_cache_slowly on the primordial thread. Solaris returns maxint (0x7fffffff) as the stack size from thr_stksegment() in this case, which wraps the address space if taken literally. ###@###.### 2003-02-04
04-02-2003