JDK-4703547 : Use of JDK1.3.1 with iWS6.0sp2 or Apache causes unstable JVM which can lead to a
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 1.3.0_03, 1.3.1
  • Priority: P1
  • Status: Closed
  • Resolution: Fixed
  • OS: solaris_8
  • CPU: sparc
  • Submitted: 2002-06-18
  • Updated: 2012-10-08
  • Resolved: 2002-08-14
The Version table provides details related to the release in which this issue/RFE will be addressed.

Unresolved: Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed: Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

Other
1.3.1_05 05 Fixed
Related Reports
Duplicate :  
Description
The customer is currently seeing stability problems with iPlanet Web Server 6.0sp2 when using JDK 1.3.1_02 and JDK 1.3.1_03. Their test case and pre-production application both crash the ns-httpd process when using JDK 1.3.1, but not with JDK 1.2.2 or JDK 1.4.0. The customer needs to stay on JDK 1.3.1 because of slower performance with JDK 1.2.2 and problems with socket handling in its CORBA implementation.

We've been able to isolate the problem to a difference in JDK versions that affects the stability of the application running on iWS6.0sp2. Without any application changes or iWS configuration changes, the application runs stably on JDK 1.2.2_08 in the customer's environment, but when they move to JDK 1.3.1_03, the ns-httpd process crashes consistently.

When running Apache/Tomcat with JDK 1.3.1, the customer also sees stability issues, but not with JDK 1.2.2 or JDK 1.4.0. The error messages and crashes are similar.

The customer believes that the problem is primarily with JDK1.3.1's JVM.

Comments
CONVERTED DATA
BugTraq+ Release Management Values
COMMIT TO FIX: 1.3.1_05
FIXED IN: 1.3.1_05
INTEGRATED IN: 1.3.1_05
14-06-2004

WORK AROUND
None to date.
Mon Jun 17 18:58:03 MDT 2002	rn104575
11-06-2004

SUGGESTED FIX

*** os_solaris.cpp.org	Wed Jun 26 10:08:25 2002
--- os_solaris.cpp	Tue Jul  2 10:05:50 2002
***************
*** 2055,2060 ****
--- 2055,2066 ----
        sigAct.sa_flags = SA_SIGINFO | SA_RESTART;
      }
+     if (sig == SIGPIPE) {
+       sigAct.sa_handler = SIG_IGN;
+       sigAct.sa_flags = SA_RESTART;
+     }
+
+
      sigaction(sig, &sigAct, &oldAct);
      void* oldhand2 = oldAct.sa_sigaction
                     ? CAST_FROM_FN_PTR(void*, oldAct.sa_sigaction)

###@###.### 2002-08-07
07-08-2002
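
The following is a minimal standalone sketch of the same pattern the suggested fix applies; it is not the HotSpot code, and the file name and messages are illustrative only. With SIG_IGN installed as the disposition for SIGPIPE, a write to a pipe whose read end has been closed fails with EPIPE instead of the kernel delivering a signal whose default action terminates the process:

/* sigpipe_demo.c -- illustrative only, not part of the JDK sources */
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    struct sigaction act;
    memset(&act, 0, sizeof(act));
    act.sa_handler = SIG_IGN;      /* ask the OS to discard SIGPIPE entirely   */
    act.sa_flags   = SA_RESTART;   /* same flag the suggested fix sets         */
    sigemptyset(&act.sa_mask);
    if (sigaction(SIGPIPE, &act, NULL) != 0) {
        perror("sigaction");
        return 1;
    }

    int fds[2];
    if (pipe(fds) != 0) {
        perror("pipe");
        return 1;
    }
    close(fds[0]);                 /* close the read end: further writes break the pipe */

    if (write(fds[1], "x", 1) == -1) {
        /* With SIG_IGN in place the process survives and write() reports EPIPE. */
        printf("write failed with errno=%d (%s); process still alive\n",
               errno, strerror(errno));
    }
    return 0;
}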

EVALUATION
JPSE team will look into this further. ###@###.### 2002-06-18

Investigated further and it seems the problem has been narrowed down. Basically the JVM was ignoring SIGPIPE (an asynchronous signal), but the way it was ignored was simply to return from the handler instead of actually telling the Solaris OS to ignore the signal, for example by using sigignore(). Every time GC happens the VM tries to bring all threads to a safepoint, and for this it uses SIGILL. The theory is that a thread gets hit with a SIGPIPE and a SIGILL (sometimes a SIGSEGV, i.e. a synchronous signal) at about the same time, and the target thread receiving these signals ends up with a stack trace that doesn't make any sense, for example:

current thread: t@125
  [1] __lwp_sema_wait(0x4, 0x0, 0x0, 0x0, 0x0, 0x2), at 0xfeb1bfb0
  [2] _park(0xf2621e30, 0xfebbe000, 0x0, 0xf2621d70, 0x24d54, 0xfb521d70), at 0xfeb99af4
  [3] _swtch(0xf2621d70, 0xf2621d70, 0xfebbe000, 0x5, 0x1000, 0x0), at 0xfeb997bc
  [4] _mutex_adaptive_lock(0xfebc98ec, 0x4c00, 0x1000, 0xfffeffff, 0x1, 0x4d58), at 0xfeb9b178
  [5] _cmutex_lock(0x23eab8, 0xfebbe000, 0xf2620, 0xfd665680, 0xf262005c, 0x205adc), at 0xfeb9aeb0
  [6] Mutex::lock_without_safepoint_check(0xfd86b268, 0x23ea78, 0x23eab0, 0xfd69
 [17] Exceptions::new_exception(0xf2620710, 0x5b6920, 0xf2620704, 0xf2620708, 0xf2620710, 0xf2620704), at 0xfd533c9c
 [18] Exceptions::_throw_msg(0x5b6920, 0xfd84e7f4, 0x2fa, 0xf2620780, 0x0, 0xfd84e7f4), at 0xfd533354
 [19] Runtime1::throw_null_exception(0xf70016d8, 0xfd86b268, 0x5b6920, 0x0, 0x0, 0x0), at 0xfd7d6920
 [20] 0xf94013d0(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xf94013cf
 [21] 0xf95ef3e4(0x0, 0xfebbeff0, 0x0, 0x0, 0x0, 0x0), at 0xf95ef3e3
=>[22] sigacthandler(0xd, 0xf2621d70, 0x34, 0xf2621e14, 0xf2620c8c, 0xfebbe000), at 0xfeba83e4
 ---- called from signal handler with signal 13 (SIGPIPE) ------
 [23] 0x0(), at 0xffffffffffffffff
dbx: core file read error: address 0x34 not in data space

In the above trace it looks more like a SIGSEGV and a SIGPIPE arriving at the same time; the JVM uses SIGSEGV for null pointer exceptions.
The VM thread stack was something like:

(.../export/sparc/opt/SUNWspro/SC6.1/bin/../WS6U1/bin/sparcv9/dbx) where
current thread: t@16
=>[1] __sigprocmask(0x0, 0xfe108be0, 0x0, 0x0, 0x0, 0x0), at 0xfeba9794
  [2] _resetsig(0xfebabf6c, 0x0, 0x0, 0xfe109d70, 0xfebbe000, 0x0), at 0xfeb9e9a0
  [3] _sigon(0xfe109d70, 0xfebc5938, 0x6, 0xfe108cb4, 0xfe109d70, 0x6), at 0xfeb9e140
  [4] _thrp_kill(0x0, 0x10, 0x6, 0xfebbe000, 0x10, 0xfeb3e448), at 0xfeba1180
  [5] raise(0x6, 0x0, 0x0, 0xffffffff, 0xfeb3e3b4, 0x4), at 0xfeacb758
  [6] abort(0xfeb3a000, 0xfe108e08, 0x0, 0xfffffff8, 0x4, 0xfe108e29), at 0xfeab5a7c
  [7] os::abort(0x1, 0xfd86b26c, 0x1, 0xfd86b26c, 0xfe108e24, 0x1f0c70), at 0xfd67be40
  [8] report_error(0x44, 0xfd880d90, 0xa, 0xfd7faaf8, 0xfd8ba9a8, 0xfd86b26c), at 0xfd52be48
  [9] frame::oops_do(0xfeba83ec, 0xfd69d8dc, 0xfe10987c, 0xfd86b26c, 0xfe10986c, 0xe), at 0xfd541960
 [10] JavaThread::oops_do(0x0, 0xfd69d8dc, 0xf32d8ac0, 0xf71e3d6c, 0xf71e3d6c, 0xf32d6a8c), at 0xfd6f0c70
 [11] Threads::oops_do(0xfd86b26c, 0x760d18, 0xfd69d8dc, 0x20, 0x6bdc34, 0xfd69d8dc), at 0xfd6f41dc
 [12] Scavenge::invoke_at_safepoint(0x0, 0xfd90338c, 0xfd8bdc74, 0xfd86b26c, 0xfd88f0e8, 0xf2b704a4), at 0xfd69efe0
 [13] VM_Operation::evaluate(0xf2b70478, 0xfd8ba9a8, 0xfd880da0, 0xfd86b26c, 0xfe109aec, 0xf2b70478), at 0xfd70ceb4
 [14] VMThread::evaluate_operation(0x1f0c70, 0xfd86b26c, 0xf2b70478, 0x1, 0x3e8, 0xfd82a3b8), at 0xfd70b778
 [15] VMThread::loop(0xfd8984b4, 0xfd8984b8, 0xfd8984b8, 0xfd881250, 0xfd88a7b4, 0xfd881258), at 0xfd70bd44
 [16] VMThread::run(0x1f0c70, 0xfd86b26c, 0x1f0c70, 0x0, 0x0, 0x0), at 0xfd70b4fc
 [17] _start(0xfd86b26c, 0xfebc4748, 0x0, 0x5, 0x1, 0xfe401000), at 0xfd67ace0

Or in some cases the VM thread was:

(.../export/sparc/opt/SUNWspro/SC6.1/bin/../WS6U1/bin/sparcv9/dbx) where
current thread: t@16
=>[1] oopDesc::copy_to_survivor_space(0xfbd21d70, 0x0, 0xfbd21d70, 0xc, 0xfe1095dc, 0x2191a0), at 0xfd672a5c
  [2] Scavenge::scavenge_oop_with_check(0xfbd205fc, 0xfbd205fc, 0xfe10987c, 0x44, 0x0, 0x19), at 0xfd69d750
  [3] OopMapSet::all_do(0xfd757d90, 0xffe0, 0xffe0, 0x1, 0xfd8a1f70, 0xfd69d6cc), at 0xfd758a60
  [4] OopMapSet::oops_do(0xfe10986c, 0xf9404a10, 0xfe10987c, 0xfd69d6cc, 0x0, 0x13365c), at 0xfd757e0c
  [5] frame::oops_do(0xf9404a10, 0xfd69d6cc, 0xfe10987c, 0xfd86b268, 0xfe10986c, 0xe), at 0xfd54184c
  [6] JavaThread::oops_do(0x0, 0xfd69d6cc, 0xf3210568, 0xfd86b268, 0xf32104d0, 0xf3210438), at 0xfd6f0a68
  [7] Threads::oops_do(0xfd86b268, 0xb74710, 0xfd69d6cc, 0x20, 0x468794, 0xfd69d6cc), at 0xfd6f3fcc
  [8] Scavenge::invoke_at_safepoint(0x0, 0xfd90337c, 0xfd8bdc64, 0xfd86b268, 0xfd88f0e8, 0xfcdc04ec), at 0xfd69edd0
  [9] VM_Operation::evaluate(0xfcdc04c0, 0xfd8ba998, 0xfd880da0, 0xfd86b268, 0xfe109aec, 0xfcdc04c0), at 0xfd70cca4
 [10] VMThread::evaluate_operation(0x2191a0, 0xfd86b268, 0xfcdc04c0, 0x0, 0x3e8, 0xfd82a3c0), at 0xfd70b568
 [11] VMThread::loop(0xfd8984b4, 0xfd8984b8, 0xfd8984b8, 0xfd881250, 0xfd88a7b4, 0xfd881258), at 0xfd70bb34
 [12] VMThread::run(0x2191a0, 0xfd86b268, 0x2191a0, 0x0, 0x0, 0x0), at 0xfd70b2ec
 [13] _start(0xfd86b268, 0xfebbf690, 0x1, 0x1, 0xfebbe000, 0x0), at 0xfd67ab70

In all these failure instances the VM thread was in GC scanning oops, and the target thread's stack was as shown above, the result of SIGPIPE and SIGSEGV/SIGILL arriving at the same time. The symptom in all the crashes we have observed is that the target thread's stack doesn't make any sense, meaning the stack is getting corrupted, most likely by the signal mechanism. There could be libthread involvement here.
The fix was to actually ignore SIGPIPE instead of just returning from the handler; that way the OS never delivers the signal. Thanks to Tom and Ken for their help on this issue. ###@###.### 2002-07-02
02-07-2002
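
As a simplified illustration of the distinction the evaluation draws (assumed, illustrative code, not taken from HotSpot): a handler that merely returns still forces the signal machinery to interrupt the target thread and run a handler frame on its stack every time SIGPIPE arrives, whereas SIG_IGN tells the kernel to discard the signal before delivery, so no handler frame is ever pushed:

/* noop_vs_ign.c -- illustrative only */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t deliveries = 0;

/* What the old JVM effectively did: a handler that just returns. */
static void noop_handler(int sig) {
    (void)sig;
    deliveries++;                  /* the thread was still interrupted here */
}

static void install(void (*handler)(int)) {
    struct sigaction act;
    memset(&act, 0, sizeof(act));
    act.sa_handler = handler;
    act.sa_flags = SA_RESTART;
    sigemptyset(&act.sa_mask);
    sigaction(SIGPIPE, &act, NULL);
}

int main(void) {
    install(noop_handler);
    raise(SIGPIPE);
    printf("no-op handler: %d delivery (handler ran on this thread's stack)\n",
           (int)deliveries);

    deliveries = 0;
    install(SIG_IGN);
    raise(SIGPIPE);
    printf("SIG_IGN: %d deliveries (kernel discarded the signal)\n",
           (int)deliveries);
    return 0;
}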