JDK-4725923 : (process) UNIXProcess_forkAndExec hangs (1.4.0_02)
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.lang
  • Affected Version: 1.4.0_02
  • Priority: P4
  • Status: Closed
  • Resolution: Not an Issue
  • OS: solaris_8
  • CPU: sparc
  • Submitted: 2002-08-04
  • Updated: 2006-03-15
  • Resolved: 2006-03-15
Description
When available swap space is low (less than the size of the JVM process), a Runtime.getRuntime().exec(String[]) call hangs. The JVM process was 170 MB in size and about 100 MB of swap was available. See the following excerpt from the pstack output:


-----------------  lwp# 196 / thread# 196  --------------------
 fdf9b710 lwp_cond_wait (fdb960c0, fdb90120, 0)
 fe83f088 cond_wait_common (0, 0, 0, 0, 0, fe852000) + 148
 fe838b24 move_to_safe (fe85292c, 1, 30, c, 1, c) + 104
 fe838dc4 suspend_fork (fe8528b8, 1, ecf00000, fd000000, 0, 0) + 9c
 fe83ad44 _run_prefork (6fa58, fe8535d0, fe83ec60, 6f9d8, 6f9d8, fe8535d0) + 68
 fe83bdc0 fork1    (fe852000, fe853be8, fe852a40, fdfbc578, ecf00000, 0) + 40
 fd2687f8 Java_java_lang_UNIXProcess_forkAndExec (0, f8c520, fd28367c, 0, ffffffff, eceff480) + 564
 f940cc24 ???????? (eceff490, f506b730, 0, f94143e4, 0, eceff398)
 f94058b0 ???????? (eceff51c, 0, 0, f9415cf0, 1c, eceff418)
 f940042c ???????? (eceff5a8, eceff758, a, f5c5fdc8, 10, eceff4b0)
 fd544df8 void JavaCalls::call_helper(JavaValue*,methodHandle*,JavaCallArguments*,Thread*) (eceff750, eceff658, eceff678, e895d8, e895d8, 6000) + 25c
 fd54d558 void jni_invoke(JNIEnv_*,JavaValue*,_jobject*,JNICallType,_jmethodID*,JNI_ArgumentPusher*,Thread*) (0, eceff750, 6a49cc, 2, 6c0838, eceff734) + 390
 fd6d96d8 jni_NewObjectV (e8966c, 6a49c8, 6c0838, eceff818, fd9d6000, 0) + 2d0
 fd25c428 JNU_NewObjectByName (e8966c, fd26ff0c, fd26ff24, eceff910, 0, 0) + b0
 fd261340 Java_java_lang_Runtime_execInternal (e8966c, eceff914, eceff910, 0, 0, 0) + 80

The process does not respond to kill -3, so I do not have a Java stack trace, but the pstack output is attached.

Once the JVM reaches this state, there is no significant CPU activity, and truss shows only calls to lwp_cond_wait and a few calls to lwp_mutex_lock.

Stopping other processes to free up memory (swap) has no effect on the hung process; eventually the only way to terminate it is kill -9.

JRE version - 

java version "1.4.0_02"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0_02-b02)

This occurred with the server VM. (I have not tried the client VM.)

I would expect this condition to result in a java.io.IOException with the message "not enough space".
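
To illustrate the expected behavior, here is a minimal sketch (not taken from the reporter's application) of an exec(String[]) call where a failure to start the child process surfaces as an IOException that the caller can handle. The class name ExecSketch, the helper runCommand, and the /bin/sh command are hypothetical and chosen only for illustration. On the affected system the call never returns, so the catch block is never reached.

```java
import java.io.IOException;

class ExecSketch {
    /**
     * Starts the command and returns its exit code, or -1 if the
     * process could not be started or the wait was interrupted.
     * (Hypothetical helper, for illustration only.)
     */
    static int runCommand(String[] command) {
        try {
            Process process = Runtime.getRuntime().exec(command);
            return process.waitFor();
        } catch (IOException e) {
            // This is where a "not enough space" failure would be
            // reported if fork failed cleanly instead of hanging.
            System.err.println("exec failed: " + e.getMessage());
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return -1;
        }
    }

    public static void main(String[] args) {
        // The command produces no output, so no pipe needs draining.
        int exitCode = runCommand(new String[] {"/bin/sh", "-c", "exit 0"});
        System.out.println("exit code: " + exitCode);
    }
}
```

With this shape, the caller can log the failure or retry later when the child cannot be created, which is the outcome the report asks for in place of a hang.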

For more information, please refer to the comments section.

Comments
EVALUATION This is a symptom of Solaris bug 4455654, "there are many problems with thread suspend/continue", and not a JDK bug, so I am closing this as Not a Defect.
15-03-2006

EVALUATION pstack output is attached. From the pstack output, the hang appears to involve thread #196 and thread #12:

-----------------  lwp# 196 / thread# 196  --------------------
 fdf9b710 lwp_cond_wait (fdb960c0, fdb90120, 0)
 fe83f088 cond_wait_common (0, 0, 0, 0, 0, fe852000) + 148
 fe838b24 move_to_safe (fe85292c, 1, 30, c, 1, c) + 104
 fe838dc4 suspend_fork (fe8528b8, 1, ecf00000, fd000000, 0, 0) + 9c
 fe83ad44 _run_prefork (6fa58, fe8535d0, fe83ec60, 6f9d8, 6f9d8, fe8535d0) + 68
 fe83bdc0 fork1    (fe852000, fe853be8, fe852a40, fdfbc578, ecf00000, 0) + 40
 fd2687f8 Java_java_lang_UNIXProcess_forkAndExec (0, f8c520, fd28367c, 0, ffffffff, eceff480) + 564
 f940cc24 ???????? (eceff490, f506b730, 0, f94143e4, 0, eceff398)
 f94058b0 ???????? (eceff51c, 0, 0, f9415cf0, 1c, eceff418)
 f940042c ???????? (eceff5a8, eceff758, a, f5c5fdc8, 10, eceff4b0)
 fd544df8 void JavaCalls::call_helper(JavaValue*,methodHandle*,JavaCallArguments*,Thread*) (eceff750, eceff658, eceff678, e895d8, e895d8, 6000) + 25c
 fd54d558 void jni_invoke(JNIEnv_*,JavaValue*,_jobject*,JNICallType,_jmethodID*,JNI_ArgumentPusher*,Thread*) (0, eceff750, 6a49cc, 2, 6c0838, eceff734) + 390
 fd6d96d8 jni_NewObjectV (e8966c, 6a49c8, 6c0838, eceff818, fd9d6000, 0) + 2d0
 fd25c428 JNU_NewObjectByName (e8966c, fd26ff0c, fd26ff24, eceff910, 0, 0) + b0
 fd261340 Java_java_lang_Runtime_execInternal (e8966c, eceff914, eceff910, 0, 0, 0) + 80
-----------------  lwp# 12 / thread# 12  --------------------
 fdf9b710 lwp_cond_wait (fdb96c40, fdb91260, 0)
 fe83f088 cond_wait_common (0, 0, 0, 0, 0, fe852000) + 148
 fe838b24 move_to_safe (fe85292c, 0, 310, c4, 1, c4) + 104
 fe838d18 _thrp_suspend (c4, ecf00000, 0, c4, 1, 0) + 188
 fd6349bc long os::pd_suspend_thread(Thread*,int) (e895d8, 1, 0, 0, 0, 0) + 1c
 fd634a0c long SuspendThread_Callback::execute(Thread*) (fcfffb54, e895d8, fd9d6000, fd9ed138, 6e4a70, 0) + c
 fd8986b8 long Thread::suspend_thread_impl(Suspend_Callback&,Thread::SR_RequestTypes) (0, 1, 3, fd9ed138, 6e4a70, 0) + 238
 fd898404 long Thread::suspend_other(Thread::SR_RequestTypes) (e895d8, 3, fd9d6000, fd9ed138, 420614, 0) + b4
 fd59a8d8 void ThreadSafepointState::examine_state_of_thread(int) (1002ca0, 0, ffffffff, fda59e04, fda5077c, fd632ddc) + 1ac
 fd632e34 void SafepointSynchronize::begin() (5800, 5aa4, e895d8, 1002ca0, fd9ed138, 0) + 15c
 fd71b458 void VMThread::loop() (fd9fd30c, fd9ed384, fd9ed380, 0, 0, 0) + 1c0
 fd71a620 void VMThread::run() (2489b8, 0, 0, 0, 0, 0) + 78
 fd687094 _start (2489b8, fd000000, 0, 0, 0, 0) + 20
 fe8405fc _lwp_start (0, 0, 0, 0, 0, 0)

Thread #196 is trying to fork and appears to be attempting to stop all other threads, whereas thread #12, the VM thread, is trying to bring the threads to a safepoint, possibly because a GC is starting. The result is a deadlock. This could be a Solaris OS issue.
###@###.### 2002-08-06

Here is the email from Roger Faulkner:
----
You are a victim of this bug:
4455654 there are many problems with thread suspend/continue
It was fixed in Solaris 8 Update 7 (Solaris 8 02/02). Your pstack output shows that this fix is not present. (_run_prefork() is on the stack; it was renamed in the fixed version). You need the latest libthread patch or install S8U7.
--
###@###.### 2002-08-06

Abhijit has verified that patch 108827-26 has solved his problem. Here is the email from Abhijit:
--
I installed this patch (it is part of latest Solaris 8 recommended patch cluster) and started my tests at around 11 PM last night. They are still running fine. So, my intent is to declare victory by requiring this patch or Solaris 8 Update 7.
--
###@###.### 2002-08-08
08-08-2002
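
Since the resolution is to require patch 108827-26 or later, an application could compare an installed patch ID against that minimum before relying on exec. The following is a minimal sketch under stated assumptions: the class PatchLevel and the method isAtLeast are hypothetical names, the patch ID is assumed to have the form "base-revision" (as in 108827-26), and how the installed ID is obtained from the system is deliberately left to the caller.

```java
final class PatchLevel {
    private PatchLevel() {}

    /**
     * Returns true if 'installed' has the same base number as 'required'
     * and a revision that is equal or higher. Both IDs are assumed to
     * look like "108827-26" (base, hyphen, revision).
     */
    static boolean isAtLeast(String installed, String required) {
        String[] have = installed.trim().split("-");
        String[] need = required.trim().split("-");
        if (have.length != 2 || need.length != 2) {
            throw new IllegalArgumentException("expected an ID of the form base-revision");
        }
        if (!have[0].equals(need[0])) {
            return false; // a different patch altogether
        }
        return Integer.parseInt(have[1]) >= Integer.parseInt(need[1]);
    }

    public static void main(String[] args) {
        // Hypothetical usage: pass the installed patch ID as an argument.
        String installed = args.length > 0 ? args[0] : "108827-26";
        System.out.println(isAtLeast(installed, "108827-26"));
    }
}
```

A check like this only reports whether the stated minimum is met; it does not cover the alternative of running Solaris 8 Update 7, which would need a separate test.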