Bug ID: JDK-4656697 Linux: VM hang when Java program exits

Details
Type: Bug
Submit Date: 2002-03-22
Status: Closed
Updated Date: 2002-03-22
Project Name: JDK
Resolved Date: 2002-03-22
Component: hotspot
OS: linux
Sub-Component: runtime
CPU: generic
Priority: P4
Resolution: Won't Fix
Affected Versions: 1.4.1
Fixed Versions:

Related Reports
Relates:

Sub Tasks

Description
This bug is filed to document a known glibc problem.

Sometimes the VM hangs when a Java program has finished executing and tries
to exit, either by calling "System.exit()" or by returning from main().
When that happens, only one thread of the program remains.

This is a bug in how glibc-2.2.x handles program exit. The program hangs
on exit if one of its threads happens to be allocating or deallocating memory
at that moment. LinuxThreads up to 2.2.4 tries to "free" the manager thread
stack after all user threads have been killed. That is unsafe because a user
thread may be killed while it is holding the malloc lock; the lock is then
never released, and the "free" call blocks forever.
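
The hazard described above can be illustrated with a small Java analogy. This is only a sketch, not the glibc code: a ReentrantLock stands in for the malloc lock, a thread that terminates without unlocking stands in for a user thread killed inside malloc(), and the class and method names are hypothetical. A timed tryLock() is used so that the sketch itself does not hang.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Analogy only: ReentrantLock stands in for glibc's malloc lock.
class OrphanedLockDemo {

    // Returns true if the caller could take the lock after its owner died.
    static boolean canAcquireAfterOwnerDied() {
        final ReentrantLock lock = new ReentrantLock();

        // The "user thread": takes the lock and terminates without unlocking,
        // much like a thread that is killed in the middle of malloc().
        Thread owner = new Thread(() -> lock.lock());
        owner.start();

        try {
            owner.join();
            // The "exit path": a plain lock() here would block forever, so
            // the sketch uses a timed tryLock() to observe the problem safely.
            return lock.tryLock(200, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("lock acquired after owner died: "
                + canAcquireAfterOwnerDied());
    }
}
```

The lock stays held after its owner is gone, so the attempt times out and the program reports false. In the real bug the exit path has no timeout, which is why the last thread waits indefinitely.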

Here is a Java testcase:

---------------------------------- ShutdownMallocTest.java -------------
import java.io.*;

public class ShutdownMallocTest extends Thread{

   public native void foo();

   static {
     System.loadLibrary("ShutdownMallocTest");
   }

   public void run() {
     foo();
   }

   public static void main(String args[]) {
     System.out.println("- ShutdownMallocTest -");

     for (int i = 0; i < 4; i++) {
       ShutdownMallocTest smt = new ShutdownMallocTest();
       smt.setDaemon(true);
       smt.start();
     }
   }
}
---------------------------------- ShutdownMallocTest.c -----------------
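#include <jni.h>
#include <stdlib.h>   /* malloc */

/* Allocate in a tight loop so that the thread is likely to be holding
   the malloc lock when the VM exits. */
JNIEXPORT void JNICALL Java_ShutdownMallocTest_foo (JNIEnv * env, jobject obj)
{
   while (1) {
     malloc(1);   /* deliberately leaked */
   }
}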
-------------------------------------------------------------------------
To build the testcase:
  javac ShutdownMallocTest.java
  gcc -g -shared -I${JAVA_HOME}/include -I${JAVA_HOME}/include/linux ShutdownMallocTest.c -o libShutdownMallocTest.so
-------------------------------------------------------------------------

When the VM hangs, "ps -A|grep java" shows only one java thread:

raq:~> ps -A|grep java
17111 pts/2    00:00:00 java

If you attach gdb to that thread, you can see that it is blocked in
__libc_free, waiting for the malloc lock:

(gdb) where
#0  0x40075aa5 in __sigsuspend (set=0xbffff600)
    at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
#1  0x40037079 in __pthread_wait_for_restart_signal (self=0x4003fd60)
    at pthread.c:967
#2  0x40038d39 in __pthread_alt_lock (lock=0x4017ba40, self=0x0)
    at restart.h:34
#3  0x40035c16 in __pthread_mutex_lock (mutex=0x4017ba30) at mutex.c:120
#4  0x400c7be8 in __libc_free (mem=0x8081188) at malloc.c:3152
#5  0x4003732c in pthread_onexit_process (retcode=0, arg=0x0) at pthread.c:796
#6  0x4007842b in exit (status=0) at exit.c:54
#7  0x40063510 in __libc_start_main (main=0x8048c60 <strcpy+200>, argc=2, 
    ubp_av=0xbffff8b4, init=0x80488e8, fini=0x804ba0c <strcpy+11892>, 
    rtld_fini=0x4000dc14 <_dl_fini>, stack_end=0xbffff8ac)
    at ../sysdeps/generic/libc-start.c:129

The bug was introduced in glibc-2.2 and was fixed in 2.2.5.

Most Linux distributions today include some version of glibc-2.2.x, so any of
them could be affected. However, because the hang occurs at the very end of a
program's life cycle, the user can simply kill the last remaining thread when
it happens.
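
To decide whether a given system falls in the affected range (glibc 2.2 up to, but not including, 2.2.5), a check along the following lines could be used. This is a minimal sketch: the class and method names are hypothetical, it assumes a plain dotted numeric version string such as "2.2.4", and how that string is obtained on a particular system is left out.

```java
// Hypothetical helper; not part of the JDK or of glibc.
class GlibcVersionCheck {

    // True for 2.2 through 2.2.4, the range named in this report.
    static boolean isAffected(String version) {
        String[] parts = version.trim().split("\\.");
        int major = part(parts, 0);
        int minor = part(parts, 1);
        int patch = part(parts, 2);
        return major == 2 && minor == 2 && patch < 5;
    }

    // Missing components count as 0, so "2.2" is read as 2.2.0.
    private static int part(String[] parts, int index) {
        return index < parts.length ? Integer.parseInt(parts[index]) : 0;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"2.1.3", "2.2", "2.2.4", "2.2.5"}) {
            System.out.println(v + " affected: " + isAffected(v));
        }
    }
}
```

Version strings with vendor suffixes (for example a distribution release tag) are not handled and would need to be stripped first.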

                                    

Comments
EVALUATION

The problematic code is in pthread_onexit_process() (pthread.c):

    /* Main thread should accumulate times for thread manager and its
       children, so that timings for main thread account for all threads. */
    if (self == __pthread_main_thread)
      {
        waitpid(__pthread_manager_thread.p_pid, NULL, __WCLONE);
(*)     free (__pthread_manager_thread_bos);
        __pthread_manager_thread_bos = __pthread_manager_thread_tos = NULL;
      }

(*) This "free" may hang if the manager thread killed a user thread while
    that thread was holding the malloc lock. Deallocating the manager thread
    stack is unnecessary, because the operating system reclaims everything
    soon after this onexit function returns.

The problem is fixed in glibc-2.2.5 by removing the "free" call. There is
nothing we can do in the VM for this bug.

###@###.### 2002-03-21


