The problem occurs when a Java test that stresses the GC receives SIGTERM. After the test process receives SIGTERM from the harness (which uses Process.destroy()), it may take a long time to exit. The problem can also be reproduced manually by running the test and pressing Ctrl-C. In my manual experiments, the process took up to 15 minutes to exit. During nightly testing, however, many test processes linger for even longer. It seems that when multiple java processes are handling SIGTERM at the same time, the delay can be much longer.
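The harness-side path could be approximated with a small stand-alone driver like the one below. This is only a sketch, not the actual harness code: the class name DestroyLatency, the method measureExitMillis, and the warm-up and timeout values are all hypothetical, and it assumes a Java 8 or later runtime for the driver itself (for Process.waitFor with a timeout and destroyForcibly). It starts a child process, calls Process.destroy() on it, and reports how long the child took to exit.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class DestroyLatency {

    /**
     * Starts the command, lets it run for warmupMillis, calls Process.destroy(),
     * and returns the number of milliseconds until the child exited.
     * Returns -1 if the child was still alive after timeoutSeconds.
     */
    static long measureExitMillis(List<String> command, long warmupMillis, long timeoutSeconds) {
        try {
            Process child = new ProcessBuilder(command).inheritIO().start();
            Thread.sleep(warmupMillis);          // give the test time to start stressing the GC

            long start = System.nanoTime();
            child.destroy();                     // the same call the report says the harness uses
            boolean exited = child.waitFor(timeoutSeconds, TimeUnit.SECONDS);
            long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

            if (!exited) {
                child.destroyForcibly();         // clean up so the driver does not leak the child
                return -1;
            }
            return elapsedMillis;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for child", e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("usage: java DestroyLatency <command> [args...]");
            return;
        }
        // Assumed values: 30 s warm-up, 20 min timeout (longer than the 15 min seen manually).
        long millis = measureExitMillis(Arrays.asList(args), 30_000L, 20 * 60L);
        System.out.println(millis < 0
                ? "child still alive after timeout"
                : "child exited " + millis + " ms after destroy()");
    }
}
```

Passing the full java command line from the manual steps as the arguments would time a single run. Starting several copies of the driver in parallel might help check the suspicion that concurrent SIGTERM handling makes the delay longer.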
To reproduce it manually:
ssh to a solaris-x86 machine (I reproduced it on vm-linux-1.sfbay)
/net/gtee.sfbay/export/nightly/mustang/JDK/gc_baseline/jdk1.6/solaris-i586/bin/java -cp /net/gtee.sfbay/export/gtee/suites/testbase_vm.1.6/vm/bin/classes -server -Xmixed -DHANGINGJAVA13719 -XX:-PrintVMOptions -XX:+UseParallelGC -XX:+UseParallelOldGC -Xms120m gc.gctests.FinalizeTest02.FinalizeTest02
(Press Ctrl-C and observe what happens. Repeated Ctrl-C does not seem to help)
I tried to use jstack on the process after hitting Ctrl-C, but it also seems to hang. pstack works; several different native thread dumps are attached to this bug report.
This problem severely affects nightly testing: the harness cannot terminate child test processes in a timely manner, and they fill all available memory on the machine, causing unexpected failures.