JDK-4486978 : libthread panic: fault in libthread critical section (PID: 3014 LWP 1)
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 1.3.1
  • Priority: P4
  • Status: Closed
  • Resolution: Fixed
  • OS: solaris_8
  • CPU: sparc
  • Submitted: 2001-08-01
  • Updated: 2012-10-08
  • Resolved: 2002-01-15
Version    Build  Status
1.3.1_03   03     Fixed
1.4.0_01   -      Fixed
1.4.1      -      Fixed
Description
customer's code repeatedly calls Runtime.exec(String[], String[], File);
and they get the following error:

signal fault in critical section
signal number: 11, signal code: 1, fault address: 0xfee0bdd4, pc: 0xff36ad00, sp: 0xf0280a78
libthread panic: fault in libthread critical section (PID: 3014 LWP 1)
stacktrace:
        ff36ace4
        ff36ab1c
        fe4e5dd4
        fe6153ec
        fe5972c0
        fe4525a4
        fe4555a0
        7cbf4
        79d98
        fe7c09b4
        fe57f7c4
        fe57f454
        fe5933cc
        fe59a244
        fe451374
        fe44cd7c
        7cbf4
        79d98
        79d98
        79d98
        79d54
        fe7c09b4
        fe57f7c4
        fe58e41c
        fe58e2ac
        fe58e234
        fe58e040
        fe57dea4
        ff37bc08
        fe57de84


testcase is in:
/net/kelvin.uk/export/tray_1/calls/417267/testcase/lib
run 'java TablePanel' then pull slider to the right & enter
a directory in the 'path' textfield. if you don't get the
'libthread panic' then you'll need to exit the program &
start again (happens most times on my solaris 8 machine)

this doesn't happen with Runtime.exec(String[], String[]);
but they need to use the one which takes a directory argument.


src for these classes is under:
/net/kelvin.uk/export/tray_1/calls/417267/testcase/src

Comments
CONVERTED DATA

BugTraq+ Release Management Values

COMMIT TO FIX: 1.3.1_03 1.4.0_01 hopper
FIXED IN: 1.3.1_03 1.4.0_01 hopper
INTEGRATED IN: 1.3.1_03 1.4.0_01 hopper
14-06-2004

EVALUATION

###@###.### 2001-10-15

The problem is happening because there is a malloc call after a fork1() and
before the exec(). This should not be done. Man page for fork1 has this
information

. . .
  fork() Safety
     If a Solaris threads application calls fork1() or a POSIX
     threads application calls fork(), and the child does more
     than simply call exec(), there is a possibility of deadlock
     occurring in the child. The application should use
     pthread_atfork(3THR) to ensure safety with respect to this
     deadlock. A Solaris threads application must explicitly link
     with -lpthread to access pthread_atfork(). Should there be
     any outstanding mutexes throughout the process, the
     application should call pthread_atfork() to wait for and
     acquire those mutexes prior to calling fork() or fork1().
     See "MT-Level of Libraries" on the attributes(5) manual page.
. . .

In this case, it is appearing as a libthread panic because when the child
issues a malloc, the lock owner is being checked in 'libthread'. Since there
is no lock owner in the child, libthread panics.

Fix is to move the malloc() before the fork1() call and do the free() in
the parent again. free() need not be done in the child because it overlays
itself by calling exec().
11-06-2004
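
For reference, the pthread_atfork() approach described in the man page excerpt above can be sketched roughly as follows. This is an illustration only and not part of the fix: app_lock, the three handler names, install_fork_handlers() and child_sees_unlocked_mutex() are all made up for the example, and it uses POSIX fork() where the JDK code calls fork1(). It covers only a mutex the application itself owns; it would not by itself cover locks internal to a library, such as the one taken by malloc.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical application-level lock; stands in for "outstanding mutexes". */
static pthread_mutex_t app_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread before fork(): wait for and acquire the lock. */
static void before_fork(void)
{
    pthread_mutex_lock(&app_lock);
}

/* Runs after fork() in the parent: release the lock again. */
static void after_fork_parent(void)
{
    pthread_mutex_unlock(&app_lock);
}

/* Runs after fork() in the child: release the child's copy of the lock. */
static void after_fork_child(void)
{
    pthread_mutex_unlock(&app_lock);
}

/* Register the handlers once at startup; returns 0 on success. */
int install_fork_handlers(void)
{
    return pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}

/*
 * Fork and report whether the child found app_lock free.
 * Returns 1 if the child could take the lock, 0 if not, -1 on error.
 */
int child_sees_unlocked_mutex(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(pthread_mutex_trylock(&app_lock) == 0 ? 0 : 1);
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

Call install_fork_handlers() once before any thread forks; after that every fork() in the process goes through the three handlers.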

SUGGESTED FIX

File is 'UNIXProcess_md.c.solaris' and it is in
j2se/src/solaris/native/java/lang

sccs diffs -C UNIXProcess_md.c.solaris

------- UNIXProcess_md.c.solaris -------
*** /tmp/sccs.Bea4VF Mon Oct 15 18:16:42 2001
--- UNIXProcess_md.c.solaris Tue Oct 9 18:00:13 2001
***************
*** 201,206 ****
--- 201,207 ----
      int cmdlen, envlen = 0;
      char fullpath[MAXPATHLEN+1];
      int i, j;
+     char *cwd = NULL;

      if (initFieldIDs(env, process, stdin_fd) != 0)
          return -1;
***************
*** 280,285 ****
--- 281,294 ----
          goto cleanup4;
      }

+     /*
+      * Can't do this after the fork1 in child, because child can deadlock
+      * on locks if they are held in parent
+      */
+     if (path != NULL) {
+         cwd = (char *)JNU_GetStringPlatformChars(env, path, NULL);
+     }
+
      resultPid = fork1();
      if (resultPid < 0) {
          JNU_ThrowIOException(env, strerror(errno));
***************
*** 294,304 ****
          /* Child process */
          int i, max_fd;

-         char *cwd = NULL;
-         if (path != NULL) {
-             cwd = (char *)JNU_GetStringPlatformChars(env, path, NULL);
-         }
-
          /* 0 open for reading, 1 open for writing */
          /* (Note: it is possible for fdin[0] == 0 - 4180429) */
          dup2(fdin[0], 0);
--- 303,308 ----
***************
*** 329,334 ****
--- 333,341 ----
      }

      /* parent process */
+
+     if (cwd != NULL)
+         free (cwd);
      (*env)->SetIntField(env, stdin_fd, field_fd, fdin[1]);
      (*env)->SetIntField(env, stdout_fd, field_fd, fdout[0]);
      (*env)->SetIntField(env, stderr_fd, field_fd, fderr[0]);
11-06-2004
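
The pattern behind the suggested fix (allocate before the fork, keep the child down to chdir and exec, free in the parent) can be shown outside the JDK sources with a small standalone sketch. It rests on some assumptions: spawn_in_dir() is a hypothetical helper, not JDK code; it uses POSIX fork() and plain malloc() in place of fork1() and JNU_GetStringPlatformChars(); and it waits for the child, which the real native code does not do at this point.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the JDK sources): run argv[0] with working
 * directory `dir` and return its exit status, or -1 on failure.
 */
int spawn_in_dir(const char *dir, char *const argv[])
{
    char *cwd = NULL;
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);

    /* Allocate BEFORE forking: malloc may need a lock another thread holds. */
    if (dir != NULL) {
        size_t len = strlen(dir) + 1;
        cwd = malloc(len);
        if (cwd == NULL)
            return -1;
        memcpy(cwd, dir, len);
    }

    pid = fork();
    if (pid < 0) {
        free(cwd);
        return -1;
    }

    if (pid == 0) {
        /* Child: no malloc/free here, only chdir, exec and _exit. */
        if (cwd != NULL && chdir(cwd) != 0)
            _exit(127);
        execv(argv[0], argv);
        _exit(127);             /* reached only if execv failed */
    }

    /* Parent: free the buffer; the child's copy goes away at exec. */
    free(cwd);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would pass a NULL-terminated argv whose first element is an absolute path, for example { "/bin/sh", "-c", "exit 7", NULL }, and get the child's exit status back.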