JDK-6468054 : Solaris x86 : JVM Crash in JNDI API
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 1.4.2_12,6
  • Priority: P2
  • Status: Closed
  • Resolution: Duplicate
  • OS: solaris,solaris_10
  • CPU: x86,sparc
  • Submitted: 2006-09-07
  • Updated: 2014-02-27
  • Resolved: 2006-10-04
Description
The oid Oracle identity server C binary creates a JVM through the JNI interface (the binary is linked with libjvm.so). We are seeing JVM crashes on this particular platform in some JNDI APIs. These APIs are part of the JDK distribution.
Can we get help from the Sun Java team to find out what is going wrong in this API? The same Java class works fine on all other platforms (all vendors' JDKs), including Sparc64. We found that method calls such as BasicAttributes.put(null) and BasicAttributes.get(null) crash the JVM.
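
For reference, a minimal sketch of the two calls in isolation. On a healthy JVM both are expected to end in an ordinary NullPointerException rather than a VM abort, which is consistent with the implicit_null_exception stub that appears above BasicAttributes.get in the error log in this report. The class name NullAttributeRepro and the helper throwsNpe are hypothetical, and the lambda syntax needs a far newer JDK than 1.4.2; the sketch only illustrates the expected behaviour and is not the plugin code itself.

```java
import javax.naming.directory.Attribute;
import javax.naming.directory.BasicAttributes;

public class NullAttributeRepro {
    /** Returns true if the call throws NullPointerException, false otherwise. */
    static boolean throwsNpe(Runnable call) {
        try {
            call.run();
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Empty, case-sensitive attribute set (the default constructor).
        BasicAttributes attrs = new BasicAttributes();

        // Null attribute ID: the lookup dereferences the null key.
        System.out.println("get(null) throws NPE: "
                + throwsNpe(() -> attrs.get(null)));

        // Null Attribute: put(Attribute) calls getID() on the argument.
        System.out.println("put(null) throws NPE: "
                + throwsNpe(() -> attrs.put((Attribute) null)));
    }
}
```

If either line prints false, or the process aborts instead of printing, the environment is behaving differently from a stock JDK.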

Here are the Java version, the core stack, and the heap dump:

1. Java version:

$ java -version
java version "1.4.2_12"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_12-b03)
Java HotSpot(TM) Client VM (build 1.4.2_12-b03, mixed mode)

------------------------------------------------------------------------
2. Core stack:

#0  0xfdefe057 in _lwp_kill () from /lib/libc.so.1
#1  0xfdefb80b in thr_kill () from /lib/libc.so.1
#2  0xfdeaae3b in raise () from /lib/libc.so.1
#3  0xfde8e889 in abort () from /lib/libc.so.1
#4  0xfee644f2 in __1cCosFabort6Fi_v_ ()
  from /ade/shvaramb_im1014b/oracle/jdk/jre/lib/i386/libjvm.so
#5  0xfeeb37ae in __1cHVMErrorOreport_and_die6M_v_ ()
  from /ade/shvaramb_im1014b/oracle/jdk/jre/lib/i386/libjvm.so
#6  0xfed8e375 in __1cMreport_fatal6Fpkci1_v_ ()
  from /ade/shvaramb_im1014b/oracle/jdk/jre/lib/i386/libjvm.so
#7  0xfed4f937 in __1cIRuntime1bCreturn_address_for_exception6F_pC_ ()
  from /ade/shvaramb_im1014b/oracle/jdk/jre/lib/i386/libjvm.so
#8  0xf8c6cd16 in ?? ()
#9  0xfd68d40c in ?? ()
#10 0xf4a270a6 in ?? ()
#11 0xfd68d3f0 in ?? ()
#12 0xfd68d3c0 in ?? ()
#13 0x00000000 in ?? ()
...
#80 0xf0ef58d8 in ?? ()
#81 0xf0ef58f8 in ?? ()
#82 0xf0a46fc0 in ?? ()
#83 0xfef28000 in ?? ()
  from /ade/shvaramb_im1014b/oracle/jdk/jre/lib/i386/libjvm.so
#84 0x0b0d5580 in ?? ()
#85 0x00000002 in ?? ()
#86 0xfd68d57c in ?? ()
#87 0xfec7eb67 in __1cJJavaCallsLcall_helper6FpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v_ ()
  from /ade/shvaramb_im1014b/oracle/jdk/jre/lib/i386/libjvm.so
------------------------------------------------------------------------

3. Heap dump:

#
# An unexpected error has been detected by HotSpot Virtual Machine:
#
#  Internal Error (43113F32554E54494D45110E435050030A), pid=3807, tid=8
#
# Java VM: Java HotSpot(TM) Client VM (1.4.2_12-b03 interpreted mode)

---------------  T H R E A D  ---------------

Current thread (0x0b0d5580):  JavaThread "main" [_thread_in_Java, id=8]

Stack: [0xfd596000,0xfd696000),  sp=0xfd68d2a4,  free space=988k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x2b3667]
V  [libjvm.so+0x18e375]
V  [libjvm.so+0x14f937]
v  ~RuntimeStub::implicit_null_exception Runtime1 stub
j  javax.naming.directory.BasicAttributes.get(Ljava/lang/String;)Ljavax/naming/directory/Attribute;+19
j  oracle.ldap.ospf.LdapEntry.getAttribute(Ljava/lang/String;)Ljavax/naming/directory/Attribute;+5
j  whrep_addjn03.when_add_replace(Loracle/ldap/ospf/PluginDetail;)Loracle/ldap/ospf/PluginResult;+226
v  ~StubRoutines::call_stub
V  [libjvm.so+0x7eb67]
V  [libjvm.so+0x7e9a1]
V  [libjvm.so+0x7e96e]
V  [libjvm.so+0x1c0464]
V  [libjvm.so+0x8afec]
C  [oidldapd+0x138c72]  sgslpip_invokeJPlg+0x36e
C  [oidldapd+0x13cea7]  sgslpad_addJPlugin+0x134
C  [oidldapd+0x13086b]  gslpwra_ExecWhenReplAddPlugin+0x208
C  [oidldapd+0x118fbc]  gslsbaAddEntry+0x35e7
C  [oidldapd+0x10c21a]  gslfadADoAdd+0x793
C  [oidldapd+0x62e6f]  gslarswWorker+0xc8c
C  [libc.so.1+0x9d02f]
C  [libc.so.1+0x9d320]


---------------  P R O C E S S  ---------------

Java Threads: ( => current thread )
 0x0b16cd88 JavaThread "CompilerThread0" daemon [_thread_blocked, id=17]
 0x0b16c190 JavaThread "Signal Dispatcher" daemon [_thread_blocked, id=16]
 0x0b168680 JavaThread "Finalizer" daemon [_thread_blocked, id=14]
 0x0b166e88 JavaThread "Reference Handler" daemon [_thread_blocked, id=13]
=>0x0b0d5580 JavaThread "main" [_thread_in_Java, id=8]

Other Threads:
 0x0b1656a0 VMThread [id=12]
 0x0b16e510 WatcherThread [id=18]

VM state:not at safepoint (normal execution)

VM Mutex/Monitor currently owned by a thread: None

Heap
def new generation   total 576K, used 356K [0xf0a00000, 0xf0aa0000, 0xf0ee0000)
 eden space 512K,  57% used [0xf0a00000, 0xf0a49380, 0xf0a80000)
 from space 64K,  99% used [0xf0a80000, 0xf0a8fff8, 0xf0a90000)
 to   space 64K,   0% used [0xf0a90000, 0xf0a90000, 0xf0aa0000)
tenured generation   total 1408K, used 182K [0xf0ee0000, 0xf1040000, 0xf4a00000)
  the space 1408K,  12% used [0xf0ee0000, 0xf0f0d9a8, 0xf0f0da00, 0xf1040000)
compacting perm gen  total 4096K, used 1315K [0xf4a00000, 0xf4e00000, 0xf8a00000)
  the space 4096K,  32% used [0xf4a00000, 0xf4b48ed0, 0xf4b49000, 0xf4e00000)

Dynamic libraries:
0x08050000      oidldapd
0xfec00000      /ade/shvaramb_im1014b/oracle/jdk/jre/lib/i386/libjvm.so
0xfef80000      /ade/shvaramb_im1014b/oracle/jdk/jre/lib/i386/libjava.so
0xfebd0000      /ade/shvaramb_im1014b/oracle/jdk/jre/lib/i386/libverify.so
0xfef70000      /lib/libthread.so.1
0xfe000000      /ade/shvaramb_im1014b/oracle/lib/libclntsh.so.10.1
0xfea80000      /ade/shvaramb_im1014b/oracle/lib/libnnz10.so
0xfdf70000      /lib/libnsl.so.1
0xfea60000      /lib/libsocket.so.1
0xfdf50000      /lib/libgen.so.1
0xfefd0000      /lib/libdl.so.1
0xfdf30000      /lib/libkstat.so.1
0xfde60000      /lib/libc.so.1
0xfddf0000      /lib/libm.so.2
0xfddc0000      /usr/lib/libCrun.so.1
0xfdda0000      /lib/libm.so.1
0xfea40000      /usr/lib/libsched.so.1
0xfdd60000      /lib/libaio.so.1
0xfdd40000      /lib/librt.so.1
0xfdd20000      /lib/libmd5.so.1
0xfd270000      /lib/libmp.so.2
0xfd240000      /lib/libscf.so.1
0xfd210000      /lib/libdoor.so.1
0xfcfe0000      /lib/libuutil.so.1
0xfcfc0000      /project/as10g/qa1/shvaramb/views/view_storage/shvaramb_im1014b/jdk/jre/lib/i386/native_threads/libhpi.so
0xfcf80000      /project/as10g/qa1/shvaramb/views/view_storage/shvaramb_im1014b/jdk/jre/lib/i386/libzip.so
0xfad00000      /usr/lib/locale/en_US.ISO8859-1/en_US.ISO8859-1.so.3

VM Arguments:
jvm_args: -Djava.compiler=NONE -Doraclehome=/ade/shvaramb_im1014b/oracle -Xusealtsigs
java_command: <unknown>
Launcher Type: generic

Environment Variables:
JAVA_HOME=/ade/shvaramb_im1014b/oracle/jdk
...

---------------  S Y S T E M  ---------------

OS:                          Solaris 10 3/05 s10_74L2a X86
          Copyright 2005 Sun Microsystems, Inc.  All Rights Reserved.
                       Use is subject to license terms.
                           Assembled 22 January 2005

uname:SunOS 5.10 Generic_118844-08 i86pc  (T2 libthread)
rlimit: STACK infinity, CORE infinity, NOFILE 65536, AS infinity
load average:0.30 0.29 0.32

CPU:total 2 family 15, cmov, cx8, fxsr, mmx, sse, sse2

Memory: 4k page, physical 4127784k(3277592k free)

vm_info: Java HotSpot(TM) Client VM (1.4.2_12-b03) for solaris-x86, built on May  9 2006 12:47:44 by unknown with Workshop 5.2 compat=5
------------------------------------------------------------------------

Comments
EVALUATION
This bug is caused by 6374692. HotSpot relies on SEGVs for its operation, and when dereferencing a stack pointer we don't get one, leading to data corruption. Looking at the truss dump, the Oracle application programmatically sets ulimit -s (stack size) to "unlimited" when the user's ulimit is below some "magic" threshold, thus triggering bug 6374692. See the truss output below. With the user's ulimit -s bumped to "65536", the application does not increase the limit to RLIM_INFINITY.

12455: getrlimit(RLIMIT_STACK, 0x08047748) = 0
12455: cur = 10485760 max = RLIM_INFINITY
12455: setustack(0xFE322060)
12455: getrlimit(RLIMIT_DATA, 0x08046B30) = 0
12455: cur = RLIM_INFINITY max = RLIM_INFINITY
12455: setrlimit(RLIMIT_DATA, 0x08046B30) = 0
12455: cur = RLIM_INFINITY max = RLIM_INFINITY
12455: getrlimit(RLIMIT_STACK, 0x08046B30) = 0
12455: cur = 10485760 max = RLIM_INFINITY <-------
12455: setrlimit(RLIMIT_STACK, 0x08046B30) = 0
12455: cur = RLIM_INFINITY max = RLIM_INFINITY <-------
04-10-2006
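
To spot the same pattern in other truss captures, a small scan along the following lines may help. It is only a sketch: the class name TrussStackLimitCheck and the method raisesStackToInfinity are hypothetical, and it assumes the truss text looks like the excerpt quoted in this evaluation, with the "cur =" values on the line immediately after the call and single spaces around "=".

```java
import java.util.Arrays;
import java.util.List;

public class TrussStackLimitCheck {
    /**
     * Returns true if a setrlimit(RLIMIT_STACK, ...) line is immediately
     * followed by a line reporting "cur = RLIM_INFINITY".
     * getrlimit lines and other resources (e.g. RLIMIT_DATA) are ignored.
     */
    static boolean raisesStackToInfinity(List<String> lines) {
        boolean afterSetStack = false;
        for (String line : lines) {
            if (afterSetStack && line.contains("cur = RLIM_INFINITY")) {
                return true;
            }
            // Only the line directly after the setrlimit call is inspected.
            afterSetStack = line.contains("setrlimit(RLIMIT_STACK");
        }
        return false;
    }

    public static void main(String[] args) {
        // Lines copied from the truss excerpt in this report.
        List<String> excerpt = Arrays.asList(
            "12455: getrlimit(RLIMIT_STACK, 0x08046B30) = 0",
            "12455: cur = 10485760 max = RLIM_INFINITY",
            "12455: setrlimit(RLIMIT_STACK, 0x08046B30) = 0",
            "12455: cur = RLIM_INFINITY max = RLIM_INFINITY");
        System.out.println(raisesStackToInfinity(excerpt));
    }
}
```

A true result only indicates that the process raised its own soft stack limit to RLIM_INFINITY; whether that actually triggers 6374692 still depends on the JVM build in use.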

EVALUATION
The application is crashing with a SEGV apparently caused by the program counter of the topmost frame being set to a corrupt value. A bogus instruction is executed which dereferences a register containing the value 0x0, which results in the segmentation violation. The faulty program counter is not part of the JVM's generated code, nor is it associated with any of the native code in the application. It seems that the stack was corrupted, most likely by faulty native code, resulting in a crash upon return from the native code.

The stack trace from dbx below the point of the crash is as follows:

---- called from signal handler with signal 11 (SIGSEGV) ------
[11] 0x7ffa71a(0x0, 0x0, 0xf2684580, 0xfce7d294, 0xf680b159, 0xfce7d2c0), at 0x7ffa71a
[12] 0xfa803213(0x0, 0xf26847f0, 0xfce7d2c4, 0xf681dd03, 0xfce7d324, 0xf681f6d0), at 0xfa803213
[13] 0xfa802d37(0x0, 0xf26f0590, 0xf26f0670, 0xf2684cd0, 0xf26f0578, 0xf2687728), at 0xfa802d37
[14] 0xfa80025d(0xfce7d3ac, 0xfce7d6b4, 0xa, 0xf681ddb0, 0xfa80b840, 0xfce7d590, 0x2, 0xb0e1400), at 0xfa80025d
[15] JavaCalls::call_helper(result = 0xfce7d6b0, m = 0xfce7d4bc, args = 0xfce7d588, __the_thread__ = 0xb0e1400), line 369 in "javaCalls.cpp"
[16] os::os_exception_wrapper(f = 0xfe79ca70 = &JavaCalls::call_helper(JavaValue*,methodHandle*,JavaCallArguments*,Thread*), value = 0xfce7d6b0, method = 0xfce7d4bc, args = 0xfce7d588, thread = 0xb0e1400), line 3536 in "os_solaris.cpp"
[17] JavaCalls::call(result = 0xfce7d6b0, method = CLASS, args = 0xfce7d588, __the_thread__ = 0xb0e1400), line 284 in "javaCalls.cpp"
[18] jni_invoke_nonstatic(env = 0xb0e1508, result = 0xfce7d6b0, receiver = 0xb18aba8, call_type = JNI_VIRTUAL, method_id = 0xb18aadc, args = 0xfce7d690, __the_thread__ = 0xb0e1400), line 1054 in "jni.cpp"
[19] jni_CallObjectMethod(env = 0xb0e1508, obj = 0xb18aba8, methodID = 0xb18aadc, ...), line 1311 in "jni.cpp"
[20] sgslpip_invokeJPlg(0xb0e1508, 0xb1a2ba0, 0xab84960, 0x6, 0x6, 0x0, 0xfce7d800, 0xfce7dc54, 0xfce7d7a8, 0xfce7d7ac), at 0x818997b
[21] sgslpad_addJPlugin(0x832eb20, 0x956f8c8, 0xb0894e0, 0xb0c0630, 0xb0c2390, 0xb0c1788, 0x0, 0x0, 0x0, 0x0, 0xab84960, 0x6, 0x0, 0xfce7d800, 0xfce7dc54), at 0x818decf
[22] gslppra_ExecPreAddPlugin(0x832eb20, 0x956f8c8, 0xaa64ac8, 0xb0894e0, 0xfce820a4, 0xfce820ec, 0xfce7dcb0, 0xfce7de54, 0xb0aa98c, 0xfce82074, 0xfce820c4, 0xfce820c0, 0xfce8209c), at 0x8180ecb
[23] gslsbaAddEntry(0x832eb20, 0x956f8c8, 0xaa64ac8, 0xb0894e0, 0xfce8372c), at 0x81668be
[24] gslfadADoAdd(0x832eb20, 0xaa64ac8, 0xb0894e0), at 0x815cc9a
[25] gslarswWorker(0x940dcf0), at 0x80b2fbf
[26] _thr_setup(0xfd291400), at 0xfd70f9ae
[27] _lwp_start(), at 0xfd70fc90

The disassembly of the faulty program counter in frame 11 is as follows:

0x07ffa71a: orb %ah,%cs:0x00000062(%eax)

This is quite clearly a random piece of memory, since it appears to be a nonsense instruction, although it is not clear where the address of this instruction came from. Frames 12 and 13 correspond to interpreted Java methods. We have used the Java HotSpot Serviceability Agent to extract a stack trace from them, which is:

oracle/ldap/ospf/LdapEntry.getAttributes()
pre_addjn17.pre_add()

At this point the most likely hypothesis is that these methods called some native code which corrupted the stack, and that upon return from that native code execution resumed in a random location. This hypothesis is based on the fact that the crash occurs identically with two drastically different versions of the JDK (1.4.2_12 and Java SE 6) while the Java HotSpot compilers are completely disabled, that there is a substantial amount of native code (roughly 26 MB worth) linked in to the Java application, and that the program counter is pointing outside both all of the JVM's generated assembly code regions and outside all loaded DSOs.

Since the crash occurs relatively quickly upon start of the process, the best suggestion we can offer is to add tracing code to the application's native code to see what native methods are executed before the crash occurs. If it is possible to iteratively comment out the bodies of these native methods one by one and return null from them without significantly perturbing the execution of the application, then iterative searching should turn up the native method whose execution yields the crash afterward.

Kenneth Russell and Kumar Srinivasan
29-09-2006