JDK-6850585 : JVM crash due to growing native heap caused by pkcs11 related operations
  • Type: Bug
  • Component: security-libs
  • Sub-Component: javax.crypto:pkcs11
  • Affected Version: 6u14
  • Priority: P2
  • Status: Closed
  • Resolution: Duplicate
  • OS: solaris_10
  • CPU: sparc
  • Submitted: 2009-06-12
  • Updated: 2010-05-10
  • Resolved: 2009-09-10
Related Reports
Duplicate :  
Relates :  
Description
This is IBIS case 71036960

Is this issue same as CR 6812738 and 6750401?

In the following test, 5 data sets were collected on the same system.
set 1 resulted in a JVM crash due to OOM out of swap space, then
server was restarted. For set 2 through set 5, manual GC were forced
in between in attempted to clear up pending finalizers which might
hold up native memory.

Data sets A: /net/cores.central/cores/dir22/71036960/060409
Each data set (set1 through set5) contains core file, output
from mdb's umausers, pmap -r, gclog.  Additional hs_err log
collected for set1.

Observations

1: Monitor c heap space over a period of time at different stages of
execution, this include *both used and free space*

% grep heap */info/pmap*
set1-06042009/info/pmap.txt:00030000 630592K rwx-- [ heap ]
------------- server restarted ----------------
set2-06042009/info/pmap.txt:00030000 232960K rwx-- [ heap ]
set3-06042009/info/pmap.txt:00030000 233792K rwx-- [ heap ]
set4-06042009/info/pmap.txt:00030000 234496K rwx-- [ heap ]
set5-06042009/info/pmap.txt:00030000 236672K rwx-- [ heap ]


2: Montitor "true" c heap usage, ie used space

% cat *mdb | grep "bytes for" | awk '{ sum += $1 } END { print "total=", sum}'

set1 257897328 bytes
------------- server restarted ----------------
set2 57781328 bytes
set3 59953512 bytes
set4 56606376 bytes
set5 60171288 bytes


3: Compare c heap used space vs partition size


used space partition size
set1 ~258M ~630M
------------- server restarted ----------------
set2 ~58m ~230M
set3 ~60M ~230M
set4 ~57M ~230M
set5 ~60M ~230M


4: Monitor top allocators from set1 (where oom out of swap was seen)

From mdb umausers output, the top entries from set1 are mostly from
PKCS11 related operations.
eg

60363072 bytes for 44913 allocations with data size 1344:
libumem.so.1`umem_cache_alloc+0x13c
libumem.so.1`umem_alloc+0x60
libumem.so.1`malloc+0x28
libpkcs11.so.1`get_master_attributes_by_type+0x34
libpkcs11.so.1`get_master_attributes_by_template+0xcc
libpkcs11.so.1`meta_CreateObject+0x398
libpkcs11.so.1`C_CreateObject+0xcc

libj2pkcs11.so`Java_sun_security_pkcs11_wrapper_PKCS11_C_1CreateObject+0x7c
....

Conclusions

1. Set1 showed ~630MB c heap usage, while the others sets (set 2-5)
showed only ~230MB. JVM crashed due to out of swap OOM with set1
only, where the additional ~400M pushed the process size over
the 4G limit.

2. Since set 2-5 had implemented the force full gc hack to clean
up pending finalizers, the result confirmed that clearing up
pending finalizers help in the reduction the c heap growth.


3. Notice even for set1, the real c heap usage when JVM crash was
~258M, but the partition size was up to ~630M. What that means
is that the c heap contains ~370M of free space.
That means C heap has at one point grew all the way to ~630M, 
but later when memory was reclaimed (likely due to finalization)



Data set B: /net/cores.central/cores/dir22/71036960/0521/work/heap.bin
A study of finalizers usage from a previous heap dump shows relatively high number
of sun.security.pkcs11.P11Cipher

Referees by Type:

Class	Count
java.lang.ref.Finalizer 	62599
java.lang.Class 	24114
java.lang.ref.ReferenceQueue$Null 	14529
java.lang.ref.ReferenceQueue 	9585
sun.security.pkcs11.P11Cipher 	2016
java.net.SocksSocketImpl 	1825
weblogic.jms.frontend.FEConnection 	1594
java.util.zip.Inflater 	1371
java.util.zip.ZipFile 	1266
java.util.jar.JarFile 	306
weblogic.utils.classloaders.ClasspathClassFinder$ZipFileWrapper 	225
weblogic.connector.outbound.ConnectionWrapper 	210
java.util.concurrent.ThreadPoolExecutor 	201
weblogic.rmi.internal.LeasedRemoteRef 	150
sun.net.www.protocol.jar.URLJarFile 	83
weblogic.net.http.HttpClient 	64
com.sun.jndi.ldap.LdapCtx 	52
com.sun.jndi.ldap.LdapClient 	52
com.wily.introscope.agent.probe.io.ManagedFileOutputStream 	21
com.rsa.jsafe.JA_DES 	18
kodo.kernel.FinalizingKodoBroker 	18
com.rsa.jsafe.JA_DESKey 	18
weblogic.jms.backend.BackEnd 	15
com.sun.crypto.provider.PBEKey 	12
java.lang.ClassLoader$NativeLibrary 	12
com.rsa.jsafe.JA_3DES_EDE 	6
java.io.FileOutputStream 	6
com.rsa.jsafe.JG_BlockCipher 	6
com.rsa.jsafe.JA_PKCS5Padding 	6
com.rsa.jsafe.JA_CBC 	6
java.io.FileInputStream 	4
java.util.Timer$1 	3
com.octetstring.vde.backend.standard.RAFPool 	3
java.net.PlainDatagramSocketImpl 	2
com.wily.org.apache.log4j.ConsoleAppender 	2
sun.jdbc.odbc.JdbcOdbcDriver 	1
org.apache.log4j.net.JMSAppender 	1
com.octetstring.vde.backend.standard.TransactionProcessor 	1
com.wily.introscope.spec.server.beans.properties.PropertyNotification 	1
com.wily.introscope.agent.AutoNamingRollingFileAppender 	1
weblogic.management.mbeans.custom.Domain 	1
sun.security.pkcs11.wrapper.PKCS11 	1
weblogic.security.utils.Pool 	1
org.apache.log4j.AsyncAppender 	1
com.octetstring.vde.backend.standard.TransactionLog 	1
com.octetstring.vde.util.Logger 	1
netscape.ldap.LDAPConnection 	1