JDK-6906488 : one pkcs11 operation is done by multiple different threads which causes trouble
  • Type: Bug
  • Component: security-libs
  • Sub-Component: javax.crypto:pkcs11
  • Affected Version: solaris_10u9,6u13,6u15,6u16,6u17
  • Priority: P2
  • Status: Closed
  • Resolution: Duplicate
  • OS: solaris_10
  • CPU: generic,x86,sparc
  • Submitted: 2009-12-02
  • Updated: 2014-04-03
  • Resolved: 2011-05-03
Version table:
  • JDK 6: 6-pool (Resolved)
  • JDK 7: 7 (Resolved)
Related Reports
Duplicate :  
Relates :  
Relates :  
Description
The originally reported problem was a core dump of the Java process inside the PKCS#11 native libraries:

(dbx) where
current thread: t@1685
=>[1] __lwp_kill(0x0, 0x6, 0x0, 0x6, 0xffbffeff, 0x0), at 0xff2cc674
  [2] raise(0x6, 0x0, 0xff334f18, 0xff2abf30, 0xffffffff, 0x6), at 0xff265a74
  [3] abort(0x2cfc8, 0x1, 0xfeb00ab0, 0xeeb60, 0xff3333d8, 0x0), at 0xff24194c
  [4] os::abort(0x1, 0xfedca58c, 0x1, 0xfedb2000, 0x1858c, 0x18400), at 0xfeaf67b4
  [5] VMError::report_and_die(0xfeded4a8, 0x0, 0x1, 0xfed6095b, 0xfed67006, 0xfedf2ce8), at 0xfec088d8
  [6] JVM_handle_solaris_signal(0xb, 0xb17fcc98, 0xb17fc9e0, 0xafc00, 0xa5c800, 0x2f3a5), at 0xfe5b91e8
  [7] __sighndlr(0xb, 0xb17fcc98, 0xb17fc9e0, 0xfe5b8724, 0x0, 0x1), at 0xff2c8a94
  ---- called from signal handler with signal 11 (SIGSEGV) ------
  [8] arcfour_crypt(0x0, 0x12, 0xb17fdea8, 0x12, 0xb17fcea4, 0xb17fdea8), at 0xfbd58404
  [9] soft_arcfour_crypt(0x2cdb50, 0xb17fdea8, 0x12, 0xb17fcea4, 0xb17fdea4, 0x116), at 0xfbd3f288
  [10] C_EncryptUpdate(0x3, 0xb17fdea8, 0x12, 0xb17fcea4, 0xb17fdea4, 0x7), at 0xfbd364f0
  [11] Java_sun_security_pkcs11_wrapper_PKCS11_C_1EncryptUpdate(0xa5c910, 0x12, 0x2cdae8, 0xfbff6a80, 0x0, 0x0), at 0xfe025abc
  [12] 0xfc00d4a0(0x9ff1, 0xb17ff064, 0xb17fefc0, 0xffffff68, 0x4b067f88, 0x0), at 0xfc00d4a0
  [13] 0xfc00d44c(0xbb816510, 0xe6ec09a0, 0x0, 0x34, 0xe6ec09a0, 0xb17ff028), at 0xfc00d44c
  [14] 0xfc372194(0xc14dc2b0, 0xe6ec09a0, 0x5, 0x12, 0xe6ec09a0, 0x5), at 0xfc372194
  [15] 0xfc0059d0(0xc14dc2b0, 0xe6ec09a0, 0xb82f58d0, 0xfc017228, 0xe6ec09a0, 0xb17ff160), at 0xfc0059d0
  [16] 0xfc3abd3c(0xb81f3208, 0xc14dc278, 0xb81f2c50, 0xfc016470, 0xe6ecd850, 0xb17ff1d0), at 0xfc3abd3c
  [17] 0xfc005868(0xc14dba58, 0xb7, 0x0, 0xfc019da0, 0xfc274980, 0xb17ff268), at 0xfc005868
  [18] 0xfc005868(0xc14dba58, 0xb17ff3e8, 0x0, 0xfc017228, 0x1, 0xb17ff300), at 0xfc005868
  [19] 0xfc005868(0xc14dba58, 0x18, 0x0, 0xfc019da0, 0xb79cefb0, 0xb17ff390), at 0xfc005868
  [20] 0xfc005868(0xc14dba58, 0xb7, 0x0, 0xfc019da0, 0x58c00, 0xb17ff420), at 0xfc005868
  [21] 0xfc005868(0xc14dba58, 0xe6ec0868, 0x0, 0xfc017228, 0x2, 0xb17ff4c8), at 0xfc005868
  [22] 0xfc275f58(0xffffffff, 0xc14eed28, 0x0, 0x2000, 0x13, 0xb820ee20), at 0xfc275f58
  [23] 0xfc33d00c(0xc14dbb28, 0xfedec370, 0x3a36c, 0x0, 0x0, 0xfede13a5), at 0xfc33d00c
  [24] 0xfc2e54cc(0xc14dbb28, 0xc3a59658, 0x0, 0x1, 0x4, 0x0), at 0xfc2e54cc
  [25] 0xfc1b1e84(0x0, 0xc3a59658, 0x0, 0x1, 0xf069e9d0, 0x3a400), at 0xfc1b1e84
  [26] 0xfc26c590(0xc3a59658, 0xb8142320, 0xc14dbb48, 0x8, 0xff1e8000, 0x0), at 0xfc26c590
  [27] 0xfc005d88(0xb17fffa0, 0x3a370, 0x0, 0xfc0174a0, 0xe64b8b50, 0xb17ff760), at 0xfc005d88
  [28] 0xfc00021c(0xb17ff84c, 0xb17ffaf8, 0xa, 0xb782ec30, 0xfc00b3c0, 0xb17ff9e0), at 0xfc00021c
  [29] JavaCalls::call_helper(0xfc0001c0, 0xa5c800, 0x1, 0x58c3d8, 0xb782ec30, 0xb17ffaf8), at 0xfe551b94
  [30] JavaCalls::call_virtual(0xb17ffaf0, 0x58c3dc, 0x58c3e8, 0x860400, 0xb17ff9d8, 0xff79f98c), at 0xfe8ec220
  [31] JavaCalls::call_virtual(0xb17ffaf0, 0xb17ffaec, 0xb17ffae8, 0xb17ffae4, 0xb17ffae0, 0x58c3dc), at 0xfe5e5704
  [32] thread_entry(0xb7830cf8, 0xa5c800, 0x4d400, 0xfedffa48, 0xfedff7d4, 0xfedff524), at 0xfe5f8784
  [33] JavaThread::thread_main_inner(0xa5c800, 0x1d6ca8, 0x695, 0xb, 0xfedb2000, 0x0), at 0xfebb5318
  [34] java_start(0xa5c800, 0xb7d, 0xfedb2000, 0xfed00079, 0x5131e8, 0xfedfb3f4), at 0xfeaf5910

At first glance this looks like a problem in the native PKCS#11 libraries, outside Java, but that is only one part of the problem. The libraries are called with an invalid session, and this causes the core dump. Bug 6905996 was filed to address that part: the libraries should detect the bogus argument and avoid the core dump.

The root cause however is different and was captured using this dtrace script:

-----------------------------------------------------------------------------
#!/usr/sbin/dtrace -s

BEGIN {
printf("Target pid: %d\n", $target);
}

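/* Maps each session handle to the id of the thread that called C_EncryptInit on it (0 = no owner). */
long active_session[long];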

pid$target::C_EncryptInit:entry
{
  printf("session = %li (tid=%li)", arg0, (long)tid);
  active_session[arg0] = tid;
}

pid$target::C_EncryptInit:return
{ }

pid$target::C_Encrypt:entry
{
  self->my_session = arg0;
}

pid$target::C_Encrypt:return
{
  active_session[self->my_session] = 0;
  self->my_session = 0;
}

pid$target::C_EncryptUpdate:entry
/ active_session[arg0] != tid /
{
  printf("\nError:\n");
  printf("session = %li (owner = %li / caller = %li)\n", arg0, (long)active_session[arg0], (long)tid);
}

/*
pid$target::C_EncryptUpdate:return
{
}
*/

pid$target::C_EncryptFinal:entry
{
  self->my_session = arg0;
}

pid$target::C_EncryptFinal:return
{
  active_session[self->my_session] = 0;
  self->my_session = 0;
}
-----------------------------------------------------------------------------

Output of a test with an affected application (Tomcat server using JRE 6.0_17-b04):

-----------------------------------------------------------------------------
CPU     ID                    FUNCTION:NAME
 17      1                           :BEGIN Target pid: 27662

  2 111400              C_EncryptInit:entry session = 19233912 (tid=48)
  2 111402             C_EncryptInit:return
  2 111401              C_EncryptInit:entry session = 19233912 (tid=48)
  2 111403             C_EncryptInit:return
  0 111408            C_EncryptUpdate:entry
Error:
session = 6791712 (owner = 22 / caller = 42)

  0 111409            C_EncryptUpdate:entry
Error:
session = 6791712 (owner = 22 / caller = 42)

  0 111408            C_EncryptUpdate:entry
Error:
session = 6791712 (owner = 22 / caller = 33)

  0 111409            C_EncryptUpdate:entry
Error:
session = 6791712 (owner = 22 / caller = 33)

  0 111408            C_EncryptUpdate:entry
Error:
session = 6791712 (owner = 22 / caller = 33)

  0 111409            C_EncryptUpdate:entry
Error:
session = 6791712 (owner = 22 / caller = 33)

  0 111408            C_EncryptUpdate:entry
Error:
session = 6791712 (owner = 0 / caller = 1364)

  0 111409            C_EncryptUpdate:entry
Error:
session = 6791712 (owner = 0 / caller = 1364)

 16 111400              C_EncryptInit:entry session = 6791712 (tid=22)
 16 111402             C_EncryptInit:return
 16 111401              C_EncryptInit:entry session = 6791712 (tid=22)
 16 111403             C_EncryptInit:return
dtrace: pid 27662 has exited
-----------------------------------------------------------------------------
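
The check that produces the "owner = 22 / caller = 42" lines above can be restated in a few lines of Java. This is only an illustrative model of the D script's bookkeeping, not code from the JDK or from the PKCS#11 library; the class and method names (SessionOwnerTracker, onInit, onFinish, isViolation) are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative Java model of the ownership check done by the D script. */
final class SessionOwnerTracker {
    // session handle -> id of the thread that started the operation
    private final Map<Long, Long> owner = new ConcurrentHashMap<>();

    /** Mirrors C_EncryptInit:entry: the calling thread becomes the owner. */
    void onInit(long session, long tid) {
        owner.put(session, tid);
    }

    /** Mirrors C_Encrypt:return and C_EncryptFinal:return: the operation is over. */
    void onFinish(long session) {
        owner.remove(session);
    }

    /** Owner of the session, or 0 if none is recorded (like an unset D array slot). */
    long ownerOf(long session) {
        return owner.getOrDefault(session, 0L);
    }

    /** Mirrors the C_EncryptUpdate:entry predicate: true means an error is reported. */
    boolean isViolation(long session, long callerTid) {
        return ownerOf(session) != callerTid;
    }

    public static void main(String[] args) {
        SessionOwnerTracker t = new SessionOwnerTracker();
        t.onInit(6791712L, 22L);                           // thread 22 starts the operation
        System.out.println(t.isViolation(6791712L, 22L));  // prints false
        System.out.println(t.isViolation(6791712L, 42L));  // prints true
    }
}
```

Replaying the trace with this model gives the same picture as the dtrace output: thread 22 initialises session 6791712, and the updates arriving from threads 42 and 33 are flagged. An update on a session with no recorded owner is flagged as well, which corresponds to the "owner = 0" lines.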

The problem is that one thread starts the crypto operation but another thread continues the encryption. This violates the PKCS#11 standard (e.g. see section 6.7.6 of ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-11/v2-30/pkcs-11v2-30b-d6.pdf), under which a single operation must not be used simultaneously by different threads. Everything between C_EncryptInit() and C_EncryptFinal() is one operation (even if it is done in multiple parts) and must be run on the same thread.

If multiple threads are used in parallel, then each thread must have its own session to be safe.
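
At the application level, the same rule can be sketched with the standard javax.crypto API: give each thread its own Cipher and run init, update and final on that one thread. This is a minimal sketch, not the fix applied in the JDK (the issue reported here is inside the JDK's SSL code, not in application code). It assumes that a separate Cipher instance corresponds to a separate provider-side operation; the class name PerThreadCipher, the demo key and the AES/ECB transformation are hypothetical choices made only to keep the example deterministic.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

final class PerThreadCipher {
    // Hypothetical demo key (16 bytes = AES-128); never hard-code real keys.
    private static final byte[] KEY = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII);

    // Each thread lazily creates its own Cipher, so no operation is shared between threads.
    private static final ThreadLocal<Cipher> CIPHER = ThreadLocal.withInitial(() -> {
        try {
            // ECB only keeps the demo output deterministic; it is not a recommendation.
            return Cipher.getInstance("AES/ECB/PKCS5Padding");
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    });

    /** Runs init, update and final of one multi-part operation on the calling thread. */
    static byte[] encrypt(byte[] part1, byte[] part2) {
        try {
            Cipher c = CIPHER.get();
            c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(KEY, "AES"));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] head = c.update(part1); // null when no full block is ready yet
            if (head != null) {
                out.write(head, 0, head.length);
            }
            byte[] tail = c.doFinal(part2);
            out.write(tail, 0, tail.length);
            return out.toByteArray();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts the same two-part message on several worker threads. */
    static List<byte[]> encryptOnWorkers(int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            Callable<byte[]> task = () -> encrypt(
                    "hello ".getBytes(StandardCharsets.US_ASCII),
                    "world".getBytes(StandardCharsets.US_ASCII));
            List<Future<byte[]>> futures = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                futures.add(pool.submit(task));
            }
            List<byte[]> results = new ArrayList<>();
            for (Future<byte[]> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The point of the ThreadLocal is that a multi-part operation never migrates: the thread that calls init() is also the one that calls update() and doFinal(). Handing a half-finished Cipher to another thread would recreate the pattern shown in the dtrace output.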

In the example crash shown above, the PKCS#11 functions were called with the following argument, which shows the broken session (context is the NULL pointer):

> 0x002cdb50::print -t crypto_active_op_t
{
    CK_MECHANISM mech = {
        CK_MECHANISM_TYPE mechanism = 0x111	(CKM_RC4)
        CK_VOID_PTR pParameter = 0
        CK_ULONG ulParameterLen = 0
    }
    void *context = 0
    uint32_t flags = 0
}

CRs 7025227 and 6932403 may be a big factor in the issues seen here. Improper (early) disposal of the Cipher objects at SSL fatal call times and at closeSocket calls could lead to consequences in the underlying Solaris native library calls. Improvements have been made to the SSLSocketImpl class as part of the 7024697 & 7001094 fixes, and initial (early) testing has shown no exceptions/crashes.
Updated the bug to reflect the correct bug IDs: the root cause is possibly linked to CRs 7025227 and 6932403 (NOT 7024697 and 7001094).

Comments
EVALUATION Further testing by various submitters has shown that the fixes for the following CRs solve the pkcs11 issues being reported:
  • 7025227 SSLSocketImpl does not close the TCP layer socket if a close notify cannot be sent to the peer
  • 6932403 SSLSocketImpl state issue
I'm closing this bug as a duplicate of 7025227 (which also addresses the timing for disposal of Cipher objects).
03-05-2011

WORK AROUND Disable the pkcs11 provider in Java by commenting out the entry in lib/security/java.security that registers the pkcs11 provider as the default provider. The drawback is that hardware crypto accelerators will no longer be used.
02-12-2009
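
Where editing java.security is not practical, a similar effect can probably be had at runtime through java.security.Security. This is a sketch of an alternative and not the workaround described above. It assumes the provider registers under a name starting with "SunPKCS11" (for example "SunPKCS11-Solaris"), and that the code runs before any TLS or crypto use; the class name DisablePkcs11 and its methods are hypothetical.

```java
import java.security.Provider;
import java.security.Security;

final class DisablePkcs11 {
    // Assumed name prefix of the PKCS#11 provider(s); adjust if the installation differs.
    private static final String PREFIX = "SunPKCS11";

    /** Removes every installed provider whose name starts with PREFIX; returns how many. */
    static int removePkcs11Providers() {
        int removed = 0;
        // getProviders() returns a snapshot array, so removing while iterating is safe.
        for (Provider p : Security.getProviders()) {
            if (p.getName().startsWith(PREFIX)) {
                Security.removeProvider(p.getName());
                removed++;
            }
        }
        return removed;
    }

    /** True if a provider with the assumed prefix is still installed. */
    static boolean pkcs11Installed() {
        for (Provider p : Security.getProviders()) {
            if (p.getName().startsWith(PREFIX)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int removed = removePkcs11Providers();
        System.out.println("Removed PKCS#11 providers: " + removed);
        System.out.println("Still installed: " + pkcs11Installed());
    }
}
```

As with the file-based workaround, crypto requests then fall through to the remaining software providers, so hardware acceleration is lost.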