JDK-8206334 : With heavy load of SSL handshakes, the application gets stuck on java.security.Provider.getService(String, String)
  • Type: Bug
  • Component: security-libs
  • Sub-Component: java.security
  • Affected Version: 8,9,10,11
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • OS: linux
  • CPU: x86_64
  • Submitted: 2018-06-28
  • Updated: 2018-09-12
  • Resolved: 2018-07-18
Related Reports
Duplicate :  
Description
ADDITIONAL SYSTEM INFORMATION :
Ubuntu 16.04.3 LTS
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)

A DESCRIPTION OF THE PROBLEM :
Under a heavy load of SSL handshakes made through Apache HttpClient, the application hangs. Here is a link to a JFR snapshot: https://www.dropbox.com/s/cpgx406vkqpvc80/2018-06-25_08-11-23.jfr.zip?dl=0
In that snapshot, the average call duration for the following call chain is 5.5 sec:

java.security.Provider.getService(String, String)
   sun.security.jca.ProviderList$ServiceList.tryGet(int)
      sun.security.jca.ProviderList$ServiceList.access$200(ProviderList$ServiceList, int)
         sun.security.jca.ProviderList$ServiceList$1.hasNext()

Stack Trace	Count	Duration
java.security.Provider.getService(String, String)	34,119	167,513,413,751,896
   sun.security.jca.ProviderList$ServiceList.tryGet(int)	17,642	87,721,088,998,037
      sun.security.jca.ProviderList$ServiceList.access$200(ProviderList$ServiceList, int)	17,642	87,721,088,998,037
         sun.security.jca.ProviderList$ServiceList$1.hasNext()	17,642	87,721,088,998,037
            javax.crypto.KeyGenerator.nextSpi(KeyGeneratorSpi, boolean)	6,009	29,975,333,754,681
               javax.crypto.KeyGenerator.<init>(String)	6,009	29,975,333,754,681
                  javax.crypto.KeyGenerator.getInstance(String)	6,009	29,975,333,754,681
                     sun.security.ssl.JsseJce.getKeyGenerator(String)	6,009	29,975,333,754,681
                           sun.security.ssl.HandshakeMessage$Finished.getFinished(HandshakeHash, int, SecretKey)
                              sun.security.ssl.HandshakeMessage$Finished.<init>(ProtocolVersion, HandshakeHash, int, SecretKey, CipherSuite)	1,582	7,560,092,837,932
                                 sun.security.ssl.ClientHandshaker.sendChangeCipherAndFinish(boolean)	1,061
                                    sun.security.ssl.ClientHandshaker.serverHelloDone(HandshakeMessage$ServerHelloDone)
                                       sun.security.ssl.ClientHandshaker.processMessage(byte, int)
                                          sun.security.ssl.Handshaker.processLoop()
                                             sun.security.ssl.Handshaker.process_record(InputRecord, boolean)
                                                sun.security.ssl.SSLSocketImpl.readRecord(InputRecord, boolean)
                                                   sun.security.ssl.SSLSocketImpl.performInitialHandshake()
                                                      sun.security.ssl.SSLSocketImpl.startHandshake(boolean)
                                                         sun.security.ssl.SSLSocketImpl.startHandshake()
                                                            org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(Socket, String, int, HttpContext)
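
The trace above attributes the time to provider lookups reached through KeyGenerator.getInstance(String). As a rough way to observe how the per-call cost of that lookup changes with the number of concurrent threads, a minimal sketch follows. The class name LookupLatencyProbe, the thread counts, and the "AES" algorithm are illustrative assumptions and are not taken from the report; the sketch measures wall-clock time per call and is not a substitute for a JFR recording.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.crypto.KeyGenerator;

// Hypothetical probe: the class and method names are illustrative only.
final class LookupLatencyProbe {

    /** Returns the mean wall-clock time, in nanoseconds, of one KeyGenerator.getInstance call. */
    static double averageLookupNanos(int threads, int callsPerThread, String algorithm)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Long> task = () -> {
                    long start = System.nanoTime();
                    for (int i = 0; i < callsPerThread; i++) {
                        // Same entry point as in the reported stack trace.
                        KeyGenerator.getInstance(algorithm);
                    }
                    return System.nanoTime() - start;
                };
                results.add(pool.submit(task));
            }
            long totalNanos = 0;
            for (Future<Long> result : results) {
                totalNanos += result.get(); // rethrows any failure from the worker
            }
            return (double) totalNanos / ((long) threads * callsPerThread);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // "AES" is an assumption chosen because it is widely available.
        for (int threads : new int[] {1, 8, 64}) {
            double nanos = averageLookupNanos(threads, 10_000, "AES");
            System.out.printf("%d threads: %.0f ns per lookup%n", threads, nanos);
        }
    }
}
```

If the per-lookup time grows sharply as the thread count rises, that is consistent with contention on the lookup path, although the absolute numbers depend on the JDK build and hardware.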



STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
The problem occurs each time after the application has been running for 3 days.


CUSTOMER SUBMITTED WORKAROUND :
Restart the application.

FREQUENCY : always



Comments
Additional Information from submitter: Refer to the test case below. If this test case is run in a profiler, you will notice that the threads are all red (i.e. blocked), some for several minutes. The code needs to be refactored so that it does not become a bottleneck for high-performance Java applications that rely on common cryptographic operations.

    package com.stimulus.util;

    import org.junit.Test;

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    /**
     * Created by Valentin Popov on 04/07/2018.
     */
    public class Cipher {

        @Test
        public void runs() throws InterruptedException {
            int tasks = 1000;
            ExecutorService executorService = Executors.newFixedThreadPool(tasks);
            for (int i = 0; i < tasks; i++) {
                executorService.submit(new Getter());
            }
            // Stop accepting new tasks, then wait for the submitted ones to finish.
            executorService.shutdown();
            executorService.awaitTermination(1, TimeUnit.HOURS);
        }

        private static class Getter implements Runnable {

            @Override
            public void run() {
                int i = 0;
                do {
                    try {
                        // Each call goes through the provider lookup under test.
                        javax.crypto.Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
                    } catch (Exception e) {
                        // Ignored: only the lookup contention matters here.
                    }
                    i++;
                    if (i > 1000000) break;
                } while (true);
            }
        }
    }
17-07-2018
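
For application code that calls getInstance directly, as the test case above does, one possible mitigation is to reuse one Cipher per thread so that the provider lookup runs once per thread rather than once per operation. The sketch below is an assumption-laden illustration, not a fix proposed in this report: the class name PerThreadCipher is hypothetical, and this approach does not affect lookups made inside the JDK itself, such as those in the SSL handshake path.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;

// Hypothetical helper: the class name and structure are illustrative only.
final class PerThreadCipher {

    // Transformation taken from the submitter's test case.
    static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    // Cipher objects are not thread-safe, so each thread gets its own instance.
    // getInstance runs the first time a given thread asks for its Cipher.
    private static final ThreadLocal<Cipher> CIPHER = ThreadLocal.withInitial(() -> {
        try {
            return Cipher.getInstance(TRANSFORMATION);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Transformation not available", e);
        }
    });

    /** Returns the calling thread's Cipher; the caller must call init(...) before each use. */
    static Cipher get() {
        return CIPHER.get();
    }

    public static void main(String[] args) throws Exception {
        Cipher first = get();
        Cipher second = get();
        System.out.println("Same instance within one thread: " + (first == second));

        Cipher[] fromOtherThread = new Cipher[1];
        Thread worker = new Thread(() -> fromOtherThread[0] = get());
        worker.start();
        worker.join();
        System.out.println("Distinct instance in another thread: " + (fromOtherThread[0] != first));
    }
}
```

The trade-off is that a cached Cipher keeps whatever state its last init call left behind, so every use should start with a fresh init, and long-lived thread pools will hold one instance per worker thread.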

From submitter: Thanks for the response. We had no such problems with previous versions, because our application did not have this load before. Right now it is not possible to move the application to JRE 9 or JRE 10: that would require a lot of changes, and it looks like it would not help, since the source code for 9 and 10 is similar to JRE 8 in this part. We tried switching to https://www.bouncycastle.org, but unfortunately it uses the same call to getProvider() inside the library. I can't provide a test case, as the problem happens only in the production environment after a couple of days of work. Here is a snapshot taken after the application had been running for 2 days: https://www.dropbox.com/s/qvcvxcuyc2mhc19/2018-07-04_09-14-13.jfr?dl=0
17-07-2018

To submitter: Did you start observing this issue with JDK 8u172, or did it also occur with earlier JDK versions? Can you please provide a test case that may help reproduce the issue at our end? Please verify with the latest JDK versions, such as JDK 10.0.1, and let us know if you still experience the same issue.
04-07-2018