JDK-8309754 : java/security/cert/CertPathValidator/OCSP/OCSPTimeout.java failed with "Expected to pass, found 1 soft fail exceptions"
  • Type: Bug
  • Component: security-libs
  • Sub-Component: java.security
  • Affected Version: 17.0.16-oracle,21,22,23
  • Priority: P3
  • Status: In Progress
  • Resolution: Unresolved
  • OS: windows
  • CPU: x86_64
  • Submitted: 2023-06-09
  • Updated: 2025-04-02
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other
tbdUnresolved
Related Reports
Relates :  
Relates :  
Relates :  
Description
The following test failed in the JDK22 CI:

java/security/cert/CertPathValidator/OCSP/OCSPTimeout.java

Here's a snippet from the log file:

#section:main
----------messages:(7/575)----------
command: main -Dcom.sun.security.ocsp.readtimeout=2750ms OCSPTimeout 2000 true
reason: User specified action: run main/othervm -Dcom.sun.security.ocsp.readtimeout=2750ms OCSPTimeout 2000 true 
started: Fri Jun 09 21:20:32 UTC 2023
Mode: othervm [/othervm specified]
Additional options from @modules: --add-modules java.base --add-exports java.base/sun.security.x509=ALL-UNNAMED --add-exports java.base/sun.security.provider.certpath=ALL-UNNAMED --add-exports java.base/sun.security.util=ALL-UNNAMED
finished: Fri Jun 09 21:20:38 UTC 2023
elapsed time (seconds): 5.427
----------configuration:(6/279)----------
Boot Layer
  add modules: java.base                                
  add exports: java.base/sun.security.provider.certpath ALL-UNNAMED
               java.base/sun.security.util              ALL-UNNAMED
               java.base/sun.security.x509              ALL-UNNAMED

----------System.out:(19/658)----------
Generated Root CA KeyPair
Generated End Entity KeyPair
Root CA Created:
Issuer: CN=Root CA Cert, O=SomeCompany
Subject: CN=Root CA Cert, O=SomeCompany
Serial: 1

[MainThread]: nextUpdate set to Fri Jun 09 22:20:34 UTC 2023
[MainThread]: Response Content-Length field disabled
[pool-1-thread-1]: Listening on 0.0.0.0/0.0.0.0:63089
Root OCSP Responder URI is http://localhost:63089
SSL Certificate Created:
Issuer: CN=Root CA Cert, O=SomeCompany
Subject: CN=SSLCertificate, O=SomeCompany
Serial: 4096

[MainThread]: Added entry for serial 4096(CERT_STATUS_GOOD)
[MainThread]: OCSP latency set to 2000 milliseconds.
Test case expects to pass
----------System.err:(11/700)----------
java.lang.RuntimeException: Expected to pass, found 1 soft fail exceptions
	at OCSPTimeout.main(OCSPTimeout.java:116)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
	at java.base/java.lang.Thread.run(Thread.java:1583)

JavaTest Message: Test threw exception: java.lang.RuntimeException: Expected to pass, found 1 soft fail exceptions
JavaTest Message: shutting down test

STATUS:Failed.`main' threw exception: java.lang.RuntimeException: Expected to pass, found 1 soft fail exceptions
----------rerun:(46/6462)*----------
Comments
Hi [~jnimeh], Please note that the parallelism I employed to reproduce wasn't at the thread but at the process level. Just opened 4 Cygwin Bash terminals at the JDK repository built in slowdebug and pasted something like this: ~~~ JDK_IMG="$(cygpath -am build/windows-x86_64-server-slowdebug/images/jdk)" for _ in `seq 100`; do OUT="$(cygpath -m "$(mktemp -dt test_XXXXXXXXXX)")" jtreg -jdk:$JDK_IMG -w:$OUT/work -r:$OUT/work -timeout:8 \ test/jdk/java/security/cert/CertPathValidator/OCSP/OCSPTimeout.java done ~~~ So just heavy load plus more chances of hitting the issue (400 independent executions of the test).
05-12-2024

[~fferrari] Thanks for the detailed information on the replicated failure. I will try running it inside of multiple threads to see if I can reproduce it, but I will also review the info you've provided. I should probably go back and look at the OCSP server code. Since you were able to make it happen in multiple threads, I wonder if there might be some thread lock contention that contributes to the delay. I would think the hold and releases of the locks should be pretty fast, but couple that with the increase load and who knows? Anyhow, I really appreciate the detailed information.
03-12-2024

Hi [~jnimeh], I've managed to reproduce this issue (or a similar one) with a slowdebug build of the mainline repository, in a Windows 11 VM. The VM has 4 CPU cores, I parallely opened 4 terminals and ran 100 rounds of the OCSPTimeout test in each one. Out of the total 400 runs, 3 failed. I've attached the full OCSPTimeout.jtr. In order to have the full stack trace of the "soft fail exceptions" I applied the following patch: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/test/jdk/java/security/cert/CertPathValidator/OCSP/OCSPTimeout.java b/test/jdk/java/security/cert/CertPathValidator/OCSP/OCSPTimeout.java index dfd7dcdc81e..5a8d3faab07 100644 --- a/test/jdk/java/security/cert/CertPathValidator/OCSP/OCSPTimeout.java +++ b/test/jdk/java/security/cert/CertPathValidator/OCSP/OCSPTimeout.java @@ -114,9 +114,12 @@ public static void main(String[] args) throws Exception { if (expected) { if (!softExc.isEmpty()) { log("Expected to pass, found " + softExc.size() + - " soft fail exceptions"); + " soft fail exceptions:"); for (CertPathValidatorException cpve : softExc) { - log("Exception: " + cpve); + try (StringWriter sw = new StringWriter()){ + cpve.printStackTrace(new PrintWriter(sw)); + log("----\n%s", sw.toString()); + } } throw new RuntimeException("Expected to pass, found " + softExc.size() + " soft fail exceptions"); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ And here it is: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ---- java.security.cert.CertPathValidatorException: Unable to determine revocation status due to network error at java.base/sun.security.provider.certpath.RevocationChecker.isSoftFailException(RevocationChecker.java:436) at java.base/sun.security.provider.certpath.RevocationChecker.check(RevocationChecker.java:381) at java.base/sun.security.provider.certpath.RevocationChecker.check(RevocationChecker.java:343) at java.base/sun.security.provider.certpath.PKIXMasterCertPathValidator.validate(PKIXMasterCertPathValidator.java:125) at java.base/sun.security.provider.certpath.PKIXCertPathValidator.validate(PKIXCertPathValidator.java:224) at java.base/sun.security.provider.certpath.PKIXCertPathValidator.validate(PKIXCertPathValidator.java:144) at java.base/sun.security.provider.certpath.PKIXCertPathValidator.engineValidate(PKIXCertPathValidator.java:83) at java.base/java.security.cert.CertPathValidator.validate(CertPathValidator.java:309) at OCSPTimeout.main(OCSPTimeout.java:110) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104) at java.base/java.lang.reflect.Method.invoke(Method.java:565) at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138) at java.base/java.lang.Thread.run(Thread.java:1447) Caused by: java.net.SocketTimeoutException: Read timed out at java.base/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:278) at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:304) at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:346) at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:796) at java.base/java.net.Socket$SocketInputStream.implRead(Socket.java:981) at java.base/java.net.Socket$SocketInputStream.read(Socket.java:968) at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:289) at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:330) at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:388) at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:827) at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:759) at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1391) at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1301) at java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:506) at java.base/sun.security.provider.certpath.OCSP.getOCSPBytes(OCSP.java:248) at java.base/sun.security.provider.certpath.OCSP.check(OCSP.java:168) at java.base/sun.security.provider.certpath.RevocationChecker.checkOCSP(RevocationChecker.java:780) at java.base/sun.security.provider.certpath.RevocationChecker.check(RevocationChecker.java:369) ... 11 more ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Additionally, it seems the OCSP server had already started writing when the OCSP client read timeout occurred: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [pool-2-thread-1][1732348368333]: Status for SN 4096: CERT_STATUS_GOOD [pool-2-thread-1]: Exception: java.net.SocketException: An established connection was aborted by the software in your host machine at java.base/sun.nio.ch.SocketDispatcher.write0(Native Method) at java.base/sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:54) at java.base/sun.nio.ch.NioSocketImpl.tryWrite(NioSocketImpl.java:394) at java.base/sun.nio.ch.NioSocketImpl.implWrite(NioSocketImpl.java:410) at java.base/sun.nio.ch.NioSocketImpl.write(NioSocketImpl.java:440) at java.base/sun.nio.ch.NioSocketImpl$2.write(NioSocketImpl.java:819) at java.base/java.net.Socket$SocketOutputStream.implWrite(Socket.java:1096) at java.base/java.net.Socket$SocketOutputStream.write(Socket.java:1083) at java.base/java.io.OutputStream.write(OutputStream.java:124) at sun.security.testlibrary.SimpleOCSPServer$OcspHandler.sendResponse(SimpleOCSPServer.java:831) at sun.security.testlibrary.SimpleOCSPServer$OcspHandler.run(SimpleOCSPServer.java:793) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:545) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:328) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1095) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:619) at java.base/java.lang.Thread.run(Thread.java:1447) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ It seems like the 5s read timeout (-Dcom.sun.security.ocsp.readtimeout=5) is not enough. If we analyze the times around the message "Test case expects to pass/fail", we can observe the following: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ test@vmwinwork:~$ grep -C1 'Test case expects to' OCSPTimeout.jtr [MainThread][1732348313182]: OCSP latency set to 1000 milliseconds. Test case expects to pass [pool-2-thread-1][1732348315366]: Delaying response for 1000 milliseconds. -- [MainThread][1732348360440]: OCSP latency set to 1000 milliseconds. Test case expects to pass [pool-2-thread-1][1732348367153]: Delaying response for 1000 milliseconds. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The first occurrence is from the first run, with the following parameters: -Djava.security.debug=certpath OCSPTimeout 1000 true https://github.com/openjdk/jdk/blob/jdk-24+24/test/jdk/java/security/cert/CertPathValidator/OCSP/OCSPTimeout.java#L39 The second occurrence is from the second run, with the following parameters: -Djava.security.debug=certpath -Dcom.sun.security.ocsp.readtimeout=5 OCSPTimeout 1000 true https://github.com/openjdk/jdk/blob/jdk-24+24/test/jdk/java/security/cert/CertPathValidator/OCSP/OCSPTimeout.java#L40-L41 In each run: 1. The first timestamp is printed by the main thread, when it sets the OCSP server delay: https://github.com/openjdk/jdk/blob/jdk-24+24/test/jdk/java/security/cert/CertPathValidator/OCSP/OCSPTimeout.java#L95 https://github.com/openjdk/jdk/blob/jdk-24+24/test/jdk/java/security/testlibrary/SimpleOCSPServer.java#L547 2. The "Test case expects to pass" message is printed by the main thread immediately after. 3. The second timestamp is printed by the OCSP server thread, when a new connection is accepted, and it executes the OcspHandler runnable: https://github.com/openjdk/jdk/blob/jdk-24+24/test/jdk/java/security/testlibrary/SimpleOCSPServer.java#L240 https://github.com/openjdk/jdk/blob/jdk-24+24/test/jdk/java/security/testlibrary/SimpleOCSPServer.java#L730 The elapsed time between the two timestamps for the first run is: 1732348315366 - 1732348313182 = 2184 ms = 2.184 s The elapsed time between the two timestamps for the second run is: 1732348367153 - 1732348360440 = 6713 ms = 6.713 s Assuming the elapsed time between the following lines is small enough: https://github.com/openjdk/jdk/blob/jdk-24+24/test/jdk/java/security/cert/CertPathValidator/OCSP/OCSPTimeout.java#L95-L110 We can assume the connection to the OCSP server was fired less than 1 second after the first timestamp. During the second run, for some reason, the server received or handled the connection more than 5 seconds after it was started by the client. This caused the read timeout on the client side. I would associate this with a temporary networking slowdown while Windows is virtualized and under heavy load.
23-11-2024

Stopping work on this pending the logging enhancements in JDK-8337826 and a recurrence of this in some environment somewhere with (hopefully) some additional clues as to the issue.
05-08-2024

Here's a log file snippet from the jdk-23+11-742-tier2 sighting: java/security/cert/CertPathValidator/OCSP/OCSPTimeout.java #section:main ----------messages:(7/566)---------- command: main -Dcom.sun.security.ocsp.readtimeout=5 OCSPTimeout 1000 true reason: User specified action: run main/othervm -Dcom.sun.security.ocsp.readtimeout=5 OCSPTimeout 1000 true started: Fri Feb 16 09:24:07 UTC 2024 Mode: othervm [/othervm specified] Additional options from @modules: --add-modules java.base --add-exports java.base/sun.security.x509=ALL-UNNAMED --add-exports java.base/sun.security.provider.certpath=ALL-UNNAMED --add-exports java.base/sun.security.util=ALL-UNNAMED finished: Fri Feb 16 09:24:36 UTC 2024 elapsed time (seconds): 28.829 ----------configuration:(6/279)---------- <snip> ----------System.out:(24/919)---------- Generated Root CA KeyPair Generated End Entity KeyPair Root CA Created: Issuer: CN=Root CA Cert, O=SomeCompany Subject: CN=Root CA Cert, O=SomeCompany Serial: 1 [MainThread]: nextUpdate set to Fri Feb 16 10:24:09 UTC 2024 [MainThread]: Response Content-Length field disabled [pool-1-thread-1]: Listening on 0.0.0.0/0.0.0.0:50947 Root OCSP Responder URI is http://localhost:50947 SSL Certificate Created: Issuer: CN=Root CA Cert, O=SomeCompany Subject: CN=SSLCertificate, O=SomeCompany Serial: 4096 [MainThread]: Added entry for serial 4096(CERT_STATUS_GOOD) [MainThread]: OCSP latency set to 1000 milliseconds. Test case expects to pass [pool-2-thread-1]: Received incoming HTTP GET from /127.0.0.1:50949 [pool-2-thread-1]: ReqHdr: User-Agent: Java/23-ea [pool-2-thread-1]: ReqHdr: Host: localhost:50947 [pool-2-thread-1]: ReqHdr: Accept: */* [pool-2-thread-1]: ReqHdr: Connection: keep-alive ----------System.err:(11/700)---------- java.lang.RuntimeException: Expected to pass, found 1 soft fail exceptions at OCSPTimeout.main(OCSPTimeout.java:116) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) at java.base/java.lang.reflect.Method.invoke(Method.java:580) at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138) at java.base/java.lang.Thread.run(Thread.java:1575) JavaTest Message: Test threw exception: java.lang.RuntimeException: Expected to pass, found 1 soft fail exceptions JavaTest Message: shutting down test STATUS:Failed.`main' threw exception: java.lang.RuntimeException: Expected to pass, found 1 soft fail exceptions ----------rerun:(47/6592)*---------- This bug was previous closed as a duplicate of: JDK-8309740 Expand timeout windows for tests in JDK-8179502 However, that fix was integrated in jdk-21+29 and this failure matches the original failure mode for this bug. So I have reopened this bug and added the new sighting.
16-02-2024

Duplicate of JDK-8309740
12-06-2023

I will close this as a duplicate of JDK-8309740, which covers this test along with two similar tests that were introduced as part of JDK-8179502. The SAP folks were seeing all three intermittently fail. I had hoped the time windows would be long enough but apparently not in certain cases. I have a proposed patch which the SAP folks are trying out that extends the time windows so they are all at least 1.5 seconds to allow the test endpoints to reach their proper states.
12-06-2023