JDK-8331682 : Slow networks/Impatient clients can potentially send unencrypted TLSv1.3 alerts that won't parse on the server
  • Type: Enhancement
  • Component: security-libs
  • Sub-Component: javax.net.ssl
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2024-05-03
  • Updated: 2024-11-20
  • Resolved: 2024-11-04
JDK 24: 24 b23, Fixed
Description
Responding to:
https://mail.openjdk.org/pipermail/security-dev/2024-May/039423.html

This message describes a TLSv1.3 situation where a client gave up waiting for the server to respond with a ServerHello and tried to close the connection instead.  The server then fails to read the client's data because there are not enough bytes to hold the AES/GCM (AEAD) authentication tag, and throws an exception.

Background:

In TLSv1.3, we correctly switch both read/write ciphers after sending the ServerHello/EE/Certs/etc./Finished flight, so the return client flight is expected to be encrypted.  Due to the timeouts, the client doesn't receive the server's keyshare, and thus can't begin encrypted transfers.

The client thus sends a plaintext Alert/close_notify (2 bytes) to indicate the attempted closure.  The server tries to decrypt the bytes provided, but the inbound data doesn't have enough bytes for a GCM authentication tag (16 bytes), hence the exception and failure observed.  A reproducer based on SSLEngineTemplate is attached.
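The size mismatch can be illustrated outside of JSSE with a minimal sketch. It assumes an all-zero AES key and nonce purely for illustration (real TLS derives both from the handshake), and the class and method names are hypothetical. It only shows that a 2-byte input cannot be AES/GCM-decrypted, while the same 2 bytes encrypted first (2 bytes of ciphertext plus a 16-byte tag) can.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class ShortGcmInputDemo {
    static final int TAG_BITS = 128; // 16-byte GCM authentication tag

    // Illustrative only: fixed all-zero key and nonce, never do this in real code.
    private static Cipher newCipher(int mode) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(mode, new SecretKeySpec(new byte[16], "AES"),
                new GCMParameterSpec(TAG_BITS, new byte[12]));
        return cipher;
    }

    /** Encrypts the payload; the result is the ciphertext followed by the tag. */
    static byte[] encrypt(byte[] plaintext) {
        try {
            return newCipher(Cipher.ENCRYPT_MODE).doFinal(plaintext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if AES/GCM decryption of the payload fails. */
    static boolean decryptFails(byte[] payload) {
        try {
            newCipher(Cipher.DECRYPT_MODE).doFinal(payload);
            return false;
        } catch (GeneralSecurityException e) {
            System.out.println(e.getClass().getSimpleName() + ": " + e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] alert = {1, 90}; // level = warning, description = user_canceled
        System.out.println("plaintext alert fails to decrypt: " + decryptFails(alert));
        System.out.println("encrypted alert fails to decrypt: " + decryptFails(encrypt(alert)));
    }
}
```

The exact exception type and message depend on the provider and JDK version, so the sketch catches the general GeneralSecurityException rather than asserting a specific one.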

The reason we don't see this on TLSv1.2 is that we switch ciphers on the server AFTER the client's return flight, so the Alert/close_notify will be successfully parsed if the stall happens in the same place.

One option:  on decrypt failure on the initial client return flight, we could try a single unencrypted parse to see if it's an Alert/close_notify or Alert/user_canceled, and handle accordingly.
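A rough sketch of what such a single unencrypted parse could look like follows. It is not the JDK change itself: the class and method names are hypothetical, and it only checks the TLS record layout (5-byte header, content type 21 for alerts, 2-byte alert body with description 0 for close_notify or 90 for user_canceled).

```java
import java.nio.ByteBuffer;

public class PlaintextAlertProbe {
    static final int CONTENT_TYPE_ALERT = 21;
    static final int ALERT_CLOSE_NOTIFY = 0;
    static final int ALERT_USER_CANCELED = 90;
    static final int HEADER_LEN = 5; // type (1) + version (2) + length (2)
    static final int ALERT_LEN = 2;  // level (1) + description (1)

    /**
     * Returns true if the buffer holds exactly one plaintext alert record
     * whose description is close_notify or user_canceled.
     */
    static boolean isPlaintextClosureAlert(ByteBuffer record) {
        ByteBuffer buf = record.duplicate(); // leave the caller's position untouched
        if (buf.remaining() != HEADER_LEN + ALERT_LEN) {
            return false;
        }
        int contentType = buf.get() & 0xFF;
        buf.getShort();                       // legacy record version, ignored here
        int length = buf.getShort() & 0xFFFF;
        if (contentType != CONTENT_TYPE_ALERT || length != ALERT_LEN) {
            return false;
        }
        buf.get();                            // alert level, ignored here
        int description = buf.get() & 0xFF;
        return description == ALERT_CLOSE_NOTIFY || description == ALERT_USER_CANCELED;
    }

    public static void main(String[] args) {
        byte[] userCanceled = {21, 3, 3, 0, 2, 1, 90};
        System.out.println(isPlaintextClosureAlert(ByteBuffer.wrap(userCanceled)));
    }
}
```

A real implementation would also need to restrict this fallback to the point in the handshake described above, so that plaintext alerts are not accepted once the client is known to hold the keys.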

The original message mentions JDK-8221218, but that bug and its relatives are test issues which trigger the same exception, and are likely unrelated to this issue.
Comments
A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk/pull/22263 Date: 2024-11-20 01:23:10 +0000
20-11-2024

[~mbaesken] 500ms is the client thread socket timeout (which, by the way, is irrelevant anyhow because we are not reading from this socket, only writing to it). The server thread socket timeout is 2000ms and, judging by the errors you shared, you are hitting the server timeout exception: `java.lang.RuntimeException: assertEquals expected: class javax.net.ssl.SSLProtocolException but was: class java.net.SocketTimeoutException`. 2000ms should be plenty of time for the server to read unless you have an overloaded test environment.
19-11-2024

> [~mbaesken] Are you seeing this issue when not running tests concurrently, i.e. with low CPU utilization?
I have now run the test standalone on 2 of our test machines a few times. When running standalone I did not see errors. In the central/concurrent tests, the 500ms at https://github.com/openjdk/jdk/blob/9d60300feea12d353fcd6c806b196ace2df02d05/test/jdk/sun/security/ssl/SSLCipher/SSLSocketNoServerHelloClientShutdown.java#L128 are not always sufficient; see also Arno's comment below with a detailed exception. Is there a good reason for 500ms and not a higher value like 1000ms?
19-11-2024

[~mbaesken] Are you seeing this issue when not running tests concurrently, i.e. with low CPU utilization?
18-11-2024

Still getting lots and lots of failures in the test sun/security/ssl/SSLCipher/SSLSocketNoServerHelloClientShutdown.java across platforms (Linux, macOS, Windows, AIX). Should we problem-list this one? It generates quite a lot of noise in our tests. By the way, is there another issue for this? I wonder why this one is now in status "fixed".
18-11-2024

After looking a bit more at the test failure, I guess it is not the 2-second timeout but the 500ms one that we see:
Port: 63016
=================
---Client Wrap client_hello---
===Server is ready and reading===
java.net.SocketTimeoutException: Read timed out
Result Status : OK
Result HS Status : NEED_UNWRAP
Engine HS Status : NEED_UNWRAP
isInboundDone() : false
isOutboundDone() : false
More Result : Status = OK HandshakeStatus = NEED_UNWRAP bytesConsumed = 0 bytesProduced = 323 sequenceNumber = 0
---Client closeOutbound---
---Client Wrap user_canceled---
Result Status : OK
Result HS Status : NEED_WRAP
Engine HS Status : NEED_WRAP
isInboundDone() : false
isOutboundDone() : false
More Result : Status = OK HandshakeStatus = NEED_WRAP bytesConsumed = 0 bytesProduced = 7 sequenceNumber = 1
---Client Wrap close_notify---
Result Status : CLOSED
Result HS Status : NEED_UNWRAP
Engine HS Status : NEED_UNWRAP
isInboundDone() : false
isOutboundDone() : true
More Result : Status = CLOSED HandshakeStatus = NEED_UNWRAP bytesConsumed = 0 bytesProduced = 7 sequenceNumber = 2
---TLS Buffer Inspection. Bytes Remaining: 337---
Flight 1: contentType: 22; majorVersion: 3; minorVersion: 3; contentLen: 318
Flight 2: contentType: 21; majorVersion: 3; minorVersion: 3; contentLen: 2
Flight 3: contentType: 21; majorVersion: 3; minorVersion: 3; contentLen: 2
---Client sends unencrypted alerts---
java.lang.RuntimeException: assertEquals expected: class javax.net.ssl.SSLProtocolException but was: class java.net.SocketTimeoutException
    at jdk.test.lib.Asserts.fail(Asserts.java:691)
    at jdk.test.lib.Asserts.assertEquals(Asserts.java:204)
    at jdk.test.lib.Asserts.assertEquals(Asserts.java:191)
    at SSLSocketNoServerHelloClientShutdown.runTest(SSLSocketNoServerHelloClientShutdown.java:103)
    at SSLSocketNoServerHelloClientShutdown.main(SSLSocketNoServerHelloClientShutdown.java:63)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
    at java.base/java.lang.reflect.Method.invoke(Method.java:567)
    at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
    at java.base/java.lang.Thread.run(Thread.java:1576)
JavaTest Message: Test threw exception: java.lang.RuntimeException: assertEquals expected: class javax.net.ssl.SSLProtocolException but was: class java.net.SocketTimeoutException
JavaTest Message: shutting down test
We see "---Client sends unencrypted alerts---" as the last log line. If the exception had been caught in "runTest()" we should see the initial exception logged, but we only see the assert.
11-11-2024
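For readers unfamiliar with the "TLS Buffer Inspection" lines in the log above, the following sketch shows how such a listing can be derived from a raw buffer by walking the 5-byte TLS record headers. It is not the test's own code: the class name, the helper names, and the zero-filled sample records are assumptions made for illustration. The sample mirrors the sizes in the log: (5 + 318) + (5 + 2) + (5 + 2) = 337 bytes.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class TlsRecordWalker {
    /** Returns {contentType, majorVersion, minorVersion, contentLen} per complete record. */
    static List<int[]> walk(ByteBuffer data) {
        ByteBuffer buf = data.duplicate(); // do not disturb the caller's position
        List<int[]> records = new ArrayList<>();
        while (buf.remaining() >= 5) {
            int type = buf.get() & 0xFF;
            int major = buf.get() & 0xFF;
            int minor = buf.get() & 0xFF;
            int len = buf.getShort() & 0xFFFF;
            if (len > buf.remaining()) {
                break; // truncated record body, stop here
            }
            buf.position(buf.position() + len); // skip over the record body
            records.add(new int[] {type, major, minor, len});
        }
        return records;
    }

    /** Builds a zero-filled buffer with the same record sizes as in the log. */
    static ByteBuffer sample() {
        int[][] shapes = {{22, 318}, {21, 2}, {21, 2}};
        ByteBuffer buf = ByteBuffer.allocate(337);
        for (int[] shape : shapes) {
            buf.put((byte) shape[0]).put((byte) 3).put((byte) 3).putShort((short) shape[1]);
            buf.position(buf.position() + shape[1]); // body left as zeros
        }
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer data = sample();
        System.out.println("Bytes Remaining: " + data.remaining());
        int n = 1;
        for (int[] r : walk(data)) {
            System.out.println("Flight " + n++ + ": contentType: " + r[0]
                    + "; majorVersion: " + r[1] + "; minorVersion: " + r[2]
                    + "; contentLen: " + r[3]);
        }
    }
}
```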

[~abarashev] SSLSocketSSLEngineCloseInbound did fail sporadically on Windows in our environment, but no more since May of this year; that might have been related to the antivirus solution that we had before. Regarding the 2-second timeout being plenty of time: that is correct for a developer machine running one test at a time. We run the tests with a concurrency related to the number of CPUs. That improves the overall runtime but, depending on the tests, it puts a high load on the machines. I think the jtreg timeout factor would help a lot in cases where machines have different speed and load characteristics.
11-11-2024

[~azeller] Actually I didn't know about the jtreg timeout factor, thanks for the suggestion. Does the "SSLSocketSSLEngineCloseInbound" test also fail for you? It's a similar test where the server timeout is set to only 0.5s. In "SSLSocketNoServerHelloClientShutdown" the client thread waits 2s for the server to read the data; that's why I set the server timeout to 2s as well. It should be plenty of time.
07-11-2024

[~abarashev] Is there a reason for the exact 2 seconds? If not, I would suggest using the jtreg timeout factor and multiplying the timeout value by it. This is done in the test test/jdk/sun/security/ssl/SSLSocketImpl/SSLSocketCloseHang.java. What do you think?
07-11-2024
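The timeout-factor suggestion above could look roughly like the sketch below. It assumes the factor reaches the test as the `test.timeout.factor` system property; the class and method names are hypothetical, and this is not the code of SSLSocketCloseHang.java or of the eventual fix.

```java
public class ScaledTimeout {
    /** Scales a base timeout by a factor given as text; falls back to 1.0 if absent or invalid. */
    static int scale(int baseMillis, String factorText) {
        double factor = 1.0;
        if (factorText != null) {
            try {
                factor = Double.parseDouble(factorText);
            } catch (NumberFormatException e) {
                factor = 1.0; // ignore malformed values
            }
        }
        return (int) Math.round(baseMillis * factor);
    }

    /** Reads the factor from the system property (assumed name: test.timeout.factor). */
    static int scaledTimeout(int baseMillis) {
        return scale(baseMillis, System.getProperty("test.timeout.factor"));
    }

    public static void main(String[] args) {
        // For example, the server socket could then use
        // socket.setSoTimeout(scaledTimeout(2000)) instead of a fixed 2000.
        System.out.println("server timeout (ms): " + scaledTimeout(2000));
    }
}
```

With this approach a heavily loaded CI machine can raise every timeout at once through the factor, while a developer machine keeps the original 2 seconds.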

[~mbaesken] 2 seconds timeout is hard-coded in the test. You should decrease the CPU load. That same test runs fine in our environment for the same architecture.
07-11-2024

We have seen the test failing 4 times so far, so it is not a one-time failure. The CPU load seems to be rather high on the machine. Should we try to increase the 2 seconds you mentioned (and if so, where)?
07-11-2024

[~mbaesken] Looks like 2 seconds wasn't enough time for the client to send the data. Is it always reproducible? How high is the CPU load on the test machine?
06-11-2024

The new test sun/security/ssl/SSLCipher/SSLSocketNoServerHelloClientShutdown.java fails on one of our macOS x86_64 test machines (macOS 14.4.1), as follows:
---Client sends unencrypted alerts---
java.lang.RuntimeException: assertEquals expected: class javax.net.ssl.SSLProtocolException but was: class java.net.SocketTimeoutException
    at jdk.test.lib.Asserts.fail(Asserts.java:691)
    at jdk.test.lib.Asserts.assertEquals(Asserts.java:204)
    at jdk.test.lib.Asserts.assertEquals(Asserts.java:191)
    at SSLSocketNoServerHelloClientShutdown.runTest(SSLSocketNoServerHelloClientShutdown.java:103)
    at SSLSocketNoServerHelloClientShutdown.main(SSLSocketNoServerHelloClientShutdown.java:63)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
    at java.base/java.lang.reflect.Method.invoke(Method.java:567)
    at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
    at java.base/java.lang.Thread.run(Thread.java:1576)
06-11-2024

Changeset: 8b474971 Branch: master Author: Artur Barashev <abarashev@openjdk.org> Date: 2024-11-04 18:46:38 +0000 URL: https://git.openjdk.org/jdk/commit/8b4749713c63a08e502845ed5d0a0236822018cd
04-11-2024

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk/pull/21043 Date: 2024-09-17 17:44:37 +0000
17-09-2024