JDK-8229018 : Switching to an infinite socket timeout on Windows leads to high CPU load
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.net
  • Affected Version: 13
  • Priority: P1
  • Status: Closed
  • Resolution: Fixed
  • OS: windows_10
  • CPU: x86_64
  • Submitted: 2019-08-01
  • Updated: 2020-01-16
  • Resolved: 2019-08-05
Version table:
  • JDK 13: 13 b33 (Fixed)
  • JDK 14: 14 (Fixed)
Description
ADDITIONAL SYSTEM INFORMATION :
Windows 10
openjdk version "13-ea" 2019-09-17
OpenJDK Runtime Environment (build 13-ea+31)
OpenJDK 64-Bit Server VM (build 13-ea+31, mixed mode, sharing)

A DESCRIPTION OF THE PROBLEM :
Passing a non-zero value to java.net.Socket.setSoTimeout and then setting the timeout back to zero causes high CPU load on the thread that is attempting to read from the socket. According to the API docs, a timeout of zero means infinite. From the application's perspective, the read blocks exactly as it should, but the socket implementation appears to be spinning in a polling loop instead of waiting. When data becomes available, the read returns as expected.

REGRESSION : Last worked in version 12.0.2

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Connect a socket, and set the timeout to a legal non-zero value. The server side writes one byte. The client then reads the byte from the socket, and then sets the timeout to zero, which means infinite. A subsequent read by the client blocks indefinitely (as expected), but the CPU load isn't zero.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Expect to see no CPU activity on the thread while waiting to read. This is the observed behavior when running the test using JDK 12.
ACTUAL -
The CPU load on the thread (and Java process) isn't zero when blocked.

---------- BEGIN SOURCE ----------
import java.net.*;

public class Test {
    public static void main(String[] args) throws Exception {
        ServerSocket ss = new ServerSocket(0);

        new Thread(() -> {
            try {
                Socket s = ss.accept();
                s.getOutputStream().write(1);
            } catch (Exception e) {
                e.printStackTrace(); // report accept/write failures instead of silently dropping them
            }
        }).start();

        Socket s = new Socket("localhost", ss.getLocalPort());
        System.out.println(s);

        s.setSoTimeout(9999999);
        s.getInputStream().read();
        s.setSoTimeout(0); // should reset to infinite timeout
        while (true) {
            s.getInputStream().read();
        }
    }
}
---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Set the timeout to a very large non zero value, which is effectively infinite.
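
The workaround above can be sketched as a small helper that substitutes a very large timeout whenever the caller asks for zero. This is only an illustration: the class name, the method names, and the choice of Integer.MAX_VALUE (roughly 24.8 days in milliseconds) are assumptions, not part of the report, which only says "a very large non zero value".

```java
import java.net.Socket;
import java.net.SocketException;

final class InfiniteTimeoutWorkaround {
    // Assumed stand-in for "infinite": the largest value setSoTimeout accepts.
    static final int EFFECTIVELY_INFINITE_MS = Integer.MAX_VALUE;

    // Maps the documented "0 means infinite" value to a large finite timeout.
    static int effectiveTimeout(int requestedMillis) {
        if (requestedMillis < 0) {
            throw new IllegalArgumentException("timeout can't be negative");
        }
        return requestedMillis == 0 ? EFFECTIVELY_INFINITE_MS : requestedMillis;
    }

    // Applies the substituted timeout to the socket.
    static void setSoTimeout(Socket socket, int requestedMillis) throws SocketException {
        socket.setSoTimeout(effectiveTimeout(requestedMillis));
    }
}
```

A read can still throw SocketTimeoutException once the large timeout elapses, so code that relies on this helper would need to catch that exception and retry the read.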

FREQUENCY : always



Comments
Okay, thanks. This unbreaks my brain a bit.
16-01-2020

The Socket implementation was replaced in JDK 13, so the scenario in this bug is not applicable to JDK 12 and older.
16-01-2020

The affected code in Java_sun_nio_ch_Net_poll seems to have been introduced by JDK-7184932, so does this actually affect 8u and 11u too? It is unclear why 11.0.3 and 12.0.2 are working. Adding affected versions provisionally.
16-01-2020

Response from submitter: Yes, it's fixed now. Thanks for the quick turnaround!
12-08-2019

Sent email to submitter requesting to test the latest JDK 13-ea + 33 where the issue is fixed.
12-08-2019

URL: https://hg.openjdk.java.net/jdk/jdk13/rev/b2fde6701654
User: michaelm
Date: 2019-08-05 09:41:07 +0000
05-08-2019

Alan's fix for this:

diff --git a/src/java.base/windows/native/libnio/ch/Net.c b/src/java.base/windows/native/libnio/ch/Net.c
--- a/src/java.base/windows/native/libnio/ch/Net.c
+++ b/src/java.base/windows/native/libnio/ch/Net.c
@@ -623,9 +623,6 @@
     fd_set rd, wr, ex;
     jint fd = fdval(env, fdo);
 
-    t.tv_sec = (long)(timeout / 1000);
-    t.tv_usec = (timeout % 1000) * 1000;
-
     FD_ZERO(&rd);
     FD_ZERO(&wr);
     FD_ZERO(&ex);
@@ -638,7 +635,12 @@
     }
     FD_SET(fd, &ex);
 
-    rv = select(fd+1, &rd, &wr, &ex, &t);
+    if (timeout >= 0) {
+        t.tv_sec = (long)(timeout / 1000);
+        t.tv_usec = (timeout % 1000) * 1000;
+    }
+
+    rv = select(fd+1, &rd, &wr, &ex, (timeout >= 0) ? &t : NULL);
 
     /* save last winsock error */
     if (rv == SOCKET_ERROR) {
02-08-2019

To reproduce the issue, run the attached test case and observe CPU usage for the process in Task Manager:
  • JDK 11.0.3: pass
  • JDK 13-ea+30: fail
  • JDK 14-ea+7: fail
Screenshots of the CPU usage on JDK 11.0.3 vs JDK 14-ea are attached.
02-08-2019
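
As an alternative to watching Task Manager, the CPU time of the blocked reader thread could be sampled from inside the JVM. The sketch below is an assumption about how one might do that with the standard ThreadMXBean API; the class and method names are hypothetical and nothing like it was attached to this report.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

final class ThreadCpuProbe {
    // Returns the CPU nanoseconds the given thread consumed during the sample
    // window, or -1 if the JVM cannot measure it or the thread is not alive.
    static long cpuNanosDuring(Thread thread, long windowMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return -1;
        }
        long before = mx.getThreadCpuTime(thread.getId());
        try {
            Thread.sleep(windowMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return -1;
        }
        long after = mx.getThreadCpuTime(thread.getId());
        if (before < 0 || after < 0) {
            return -1; // getThreadCpuTime reports -1 for a thread that is not alive
        }
        return after - before;
    }
}
```

For a thread that is genuinely blocked in read(), the sampled value should stay close to zero. A value that grows roughly in step with the length of the window would be consistent with the busy polling described in this report.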

Thanks for the bug report. There is indeed a bug here. It's specific to Windows and the new SocketImpl and arises when the timeout is reset to 0 after a previous read (or connect) has used a non-0 timeout. The bug arises when the timeout of 0 is translated to a value of -1 for "block indefinitely" at the native level. The Windows implementation is based on select rather than poll so it needs to further translate to a timeout of NULL.
02-08-2019
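
The timeout translation described in this comment can be modelled in Java to show why -1 is a problem for a select-based implementation. This is an illustrative model of the arithmetic in the fix, not JDK code: the class, the Timeval holder, and the method names are hypothetical, and null stands in for passing NULL to select.

```java
final class SelectTimeoutModel {
    // Stand-in for the C struct timeval that select takes.
    static final class Timeval {
        final long sec;
        final long usec;

        Timeval(long sec, long usec) {
            this.sec = sec;
            this.usec = usec;
        }
    }

    // A Java-level SO_TIMEOUT of 0 ("infinite") becomes -1 at the native level.
    static long toNativeMillis(int soTimeoutMillis) {
        return soTimeoutMillis == 0 ? -1L : soTimeoutMillis;
    }

    // Pre-fix behaviour: the timeval is always computed, even for -1.
    static Timeval beforeFix(long timeoutMillis) {
        return new Timeval(timeoutMillis / 1000, (timeoutMillis % 1000) * 1000);
    }

    // Post-fix behaviour: a negative timeout yields null (NULL in C),
    // which asks select to block indefinitely.
    static Timeval afterFix(long timeoutMillis) {
        return timeoutMillis >= 0 ? beforeFix(timeoutMillis) : null;
    }
}
```

With a native timeout of -1, the pre-fix arithmetic produces tv_sec = 0 and tv_usec = -1000, which is not a meaningful wait interval. The report does not say how Winsock treats such a value, but the observed symptom suggests that select returned without waiting, so the read kept polling.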