We noticed a major degradation in throughput in our app after upgrading from JDK 11 to JDK 17/19/21-ea.
The application shares a single DatagramSocket instance among multiple threads. The root cause of the performance drop is JDK-8235674, which re-implemented DatagramSocket on top of DatagramChannel. The new implementation acquires a ReentrantLock on every send/receive call [1], which limits I/O scalability. The old implementation, AbstractPlainDatagramSocketImpl, did not take a per-socket lock on the send/receive path.
I attached a simple test case that demonstrates the issue: DatagramTest.java
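The attachment itself is not reproduced here. For readers without access to it, a benchmark of this general shape might look like the sketch below. This is an illustration, not the attached DatagramTest.java: the class name SharedSocketSendSketch, the countSends helper, the 64-byte payload, the loopback target, and the choice to exercise only the send path are all assumptions. The figures reported in this issue come from the attached test, not from this sketch.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: several threads send through ONE shared DatagramSocket.
class SharedSocketSendSketch {

    // Sends from `threads` threads for roughly `millis` milliseconds and
    // returns the total number of successful send() calls.
    static long countSends(int threads, long millis) throws Exception {
        LongAdder sent = new LongAdder();
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // `sink` only provides a bound local port to send to; nothing reads from it,
        // so the OS silently drops packets once its receive buffer is full.
        try (DatagramSocket sink = new DatagramSocket(new InetSocketAddress(loopback, 0));
             DatagramSocket shared = new DatagramSocket()) {
            InetSocketAddress target = new InetSocketAddress(loopback, sink.getLocalPort());
            long deadline = System.nanoTime() + millis * 1_000_000L;
            Thread[] workers = new Thread[threads];
            for (int i = 0; i < threads; i++) {
                workers[i] = new Thread(() -> {
                    byte[] payload = new byte[64]; // assumed payload size
                    while (System.nanoTime() < deadline) {
                        try {
                            // Each thread uses its own packet object but the same socket.
                            DatagramPacket packet =
                                    new DatagramPacket(payload, payload.length, target);
                            shared.send(packet);
                            sent.increment();
                        } catch (IOException e) {
                            // Failed sends (e.g. transient buffer exhaustion) are not counted.
                        }
                    }
                });
                workers[i].start();
            }
            for (Thread worker : workers) {
                worker.join();
            }
        }
        return sent.sum();
    }

    public static void main(String[] args) throws Exception {
        int threads = args.length > 0 ? Integer.parseInt(args[0]) : 4;
        // One-second rounds, so each printed count is a packets-per-second rate.
        for (int round = 0; round < 5; round++) {
            System.out.println(countSends(threads, 1000) + " pps");
        }
    }
}
```

Run as `java SharedSocketSendSketch.java 4` to use four sender threads. Because every thread calls send() on the same socket object, any per-socket lock inside the implementation is contended on each call, which is the situation described above.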
Here are the sample performance results using 4 threads, measured in packets per second (the higher the better).
$ java DatagramTest 4
278309 pps
336652 pps
342375 pps
335843 pps
327992 pps
$ java -Djdk.net.usePlainDatagramSocketImpl DatagramTest 4
1616246 pps
1692003 pps
1696878 pps
1693931 pps
1710490 pps
The old implementation (AbstractPlainDatagramSocketImpl) shows about 5x higher throughput than the new implementation (DatagramChannel): roughly 1.7M pps versus 0.33M pps.
The corresponding CPU profiles:
Old (faster): https://cr.openjdk.org/~apangin/8303616/old.html
New (slower): https://cr.openjdk.org/~apangin/8303616/new.html
(note the large volume of ReentrantLock.lock/unlock frames in the new profile)
The workaround for JDK 17 is the -Djdk.net.usePlainDatagramSocketImpl option, which enables the old implementation. Unfortunately, this option no longer works since JDK-8253119.
send() and recv() calls are thread-safe on POSIX and Windows, so they should not be guarded by a Java-level lock.
[1] https://github.com/openjdk/jdk/blob/5b2e2e4695768a6bd8090fb9a6c342fcddcbb3fd/src/java.base/share/classes/sun/nio/ch/DatagramChannelImpl.java#L827
https://github.com/openjdk/jdk/blob/5b2e2e4695768a6bd8090fb9a6c342fcddcbb3fd/src/java.base/share/classes/sun/nio/ch/DatagramChannelImpl.java#L891