JDK-8273158 : Tests failing with "SocketException: No buffer space available" [macos-aarch64]
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.net
  • Affected Version: 17,18,19,20,23
  • Priority: P4
  • Status: In Progress
  • Resolution: Unresolved
  • OS: os_x
  • CPU: aarch64
  • Submitted: 2021-08-31
  • Updated: 2025-12-12
Related Reports
Relates :  
Relates :  
Relates :  
Sub Tasks
JDK-8286273 :  
Description
A mitigation was put in place in JDK-8269772 but the problem still crops up.

Note that it is the compilation (javac) that failed, not the test itself.
Comments
> Tests for memory leaks are often problematic. No objection to deleting OpenLeak.java if it comes up.

I had a look at the history of these two tests. OpenLeak.java has been around for at least 18-odd years without any major problems on various platforms. CloseDuringConnect.java has been there for around 7 years now and has been stable too. I think instead of deleting them, perhaps we could add an "@requires" to exclude them (only) on macOS 26 and higher.
02-12-2025
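
For illustration, a minimal sketch of the kind of macOS-26-and-later exclusion suggested in the comment above, done as a runtime skip rather than an "@requires" expression. The class name, the version parsing, and the reliance on the OpenJDK test library's jtreg.SkippedException (which jtreg reports as a skipped test) are assumptions, not a committed fix:

    import jtreg.SkippedException;

    public class MacOS26Skip {
        // Illustrative helper: skip the test on macOS 26 and later, where the
        // intermittent ENOBUFS failures have been observed (JDK-8273158).
        static void skipIfMacOS26OrLater() {
            String name = System.getProperty("os.name", "");
            if (!(name.contains("Mac") || name.contains("OS X"))) {
                return;
            }
            String version = System.getProperty("os.version", "0");
            int major;
            try {
                major = Integer.parseInt(version.split("\\.")[0]);
            } catch (NumberFormatException e) {
                return; // unrecognised version format; don't skip
            }
            if (major >= 26) {
                throw new SkippedException("macOS " + version
                        + ": skipped due to intermittent ENOBUFS (JDK-8273158)");
            }
        }
    }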

Tests for memory leaks are often problematic. No objection to deleting OpenLeak.java if it comes up.
02-12-2025

kern.ipc.mb_memory_pressure_percentage: 80 is the key system kernel parameter; netstat shows that there is memory pressure on this system:

----------------------------------------
[2025-10-31 13:23:05] [/usr/sbin/netstat, -mm] timeout=20000
in /System/Volumes/Data/mesos/work_dir/slaves/526fbd26-20de-495c-9a19-a04adc16f7d1-S19343/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/2972b215-477b-4679-bcb9-2b2dd5514653/runs/86bd91b1-0a35-47c9-9983-692ce8c8f597/testoutput/test-support/jtreg_open_test_jdk_tier2_part2/java/nio/channels/SocketChannel/OpenLeak
----------------------------------------
class        buf   active   ctotal    total  cache   cached  uncached    memory
name        size     bufs     bufs     bufs  state     bufs      bufs     usage
----------  -----  -------  -------  -------  -----  -------  --------  --------
mbuf          512     3723    30480    31744     on      550     27471   14.9 MB
cl           2048      444      964     1408     on      168       796    1.9 MB
bigcl        4096     2033    14139    16172     on      148     13991   55.2 MB
16kcl       16384     1213      608     1821     on      104       504    9.5 MB
mbuf_cl      2560      303      444      444     on      141         0    1.1 MB
mbuf_bigcl   4608        0     2033     2033     on      451      1582    8.9 MB
mbuf_16kcl  16896      860     1213     1213     on      171       182   19.5 MB

1196/3723 mbufs in use:
    1193 mbufs allocated to data
    3 mbufs allocated to packet headers
    2527 mbufs allocated to caches
303/1408 mbuf 2KB clusters in use
0/16172 mbuf 4KB clusters in use
860/1821 mbuf 16KB clusters in use
113725 KB allocated to network (14.2% in use)
41825606 requests for memory denied

and vmstat shows lots of memory activity.

In the case of TCP socket creation, the buffer size is set very large and, under memory pressure conditions, this can result in ENOBUFS. There is lots of network activity; in particular, lots of TCP and unix domain sockets have been created and lots of TCP sockets are being released. One of the tests is creating and releasing TCP sockets in rapid succession, and it just hits a temporary resource crisis. It is likely this is transient and there is nothing to do but try again ... so maybe add a check for ENOBUFS from the socket call or the subsequent setsockopt, add a little back-off and try again, or add a SocketNoBufsException so that Net.socket can retry the socket creation. It is most likely just a transient condition, and without explicit ENOBUFS handling logic or some retry strategy, we'll get these types of failures.

There is a dmesg entry " memacct: TCP goes from 0 to 1 for its limitSK[3]: flow_entry_alloc " which might indicate that some memory limits have been reached. This would correlate with the netstat statistics and the memory pressure 80% limit. When network memory resources reach a limit, this triggers a request for additional resources: mbuf allocations and the like are expanded dynamically to meet demand. But that dynamic allocation may result in a temporary resource drought, while the system resources are expanded, and some requests can't be met.
03-11-2025
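
For illustration only, a minimal sketch of the back-off-and-retry idea described in the comment above, applied at the test level rather than inside Net.socket. The class and helper names and the retry parameters are assumptions, and detection relies on matching the exception message, because the JDK currently surfaces ENOBUFS as a plain SocketException:

    import java.io.IOException;
    import java.net.SocketException;
    import java.nio.channels.SocketChannel;

    public class OpenWithRetry {
        // Open a SocketChannel, retrying a few times with a short back-off if the
        // OS reports ENOBUFS (surfaced as SocketException "No buffer space available").
        static SocketChannel openWithRetry(int maxAttempts, long initialDelayMillis)
                throws IOException, InterruptedException {
            long delay = initialDelayMillis;
            for (int attempt = 1; ; attempt++) {
                try {
                    return SocketChannel.open();
                } catch (SocketException se) {
                    String msg = se.getMessage();
                    boolean noBufs = msg != null && msg.contains("No buffer space available");
                    if (!noBufs || attempt >= maxAttempts) {
                        throw se;
                    }
                    Thread.sleep(delay);   // give the kernel time to expand the mbuf pools
                    delay *= 2;            // simple exponential back-off
                }
            }
        }
    }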

I happened to run into this "java.net.SocketException: No buffer space available" issue in a personal CI job in two different tests, both against macOS 26 and both of them for the socket0() JNI function:

Caused by: java.net.SocketException: No buffer space available
    at java.base/sun.nio.ch.Net.socket0(Native Method)
    at java.base/sun.nio.ch.Net.socket(Net.java:491)
    at java.base/sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:155)
    at java.base/sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:138)
    at java.base/sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:77)
    at java.base/java.nio.channels.SocketChannel.open(SocketChannel.java:192)
    at CloseDuringConnect.test(CloseDuringConnect.java:98)

and

Caused by: java.net.SocketException: No buffer space available
    at java.base/sun.nio.ch.Net.socket0(Native Method)
    at java.base/sun.nio.ch.Net.socket(Net.java:491)
    at java.base/sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:155)
    at java.base/sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:138)
    at java.base/sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:77)
    at java.base/java.nio.channels.SocketChannel.open(SocketChannel.java:192)
    at java.base/java.nio.channels.SocketChannel.open(SocketChannel.java:274)
    at OpenLeak.lambda$test$0(OpenLeak.java:122)

The socket0() JNI function does a socket() syscall as well as a setsockopt() syscall. Both are specified to return an ENOBUFS error, but it's odd that this has started happening only on macOS 26 in this code path. Having said that, both the CloseDuringConnect and OpenLeak tests do "stress" the SocketChannel.open() call (by invoking it in a loop), so it may be that some threshold has changed on macOS 26 and this error now shows up regularly. I'll take a deeper look next week and also compare some sysctl values on these hosts with the 15.x hosts.
01-11-2025
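
To make the failure mode concrete, a hedged sketch (not taken from either test) of the kind of open/close stress loop that CloseDuringConnect and OpenLeak perform, which can drive the kernel into the transient ENOBUFS state discussed above; the class name and iteration count are arbitrary assumptions:

    import java.io.IOException;
    import java.net.SocketException;
    import java.nio.channels.SocketChannel;

    public class RapidOpenClose {
        public static void main(String[] args) throws IOException {
            // Rapidly create and release TCP sockets; under memory pressure the
            // underlying socket(2)/setsockopt(2) calls may fail with ENOBUFS,
            // surfaced as SocketException("No buffer space available").
            for (int i = 0; i < 100_000; i++) {
                try (SocketChannel sc = SocketChannel.open()) {
                    // no connect; just exercise socket creation and close
                } catch (SocketException se) {
                    System.err.println("iteration " + i + ": " + se);
                    throw se;
                }
            }
        }
    }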

The latest sighting on macOS 26 suggests that socket(2) failed with ENOBUFS; I don't know if we've seen this specific case before.
01-11-2025

This latest failure is within jtreg, so it's not the test per se that is failing. netstat shows the pathology: the drains, delays and mbuf memory allocations can be seen, along with the system kernel parameter kern.ipc.mb_memory_pressure_percentage: 80, which triggers an expansion of mbuf memory. With more and more non-blocking network I/O being used, the probability of ENOBUFS increases when network usage is heavy. There is naught that can be done except attempt to handle this condition in tests and in the jtreg framework, with retry logic. A derived SocketException, such as a NoBufferSocketException, would assist in this respect.

----------------------------------------
[2024-02-07 20:17:16] [/usr/sbin/netstat, -mm] timeout=20000
----------------------------------------
class        buf   active   ctotal    total  cache   cached  uncached    memory
name        size     bufs     bufs     bufs  state     bufs      bufs     usage
----------  -----  -------  -------  -------  -----  -------  --------  --------
mbuf          256     6749    35745    35968  purge        0     29221    8.7 MB
cl           2048       64     6025     6088  purge        0      6024   11.8 MB
bigcl        4096      160     4564     4724  purge        0      4564   17.8 MB
16kcl       16384     5462        0     5462     on        0         0         0
mbuf_cl      2304       63       63       63  purge        0         0  141.8 KB
mbuf_bigcl   4352      160      160      160  purge        0         0  680.0 KB
mbuf_16kcl  16640     3109     5462     5462     on     2353         0   86.7 MB

4400/6749 mbufs in use:
    4384 mbufs allocated to data
    16 mbufs allocated to packet headers
    2349 mbufs allocated to caches
64/6088 mbuf 2KB clusters in use
160/4724 mbuf 4KB clusters in use
3109/5462 mbuf 16KB clusters in use
128821 KB allocated to network (41.1% in use)
0 KB returned to the system
0 requests for memory denied
2008 requests for memory delayed
115 calls to drain routines
----------------------------------------
[2024-02-07 20:17:16] exit code: 0   time: 4 ms
08-02-2024
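
A minimal sketch of the kind of derived exception suggested above. The name NoBufferSocketException is purely illustrative; no such class exists in the JDK today. The idea is that code which currently has to match on the exception message could instead catch a dedicated type:

    import java.net.SocketException;

    // Hypothetical subclass: thrown when an underlying socket syscall fails with
    // ENOBUFS, so callers can distinguish a transient "no buffer space" condition
    // from other socket failures and apply a retry strategy.
    public class NoBufferSocketException extends SocketException {
        public NoBufferSocketException(String msg) {
            super(msg);
        }
    }

Test or framework code could then write catch (NoBufferSocketException e) { /* back off and retry */ } instead of inspecting getMessage() for "No buffer space available".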

Here's a log file snippet from the jdk-23+9-635-tier2 sighting:

java/nio/channels/Channels/ReadByte.java

#section:main
----------messages:(7/218)----------
command: main ReadByte
reason: Assumed action based on file name: run main ReadByte
started: Wed Feb 07 20:16:49 GMT 2024
Mode: agentvm
Agent id: 13
finished: Wed Feb 07 20:16:50 GMT 2024
elapsed time (seconds): 1.13
----------configuration:(12/1537)----------
<snip>
result: Error. Agent communication error: java.net.SocketException: No buffer space available; check console log for any additional details
07-02-2024

sysctl kern.ipc.mb_memory_pressure_percentage: 80

----------------------------------------
[2022-12-05 02:50:02] [/usr/sbin/netstat, -mm] timeout=20000
----------------------------------------
class        buf   active   ctotal    total  cache   cached  uncached    memory
name        size     bufs     bufs     bufs  state     bufs      bufs     usage
----------  -----  -------  -------  -------  -----  -------  --------  --------
mbuf          256     5987    16280    16704     on     6107      4610    4.0 MB
cl           2048      423     2026     2448     on        0      2025    4.0 MB
bigcl        4096        2     5834     5836     on        0      5834   22.8 MB
16kcl       16384     5462        0     5462     on        0         0         0
mbuf_cl      2304       59      422      422     on      363         0  949.5 KB
mbuf_bigcl   4352        0        2        2     on        2         0    8.5 KB
mbuf_16kcl  16640        0     5462     5462     on     5462         0   86.7 MB

160/5987 mbufs in use:
    144 mbufs allocated to data
    16 mbufs allocated to packet headers
    5827 mbufs allocated to caches
60/2448 mbuf 2KB clusters in use
0/5836 mbuf 4KB clusters in use
0/5462 mbuf 16KB clusters in use
121173 KB allocated to network (1.3% in use)
0 KB returned to the system
0 requests for memory denied
519 requests for memory delayed
181 calls to drain routines
----------------------------------------
[2022-12-05 02:50:02] exit code: 0   time: 3 ms

----------System.err:(34/2558)----------
DNSServer: Error: java.net.SocketException: No buffer space available
java.net.SocketException: No buffer space available
    at java.base/sun.nio.ch.DatagramChannelImpl.send0(Native Method)
    at java.base/sun.nio.ch.DatagramChannelImpl.sendFromNativeBuffer(DatagramChannelImpl.java:935)
    at java.base/sun.nio.ch.DatagramChannelImpl.send(DatagramChannelImpl.java:897)
    at java.base/sun.nio.ch.DatagramChannelImpl.send(DatagramChannelImpl.java:855)
    at java.base/sun.nio.ch.DatagramChannelImpl.blockingSend(DatagramChannelImpl.java:887)
    at java.base/sun.nio.ch.DatagramSocketAdaptor.send(DatagramSocketAdaptor.java:220)
    at java.base/java.net.DatagramSocket.send(DatagramSocket.java:662)
    at DNSServer.sendResponse(DNSServer.java:189)
    at DNSServer.run(DNSServer.java:137)
javax.naming.CommunicationException: DNS error [Root exception is java.net.SocketTimeoutException]; remaining name 'sdffdfsfgsfsf.com'
    at jdk.naming.dns/com.sun.jndi.dns.DnsClient.query(DnsClient.java:341)
    at jdk.naming.dns/com.sun.jndi.dns.Resolver.query(Resolver.java:81)
    at jdk.naming.dns/com.sun.jndi.dns.DnsContext.c_getAttributes(DnsContext.java:434)
    at java.naming/com.sun.jndi.toolkit.ctx.ComponentDirContext.p_getAttributes(ComponentDirContext.java:235)
    at java.naming/com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.getAttributes(PartialCompositeDirContext.java:141)
    at java.naming/com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.getAttributes(PartialCompositeDirContext.java:129)
    at java.naming/javax.naming.directory.InitialDirContext.getAttributes(InitialDirContext.java:171)
    at ExhaustXIDs.runTest(ExhaustXIDs.java:58)
    at ExhaustXIDs.main(ExhaustXIDs.java:30)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
    at java.base/java.lang.reflect.Method.invoke(Method.java:578)
    at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:125)
    at java.base/java.lang.Thread.run(Thread.java:1599)
Caused by: java.net.SocketTimeoutException
    at jdk.naming.dns/com.sun.jndi.dns.DnsClient.doUdpQuery(DnsClient.java:472)
    at jdk.naming.dns/com.sun.jndi.dns.DnsClient.query(DnsClient.java:242)
    ... 12 more
05-12-2022
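
The sysctl and netstat output above is captured by the test harness after a failure. Purely for illustration, a hedged sketch of how that kind of capture could be done from Java; the helper, the command list, and the timeout value mirror the "[/usr/sbin/netstat, -mm] timeout=20000" log entries but are otherwise assumptions:

    import java.io.IOException;
    import java.util.List;
    import java.util.concurrent.TimeUnit;

    public class NetDiagnostics {
        // Run a diagnostic command and print its combined output.
        static void run(List<String> command, long timeoutMillis)
                throws IOException, InterruptedException {
            Process p = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .start();
            // Output here is small, so waiting before reading is acceptable for a sketch.
            if (!p.waitFor(timeoutMillis, TimeUnit.MILLISECONDS)) {
                p.destroyForcibly();
            }
            System.out.println(new String(p.getInputStream().readAllBytes()));
        }

        public static void main(String[] args) throws Exception {
            run(List.of("/usr/sbin/netstat", "-mm"), 20_000);
            run(List.of("/usr/sbin/sysctl", "kern.ipc.mb_memory_pressure_percentage"), 20_000);
        }
    }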

Unless the Socket/DatagramSocket and NIO network Channel frameworks are modified to detect and handle the ENOBUFS (no buffer space available) scenarios, the failure above relating to java/nio/channels/Channels/ShortWrite.java is an issue for jtreg to handle, by executing some form of remedial/recovery strategy when a compile action has failed due to

    Error. Agent communication error: java.net.SocketException: No buffer space available

This might be a wait-a-few-seconds-and-retry strategy for a jtreg Action. Nonetheless, the handling of ENOBUFS within JDK networking warrants a refreshed debate and discussion as to how best to provide relevant and appropriate Exceptions for handling these failure scenarios. A possible user story is: "As a developer, I would like the JDK networking APIs to provide an appropriate Exception abstraction that conveys an operating system call error such as ENOBUFS, so that my application code can invoke a recovery handling strategy."
15-04-2022

The latest failure in this failure series is com/sun/jndi/dns/ExhaustXIDs.java, where the exception is thrown in the DNSServer, which is part of the test library. The "no buffer space" condition is typically transient: the system cannot provide an mbuf for a "non blocking" memory request at that instant. In this test scenario the DNS interactions are over UDP. As this is a UDP send, memory requests tend to be M_DONTWAIT (as in XNU's udp_output):

    /*
     * Calculate data length and get a mbuf
     * for UDP and IP headers.
     */
    M_PREPEND(m, sizeof(struct udpiphdr), M_DONTWAIT, 1);
    if (m == 0) {
        error = ENOBUFS;
        goto abort;
    }

Thus, if there is memory pressure at a particular instant, there exists the possibility that an ENOBUFS error will be returned; as such, the socket (DatagramChannelImpl) send throws a SocketException within the DNSServer, and the test fails.

In this particular scenario it is possible to address the "no buffer space" condition with some send retry logic. Assuming the condition is transient, apply a short wait and retry the send. The retry can be applied at two possible points in the call flow.

First, in the DNSServer sendResponse method:

    private void sendResponse(DatagramPacket reqPacket, int playbackIndex) throws IOException {
        byte[] payload = generateResponsePayload(reqPacket, playbackIndex);
        try {
            socket.send(new DatagramPacket(payload, payload.length, reqPacket.getSocketAddress()));
        } catch (SocketException soEx) {
            String exMessage = soEx.getMessage();
            if ((exMessage != null) && exMessage.contains("No buffer space available")) {
                pauseAWhile();
                socket.send(new DatagramPacket(payload, payload.length, reqPacket.getSocketAddress()));
            } else {
                throw soEx;
            }
        }
        System.out.println("DNSServer: send response message to " + reqPacket.getSocketAddress());
    }

i.e. by applying a try/catch to the socket.send: if a SocketException is caught and it is the result of the no buffer space error, then retry the socket.send.

Alternatively, in DatagramChannelImpl::sendFromNativeBuffer, extend the existing try/catch on the native send0 method:

    private int sendFromNativeBuffer(FileDescriptor fd, ByteBuffer bb,
                                     InetSocketAddress target) throws IOException {
        int pos = bb.position();
        int lim = bb.limit();
        assert (pos <= lim);
        int rem = (pos <= lim ? lim - pos : 0);

        int written;
        try {
            int addressLen = targetSocketAddress(target);
            written = send0(fd, ((DirectBuffer)bb).address() + pos, rem,
                            targetSockAddr.address(), addressLen);
        } catch (PortUnreachableException pue) {
            if (isConnected())
                throw pue;
            written = rem;
        }
        if (written > 0)
            bb.position(pos + written);
        return written;
    }

i.e. catch a SocketException, check that the exception message contains "No buffer space available", and retry the send:

    } catch (SocketException soEx) {
        String exMessage = soEx.getMessage();
        if ((exMessage != null) && exMessage.contains("No buffer space available")) {
            waitForAWhile();
            written = retrySendFromNativeBuffer(fd, bb, target);
        } else {
            throw soEx;
        }
    }

The netstat capture below shows what has been observed to be indicative of this transient no buffer space condition. There is also a system kernel parameter, mb_memory_pressure_percentage, which has been observed (while trying to replicate this no buffer space condition) to be influential, in that when the mbuf usage reaches 80% the probability of an ENOBUFS error on a network system call increases. Also note that kern.ipc.nmbclusters: 131072 corresponds to 256 MB (131072 clusters x 2 KB per cluster), while the current usage is approximately 106 MB, and a significant proportion of main memory remains unused (PhysMem: 6943M used (1275M wired), 8919M unused), so mbuf expansion is possible.
The netstat -mm capture indicates network memory pressure exists in this system:

----------------------------------------
[2022-04-13 16:30:21] [/usr/sbin/netstat, -mm] timeout=20000
----------------------------------------
class        buf   active   ctotal    total  cache   cached  uncached    memory
name        size     bufs     bufs     bufs  state     bufs      bufs     usage
----------  -----  -------  -------  -------  -----  -------  --------  --------
mbuf          256     6010    25080    25536     on    15342      4184    6.1 MB
cl           2048      456     2641     3096     on        6      2634    5.2 MB
bigcl        4096        1     1975     1976     on        0      1975    7.7 MB
16kcl       16384     5462        0     5462     on        0         0         0
mbuf_cl      2304       55      455      455     on      400         0 1023.8 KB
mbuf_bigcl   4352        0        1        1     on        1         0    4.2 KB
mbuf_16kcl  16640        0     5462     5462     on     5462         0   86.7 MB

147/6010 mbufs in use:
    131 mbufs allocated to data
    16 mbufs allocated to packet headers
    5863 mbufs allocated to caches
56/3096 mbuf 2KB clusters in use
0/1976 mbuf 4KB clusters in use
0/5462 mbuf 16KB clusters in use
109237 KB allocated to network (1.4% in use)
0 KB returned to the system
0 requests for memory denied
1577 requests for memory delayed
678 calls to drain routines

The kernel config parameter kern.ipc.mb_memory_pressure_percentage: 80 indicates that when the current network memory is at 80% usage, either some shuffling of mbufs is activated or a memory expansion takes place. kern.ipc.nmbclusters: 131072 indicates a configuration of 256 MB allotted to network memory (131072 clusters x 2 KB per cluster) -- there is currently less than that allocated.

This "No buffer space available" type of failure should prompt a discussion on whether there are sufficient semantics in SocketException to encapsulate a low-level error condition in the OS networking stack, such as ENOBUFS, which can be set due to a failed network system call. Should a more specialised derivation of SocketException be created to express the no buffer space condition (ENOBUFS), and thus provide an application with the capability to catch the condition and apply some appropriate remedial or recovery action?
15-04-2022

Spotted in the jdk-19+18-1211-tier2 CI job set:

com/sun/jndi/dns/ExhaustXIDs.java
https://mach5.us.oracle.com/mdash/jobs/mach5-one-jdk-19+18-1211-tier2-20220413-1624-31181934/tasks/mach5-one-jdk-19+18-1211-tier2-20220413-1624-31181934-closed_test_jdk_tier2-macosx-aarch64-176/results?search=status%3Afailed%20AND%20-state%3Ainvalid
https://mach5.us.oracle.com:10060/api/v1/results/mach5-one-jdk-19+18-1211-tier2-20220413-1624-31181934-closed_test_jdk_tier2-macosx-aarch64-176-1649868143-36/log

macosx-aarch64: jpg-mac-arm-707.oraclecorp.com

Here's a log file snippet:

DNSServer: send response message to /127.0.0.1:61920
DNSServer: received query message from /127.0.0.1:64605
Got exception: javax.naming.CommunicationException: DNS error [Root exception is java.net.SocketTimeoutException: Receive timed out]; remaining name 'sdffdfsfgsfsf.com' retrying once
----------System.err:(38/2980)----------
DNSServer: Error: java.net.SocketException: No buffer space available
java.net.SocketException: No buffer space available
    at java.base/sun.nio.ch.DatagramChannelImpl.send0(Native Method)
    at java.base/sun.nio.ch.DatagramChannelImpl.sendFromNativeBuffer(DatagramChannelImpl.java:901)
    at java.base/sun.nio.ch.DatagramChannelImpl.send(DatagramChannelImpl.java:863)
    at java.base/sun.nio.ch.DatagramChannelImpl.send(DatagramChannelImpl.java:821)
    at java.base/sun.nio.ch.DatagramChannelImpl.blockingSend(DatagramChannelImpl.java:853)
    at java.base/sun.nio.ch.DatagramSocketAdaptor.send(DatagramSocketAdaptor.java:218)
    at java.base/java.net.DatagramSocket.send(DatagramSocket.java:665)
    at DNSServer.sendResponse(DNSServer.java:189)
    at DNSServer.run(DNSServer.java:137)
javax.naming.CommunicationException: DNS error [Root exception is java.net.SocketTimeoutException: Receive timed out]; remaining name 'sdffdfsfgsfsf.com'
    at jdk.naming.dns/com.sun.jndi.dns.DnsClient.query(DnsClient.java:321)
    at jdk.naming.dns/com.sun.jndi.dns.Resolver.query(Resolver.java:81)
    at jdk.naming.dns/com.sun.jndi.dns.DnsContext.c_getAttributes(DnsContext.java:434)
    at java.naming/com.sun.jndi.toolkit.ctx.ComponentDirContext.p_getAttributes(ComponentDirContext.java:235)
    at java.naming/com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.getAttributes(PartialCompositeDirContext.java:141)
    at java.naming/com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.getAttributes(PartialCompositeDirContext.java:129)
    at java.naming/javax.naming.directory.InitialDirContext.getAttributes(InitialDirContext.java:171)
    at ExhaustXIDs.runTest(ExhaustXIDs.java:58)
    at ExhaustXIDs.main(ExhaustXIDs.java:30)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
    at java.base/java.lang.reflect.Method.invoke(Method.java:578)
    at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
    at java.base/java.lang.Thread.run(Thread.java:828)
Caused by: java.net.SocketTimeoutException: Receive timed out
    at java.base/sun.nio.ch.DatagramChannelImpl.trustedBlockingReceive(DatagramChannelImpl.java:703)
    at java.base/sun.nio.ch.DatagramChannelImpl.blockingReceive(DatagramChannelImpl.java:633)
    at java.base/sun.nio.ch.DatagramSocketAdaptor.receive(DatagramSocketAdaptor.java:238)
    at java.base/java.net.DatagramSocket.receive(DatagramSocket.java:701)
    at jdk.naming.dns/com.sun.jndi.dns.DnsClient.doUdpQuery(DnsClient.java:430)
    at jdk.naming.dns/com.sun.jndi.dns.DnsClient.query(DnsClient.java:216)
    ... 12 more
JavaTest Message: Test threw exception: javax.naming.CommunicationException: DNS error [Root exception is java.net.SocketTimeoutException: Receive timed out]; remaining name 'sdffdfsfgsfsf.com'
JavaTest Message: shutting down test
13-04-2022

Here's a log file snippet from the jdk-19+13-757-tier2 sighting:

java/nio/channels/Channels/ShortWrite.java

#section:build
----------messages:(5/133)----------
command: build ShortWrite
reason: Named class compiled on demand
Test directory:
compile: ShortWrite
elapsed time (seconds): 0.014
result: Error. Agent communication error: java.net.SocketException: No buffer space available; check console log for any additional details

So this test failure also happened in the build phase.
04-03-2022

ok will organise that. :+1
01-09-2021

[~msheppar] maybe you should just push a simple changeset that changes this test to use /othervm (you could use a subtask of this bug for that). This way, next time the test fails in the CI we might see something in the log for System.err/System.out. Most of the time these seem to be empty - and I am blaming the agent VM for that too! Incremental improvements to diagnosability would be welcome :-)
01-09-2021
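
A minimal sketch of what the /othervm change suggested above would look like in a jtreg test header, so that the test's System.out/System.err survive an agent VM failure. The surrounding tag block and the test class name are placeholders, not the actual test's header:

    /*
     * @test
     * @bug 8273158
     * @summary Run in a dedicated VM so test output is preserved if the agent VM fails
     * @run main/othervm TheAffectedTest
     */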

[~alanb] that is a good point and it could be possible. The ifconfig output is regularly checked for these types of failures, but we have not seen any evidence of such reconfigurations, unlike some of the Linux machines, which seem to retain deprecated autoconf IPv6 global address config - the autoconf is typically for global IPv6 address allocations. For the sibling bug JDK-8264385, which is the scenario where the receive is moribund, I have used othervm and this allows the address bindings used in the test to be output. The IPv6 address is the same as that shown in the ifconfig output. It is always worth checking to see if there are any anomalies in the network interface configurations.
01-09-2021

[~msheppar] Is there any evidence that the system is being reconfigured while these tests run? I wonder if there is a DHCPv6 client or something else that is changing the configuration.
01-09-2021