JDK-8198266 : java/net/ipv6tests/UdpTest.java fails with timeout
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.net
  • Affected Version: 8,11,13,14,17,18,20,21,22,24
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2018-02-16
  • Updated: 2025-06-11
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
Other
tbdUnresolved
Related Reports
Relates :  
Relates :  
Relates :  
Relates :  
Description
Test:
java/net/ipv6tests/UdpTest.java

Platform: Mac OS X 10.10.5 (x86_64)

Log:
#section:main
----------messages:(8/219)----------
command: main UdpTest
reason: Assumed action based on file name: run main UdpTest 
Mode: agentvm
Agent id: 6
Timeout refired 1200 times
Timeout information:
--- Timeout information end.
elapsed time (seconds): 1339.018

.....

result: Error. Agent error: java.lang.Exception: Agent 6 timed out with a timeout of 1200 seconds; check console log for any additional details

Comments
the test is stuck in the datagramEcho test "AgentVMThread" #40 [43295] prio=5 os_prio=31 cpu=11.63ms elapsed=492.23s allocated=1031K defined_classes=81 tid=0x00000001411aae00 nid=43295 runnable [0x000000016fad5000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.DatagramChannelImpl.receive0(java.base@24-ea/Native Method) at sun.nio.ch.DatagramChannelImpl.receiveIntoNativeBuffer(java.base@24-ea/DatagramChannelImpl.java:831) at sun.nio.ch.DatagramChannelImpl.receive(java.base@24-ea/DatagramChannelImpl.java:803) at sun.nio.ch.DatagramChannelImpl.tryBlockingReceive(java.base@24-ea/DatagramChannelImpl.java:760) at sun.nio.ch.DatagramChannelImpl.blockingReceive(java.base@24-ea/DatagramChannelImpl.java:692) at sun.nio.ch.DatagramSocketAdaptor.receive(java.base@24-ea/DatagramSocketAdaptor.java:204) at java.net.DatagramSocket.receive(java.base@24-ea/DatagramSocket.java:728) at Tests.datagramEcho(Tests.java:134) at UdpTest.test1(UdpTest.java:128) at UdpTest.main(UdpTest.java:87) at java.lang.invoke.LambdaForm$DMH/0x000000f8000c1000.invokeStatic(java.base@24-ea/LambdaForm$DMH) at java.lang.invoke.LambdaForm$MH/0x000000f800144800.invoke(java.base@24-ea/LambdaForm$MH) at java.lang.invoke.Invokers$Holder.invokeExact_MT(java.base@24-ea/Invokers$Holder) at jdk.internal.reflect.DirectMethodHandleAccessor.invokeImpl(java.base@24-ea/DirectMethodHandleAccessor.java:154) at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(java.base@24-ea/DirectMethodHandleAccessor.java:103) at java.lang.reflect.Method.invoke(java.base@24-ea/Method.java:573) at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:333) at java.lang.Thread.runWith(java.base@24-ea/Thread.java:1588) at java.lang.Thread.run(java.base@24-ea/Thread.java:1575) Note the failure is on the second send/receive scenario of datagramEcho for ipv4 address while a previous failure was on the first send/receive scenario of datagramEcho for ipv4 address (with syslogd being a potential receiver) This failure datagramEcho scenario is S1 == ipv4 + ephemeral port S2 == Wildcard + ephemeral port s1 sends to s2 s2 receives from s1 s2 send to s1 s1 receives from s1 <---------- stuck here the ephemeral port used by s1 is unknown. Having that in the diagnostics output of the test would be beneficial netstat -m show some peculiar statistic for mbuf allocation, which may hint at datagrams being dropped ---------------------------------------- [2024-09-09 13:24:30] [/usr/sbin/netstat, -mm] timeout=20000 in /System/Volumes/Data/mesos/work_dir/slaves/6bbe0543-8c5a-457e-b0ca-dfa2833be967-S6279/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/918c81b3-a7de-4934-8cc5-756acb4878d2/runs/0b71d3a2-a3cf-49a2-a077-19e03d76f1f5/testoutput/test-support/jtreg_open_test_jdk_tier2_part3/java/net/ipv6tests/UdpTest ---------------------------------------- class buf active ctotal total cache cached uncached memory name size bufs bufs bufs state bufs bufs usage ---------- ----- -------- -------- -------- ----- -------- -------- --------- mbuf 512 709 5058 5152 on 84 4359 2.5 MB cl 2048 88 1840 1928 on 140 1700 3.6 MB bigcl 4096 88 936 1024 on 85 851 3.7 MB 16kcl 16384 82 95 177 on 85 10 1.5 MB mbuf_cl 2560 1 88 88 on 77 10 220.0 KB mbuf_bigcl 4608 0 88 88 on 88 0 396.0 KB mbuf_16kcl 16896 0 82 82 on 72 10 1.3 MB 452/709 mbufs in use: 280 mbufs allocated to data 172 mbufs allocated to packet headers 257 mbufs allocated to caches 1/1928 mbuf 2KB clusters in use 0/1024 mbuf 4KB clusters in use 0/177 mbuf 16KB clusters in use 13442 KB allocated to network (2.3% in use) 3403856 requests for memory denied <------- This is a potential issue for dropped datagrams ---------------------------------------- [2024-09-09 13:24:30] exit code: 0 time: 2 ms a fuller diagnostic on sender and receiver endpoints would be helpful
09-09-2024

Here's a log file snippet for the jdk-21+11-882-tier2 sighting: java/net/ipv6tests/UdpTest.java #section:main ----------messages:(10/289)---------- command: main UdpTest -d reason: User specified action: run main UdpTest -d started: Wed Feb 22 07:32:11 GMT 2023 Mode: agentvm Agent id: 18 Timeout refired 480 times Timeout information: --- Timeout information end. finished: Wed Feb 22 07:40:43 GMT 2023 elapsed time (seconds): 512.008 ----------configuration:(14/1945)---------- <snip> result: Error. Agent error: java.lang.Exception: Agent 18 timed out with a timeout of 480 seconds; check console log for any additional details
22-02-2023

Here's a log file snippet from the jdk-20+29-2281-tier2 sighting: java/net/ipv6tests/UdpTest.java #section:main ----------messages:(10/289)---------- command: main UdpTest -d reason: User specified action: run main UdpTest -d started: Wed Dec 21 20:50:27 GMT 2022 Mode: agentvm Agent id: 19 Timeout refired 480 times Timeout information: --- Timeout information end. finished: Wed Dec 21 20:58:59 GMT 2022 elapsed time (seconds): 512.509 ----------configuration:(14/1947)---------- <snip> result: Error. Agent error: java.lang.Exception: Agent 19 timed out with a timeout of 480 seconds; check console log for any additional details
21-12-2022

This test and many other UDP based test has an inherent flaw, in that it assumes that the send/receive scenarios always execute reliably, especially as there are same host or loopback scenarios. UDP is intrinsically and characteristically unreliable. So, there is always a possibility that these test will fail, most likely in heavy load and high network activity conditions -- these might trigger memory management issues, typically UDP requests are non blocking requests, where datagrams are unceremoniously dropped, resulting in no delivery and a forever blocking receive or thread scheduling issues where the large numbers of threads cause exaggerated wait times and timeouts. To make this test 100% reliable requires some send/receive failure detection and recovery logic. also executing in othervm will enhance the determinacy of the test
08-02-2022

test involves DatagramSockets that are bound to wildcard and ephemeral ports, sending of IPv4 addressed datagram, sending of ipv6 addressed datagram test appears to be moribund waiting on receive of ipv6 addressed datagram, which is not delivered (took a walk on the wild side -- there's a little joke there somewhere!!) "AgentVMThread" #50 prio=5 os_prio=31 cpu=18.66ms elapsed=488.57s allocated=304K defined_classes=36 tid=0x00007fc53e01da00 nid=39443 runnable [0x000070000a98e000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.DatagramChannelImpl.receive0(java.base@18-ea/Native Method) at sun.nio.ch.DatagramChannelImpl.receiveIntoNativeBuffer(java.base@18-ea/DatagramChannelImpl.java:750) at sun.nio.ch.DatagramChannelImpl.receive(java.base@18-ea/DatagramChannelImpl.java:728) at sun.nio.ch.DatagramChannelImpl.trustedBlockingReceive(java.base@18-ea/DatagramChannelImpl.java:666) at sun.nio.ch.DatagramChannelImpl.blockingReceive(java.base@18-ea/DatagramChannelImpl.java:635) at sun.nio.ch.DatagramSocketAdaptor.receive(java.base@18-ea/DatagramSocketAdaptor.java:240) at java.net.DatagramSocket.receive(java.base@18-ea/DatagramSocket.java:701) at Tests.simpleDataExchange(Tests.java:100) at UdpTest.test1(UdpTest.java:101) at UdpTest.main(UdpTest.java:87) at java.lang.invoke.LambdaForm$DMH/0x00000007c0001c00.invokeStatic(java.base@18-ea/LambdaForm$DMH) at java.lang.invoke.LambdaForm$MH/0x00000007c00cd800.invoke(java.base@18-ea/LambdaForm$MH) at java.lang.invoke.Invokers$Holder.invokeExact_MT(java.base@18-ea/Invokers$Holder) at jdk.internal.reflect.DirectMethodHandleAccessor.invokeImpl(java.base@18-ea/DirectMethodHandleAccessor.java:155) at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(java.base@18-ea/DirectMethodHandleAccessor.java:104) at java.lang.reflect.Method.invoke(java.base@18-ea/Method.java:577) at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:312) at java.lang.Thread.run(java.base@18-ea/Thread.java:833) test logs have no test diagnostic output. Test stuck in second datagram receive. AgentVM mode in play. Using othervm will allow some diagnostic output (I think) in this moribund scenario. ifconfig shows nothing unusual. netstat show no memory pressure and all looks reasonable class buf active ctotal total cache cached uncached memory name size bufs bufs bufs state bufs bufs usage ---------- ----- -------- -------- -------- ----- -------- -------- --------- mbuf 256 17608 21145 31232 on 12640 984 5.2 MB cl 2048 3959 707 4666 on 561 146 1.4 MB bigcl 4096 6128 4805 10933 on 0 4805 18.8 MB 16kcl 16384 7515 2250 9765 on 0 2250 35.2 MB mbuf_cl 2304 146 3959 3959 on 3813 0 8.7 MB mbuf_bigcl 4352 0 6128 6128 on 6128 0 25.4 MB mbuf_16kcl 16640 0 7515 7515 on 7515 0 119.3 MB 152/17608 mbufs in use: 151 mbufs allocated to data 1 mbufs allocated to socket names and addresses 17456 mbufs allocated to caches 146/4666 mbuf 2KB clusters in use 0/10933 mbuf 4KB clusters in use 0/9765 mbuf 16KB clusters in use 218990 KB allocated to network (1.0% in use) 0 KB returned to the system 0 requests for memory denied 0 requests for memory delayed 0 calls to drain routines The above shows no indication of memory pressure i.e. no drain calls, which have been closely associated with Datagram loss in tests on the other macosx-aarch64 platform. Kernel parameters have network memory configured for growth to 512MB kern.ipc.nmbclusters: 262144 kern.ipc.soqlimitcompat: 1 kern.ipc.io_policy.log: 0 kern.ipc.io_policy.uuid: 1 kern.ipc.mb_drain_force: 0 kern.ipc.mb_drain_maxint: 0 kern.ipc.mb_memory_pressure_percentage: 80 12 cores -- concurrency == 6 processes all seem reasonable by contemporary standards Processes: 211 total, 3 running, 208 sleeping, 893 threads 2021/11/16 19:30:55 Load Avg: 3.68, 4.96, 5.12 CPU usage: 6.52% user, 9.78% sys, 83.69% idle SharedLibs: 1543M resident, 218M data, 559M linkedit. MemRegions: 32569 total, 1762M resident, 63M private, 331M shared. PhysMem: 9995M used (2861M wired), 22G unused. system param for process threads kern.num_taskthreads: 8192 cursory glance at captured test environment info show no indication network memory issue. Conjecture -- there is an intermittent send issue from wildcard address to ipv6 or another datagram socket has inadvertently been allocated the same ephemeral dest port? netstat -av was not captured. running in othervm mode for other similarly failing tests helps with the diagnostics at least. add othervm to test.
16-11-2021

Here's a log file snippet for the jdk-17+29-2558-tier2 sighting: java/net/ipv6tests/UdpTest.java #section:main ----------messages:(8/212)---------- command: main UdpTest -d reason: User specified action: run main UdpTest -d Mode: agentvm Agent id: 22 Timeout refired 480 times Timeout information: --- Timeout information end. elapsed time (seconds): 490.797 ----------configuration:(16/2109)---------- <snip> result: Error. Agent error: java.lang.Exception: Agent 22 timed out with a timeout of 480 seconds; check console log for any additional details
28-06-2021

Snippet of the log file for the failure spotted in the JDK13 CI: #section:main ----------messages:(8/205)---------- command: main UdpTest reason: User specified action: run main UdpTest Mode: agentvm Agent id: 5 Timeout refired 480 times Timeout information: --- Timeout information end. elapsed time (seconds): 496.718 ----------configuration:(11/1393)----------
07-05-2019