JDK-8308048 : RunThese8H.java failed "assert(false) failed: Exceeded max buffer size for this string."
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 11,21
  • Priority: P3
  • Status: Open
  • Resolution: Unresolved
  • OS: linux
  • CPU: x86_64
  • Submitted: 2023-05-14
  • Updated: 2024-07-04
Related Reports
Relates :  
Description
The following test failed in the JDK21 CI:

applications/runthese/RunThese8H.java

Here's snippets from the hs_err_pid file:

#  Internal Error (/opt/mach5/mesos/work_dir/slaves/741e9afd-8c02-45c3-b2e2-9db1450d0832-S100001/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/4bffef37-b607-4e11-a688-94ad227a52f8/runs/1328c16f-d0c3-4a7c-9b54-4d8f083bcdb6/workspace/open/src/hotspot/share/utilities/ostream.cpp:1046), pid=3024, tid=3061
#  assert(false) failed: Exceeded max buffer size for this string.
#
# JRE version: Java(TM) SE Runtime Environment (21.0+23) (fastdebug build 21-ea+23-LTS-1920)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 21-ea+23-LTS-1920, mixed mode, sharing, tiered, compressed class ptrs, z gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x14a1359]  bufferedStream::write(char const*, unsigned long)+0x1a9

<snip>

---------------  T H R E A D  ---------------

Current thread (0x00007f111c23d960):  VMThread "VM Thread"          [id=3061, stack(0x00007f11043d9000,0x00007f11044d9000) (1024K)]

Stack: [0x00007f11043d9000,0x00007f11044d9000],  sp=0x00007f11044d6d50,  free space=1015k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x14a1359]  bufferedStream::write(char const*, unsigned long)+0x1a9  (ostream.cpp:1046)
V  [libjvm.so+0x14a17c9]  outputStream::do_vsnprintf_and_write_with_automatic_buffer(char const*, __va_list_tag*, bool)+0x109  (ostream.cpp:130)
V  [libjvm.so+0x14a1ffc]  outputStream::print(char const*, ...)+0x16c  (ostream.cpp:143)
V  [libjvm.so+0xdbb948]  print_classname(outputStream*, Klass*)+0x78  (heapInspection.cpp:442)
V  [libjvm.so+0xdbd444]  KlassHierarchy::print_class(outputStream*, KlassInfoEntry*, bool)+0x194  (heapInspection.cpp:472)
V  [libjvm.so+0xdbf005]  KlassHierarchy::print_class_hierarchy(outputStream*, bool, bool, char*)+0x5b5  (heapInspection.cpp:386)
V  [libjvm.so+0x1872de7]  VM_Operation::evaluate()+0x137  (vmOperations.cpp:71)
V  [libjvm.so+0x1895aeb]  VMThread::evaluate_operation(VM_Operation*)+0x10b  (vmThread.cpp:281)
V  [libjvm.so+0x18965ff]  VMThread::inner_execute(VM_Operation*)+0x35f  (vmThread.cpp:428)
V  [libjvm.so+0x18967d4]  VMThread::loop()+0x84  (vmThread.cpp:495)
V  [libjvm.so+0x18968d7]  VMThread::run()+0x87  (vmThread.cpp:175)
V  [libjvm.so+0x17946d8]  Thread::call_run()+0xb8  (thread.cpp:217)
V  [libjvm.so+0x149138a]  thread_native_entry(Thread*)+0x11a  (os_linux.cpp:740)

Since we're trying to do a KlassHierarchy::print_class_hierarchy from
the VMThread, I'm starting this off in hotspot/runtime for initial triage.
Comments
The current plan is to get JDK-8326820 to lower the number of dead classes that are kept around. That should make this issue much less likely to happen with ZGC.
27-06-2024

This is still being seen in the CI occasionally, mainly in Loom CI but also mainline. Unfortunately we don't get any log files that actually show what is being dumped for the class hierarchy.
11-06-2024

After JDK-8309832, JDK-8310014, JDK-8310622 and JDK-8311161 there seem to be no more lingering threads in the RunThese runs. Looking at the logs from multiple RunThese24H runs it seems like the GC starvation issue is resolved. In no run have I seen a major collection longer than two minutes. While this does not guarantee that we will not hit this assert, it seems very unlikely. So closing this as `Not an Issue` because the root cause of this seems to have been resolved.
05-07-2023

We think that this is something that is caused by the tests massive amount of class loading / unloading in conjunction with Generational ZGC's low frequency of old generation collections. One possible fix could be to run the test with a flag to tell Generational ZGC to run more frequent class unloading old generation collections. However, we can't be 100% that there isn't another bug in the product so I'd like to bump the priority and make sure that we properly investigate this for JDK 21. JDK-8307958 deals with the metaspace verification problem
21-06-2023

We've found that some of the sub-tests started threads that spin looped and ate away CPU resources. More and more of these threads were created the longer the test was running. This starves the GC threads, which causes the GCs to take longer, which causes more and more classes to be loaded, which causes the jcmd buffer to hit the limit described in this bug. We have excluded the problematic tests and have so far not seen this issue. We'll let the JDK 22 run over the weekend and then close this bug, unless this resurfaces again.
21-06-2023

[~dcubed] Yes. We'll take a look at this again. Thanks.
08-06-2023

This bug was previously closed as a duplicate of: JDK-8307958 Metaspace verification is slow causing extreme class unloading times The fix for JDK-8307958 was integrated in jdk-21+25-2115. This latest sighting is in jdk-21+26-2217-tier8. [~stefank] - Would you mind taking another look when you get the chance?
07-06-2023

This is caused by the same problem as JDK-8307958. Generational ZGC delays class unloading, which adds more classes, which makes class unloading verification take longer time, which adds more classes before next class unloading, end so on. And at some point the system has so many classes that the buffer overflows.
15-05-2023