JDK-8280029 : G1: "Overflow during reference processing, can not continue" on x86_32
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 8,11,17,18,19
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2022-01-14
  • Updated: 2023-08-28
  • Resolved: 2022-01-18
JDK 19: 19 b06 (Fixed)
Description
x86_32 intermittently fails jdk/javadoc/doclet/testLinkPlatform/TestLinkPlatform.java. I finally managed to reproduce it on one local machine with:

$ CONF=linux-x86-server-fastdebug make run-test TEST=jdk/javadoc/doclet/testLinkPlatform/TestLinkPlatform.java TEST_VM_OPTS="-XX:ActiveProcessorCount=2" JTREG="REPEAT_COUNT=100"

# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/home/shade/shipilev-jdk/src/hotspot/share/gc/g1/g1ConcurrentMark.cpp:1649), pid=2198496, tid=2198505
#  fatal error: Overflow during reference processing, can not continue. Please increase MarkStackSizeMax (current value: 4194304) and restart.
#

Host: core11, 11th Gen Intel(R) Core(TM) i5-11500 @ 2.70GHz, 12 cores, 30G, Debian GNU/Linux 11 (bullseye)
Time: Fri Jan 14 14:47:04 2022 CET elapsed time: 9.418540 seconds (0d 0h 0m 9s)

---------------  T H R E A D  ---------------

Current thread (0xb4060778):  VMThread "VM Thread" [stack: 0xb4180000,0xb4200000] [id=2198505]

Stack: [0xb4180000,0xb4200000],  sp=0xb41fea64,  free space=506k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xb735f8]  G1ConcurrentMark::weak_refs_work()+0x788
V  [libjvm.so+0xb74185]  G1ConcurrentMark::remark()+0x135
V  [libjvm.so+0xc1af0d]  VM_G1PauseConcurrent::doit()+0x1fd
V  [libjvm.so+0x1799b20]  VM_Operation::evaluate()+0x160
V  [libjvm.so+0x17b87fd]  VMThread::evaluate_operation(VM_Operation*)+0x11d
V  [libjvm.so+0x17b9835]  VMThread::inner_execute(VM_Operation*)+0x415
V  [libjvm.so+0x17b9969]  VMThread::loop()+0xb9
V  [libjvm.so+0x17b9a8a]  VMThread::run()+0xba
V  [libjvm.so+0x16b7c0a]  Thread::call_run()+0xfa
V  [libjvm.so+0x12fcc0b]  thread_native_entry(Thread*)+0x11b
C  [libpthread.so.0+0x80b4]  start_thread+0xe4



Comments
I'm seeing this same failure on Linux x86_32 when building a modified branch from tip. I attached a sample log file.
28-08-2023

Changeset: 1725f77b
Author: Aleksey Shipilev <shade@openjdk.org>
Date: 2022-01-18 14:40:39 +0000
URL: https://git.openjdk.java.net/jdk/commit/1725f77bcd6528d56960a0796fcea3725cc98b6a
18-01-2022

Bumping to -XX:MarkStackSize=64K (from 32K) reliably avoids this bug. I believe this is a bona fide problem in G1: the queue overflow happens during the Remark pause, when it is too late to resize the queues and restart the concurrent mark. The error message is a bit misleading; it should also mention the default size (MarkStackSize) as an option to try.

[8.961s][info ][gc ] GC(27) Concurrent Mark Cycle
[9.038s][info ][gc,start] GC(27) Pause Remark
; now in G1CMMarkStack::par_push_chunk, that would do set_overflow(true) next
[9.082s][debug][gc ] GC(27) Cannot allocate, _hwm >= _chunk_capacity; _hwm = 32, _chunk_capacity = 32 ;
[9.082s][debug][gc ] GC(27) Cannot allocate, capacity = 32, max = 4096
17-01-2022
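The chunk counts in the debug output above (capacity = 32, max = 4096) and the reported mark stack size of 32736 line up with the flag values under one assumption: that a chunk spans 1024 entry-sized slots, one of which is used for the link to the next chunk. That layout is not stated in this report; it is assumed here because it reproduces the logged numbers. A minimal sketch of the arithmetic, with hypothetical helper names:

```python
# Assumption (not stated in the report): one chunk spans 1024 entry-sized
# slots, and one slot holds the link to the next chunk, so 1023 slots
# carry mark stack entries.
SLOTS_PER_CHUNK = 1024
ENTRIES_PER_CHUNK = SLOTS_PER_CHUNK - 1


def chunks_for(flag_value: int) -> int:
    """Number of whole chunks that a MarkStackSize-style value buys."""
    return flag_value // SLOTS_PER_CHUNK


def usable_entries(chunks: int) -> int:
    """Entries that fit into the given number of chunks."""
    return chunks * ENTRIES_PER_CHUNK


if __name__ == "__main__":
    for name, value in [("MarkStackSize=32K", 32 * 1024),
                        ("MarkStackSize=64K", 64 * 1024),
                        ("MarkStackSizeMax=4194304", 4194304)]:
        chunks = chunks_for(value)
        print(f"{name}: {chunks} chunks, {usable_entries(chunks)} usable entries")
```

Under that assumption, the 32K default yields 32 chunks and 32736 usable entries, and the configured maximum yields 4096 chunks, matching the values in the logs.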

Wait, so the failure happens when the actual mark stack size is 32736, while MarkStackSizeMax is configured at 4194304. That implies the mark stack is not growing up to "max", and we run out of the initial 32K of stack? That would explain why my previous experiment with -XX:MarkStackSizeMax=512M did not solve it. But now I tried with -XX:MarkStackSize=4M (not "Max"), and the test passes! That's a big clue.
17-01-2022
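The observation above can be illustrated with a toy model. This is a sketch only, not HotSpot code: the class, the method names, the doubling growth policy, and the figure of 40 chunks needed are all hypothetical. It assumes what the comments in this report suggest, namely that the stack can only grow when marking is restarted, and that no such restart is available while references are processed in the pause.

```python
class ToyMarkStack:
    """Toy model of a chunked mark stack; not HotSpot code."""

    def __init__(self, initial_chunks: int, max_chunks: int) -> None:
        self.capacity = initial_chunks   # chunks currently available
        self.max_chunks = max_chunks     # upper bound for growth
        self.hwm = 0                     # chunks handed out so far
        self.overflown = False

    def push_chunk(self) -> bool:
        """Hand out one chunk, or record an overflow when none is left."""
        if self.hwm >= self.capacity:
            self.overflown = True
            return False
        self.hwm += 1
        return True

    def expand_and_restart(self) -> None:
        """Recovery path, assumed to be usable only outside the pause.

        The doubling policy is an assumption made for this sketch.
        """
        self.capacity = min(self.capacity * 2, self.max_chunks)
        self.hwm = 0
        self.overflown = False


def process_references(stack: ToyMarkStack, chunks_needed: int) -> str:
    """Model of the pause: an overflow here has no recovery path."""
    for _ in range(chunks_needed):
        if not stack.push_chunk():
            return "fatal: overflow during reference processing"
    return "ok"


if __name__ == "__main__":
    # 40 is a hypothetical demand, somewhere between 32 and 64 chunks.
    print(process_references(ToyMarkStack(32, 4096), 40))        # default sizes
    print(process_references(ToyMarkStack(32, 512 * 1024), 40))  # larger max only
    print(process_references(ToyMarkStack(64, 4096), 40))        # larger initial size
```

In this model, raising only the maximum changes nothing because the maximum is never consulted during the pause, while raising the initial capacity avoids the overflow. That mirrors the experiments reported in this issue, but the model does not show where the real code fails to grow the stack.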

At the time of crash, mark stack is full of ZipFileSystem$IndexNode instances:

[9.001s][info][gc] GC(20) Pause Young (Concurrent Start) (G1 Humongous Allocation) 209M->152M(306M) 58.205ms
[9.001s][info][gc] GC(21) Concurrent Mark Cycle
[9.125s][info][gc] GC(21) Object: 0xb762f180, jdk.nio.zipfs.ZipFileSystem$IndexNode
[9.125s][info][gc] GC(21) Object: 0xb762f120, jdk.nio.zipfs.ZipFileSystem$IndexNode
...
[9.210s][info][gc] GC(21) Object: 0xba276480, jdk.nio.zipfs.ZipFileSystem$IndexNode
[9.210s][info][gc] GC(21) Object: 0xba276378, jdk.nio.zipfs.ZipFileSystem$IndexNode
# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc: SuppressErrorAt=/g1ConcurrentMark.cpp:1681
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (/home/shade/shipilev-jdk/src/hotspot/share/gc/g1/g1ConcurrentMark.cpp:1681), pid=3946432, tid=3946442
# fatal error: Overflow during reference processing, can not continue. Please increase MarkStackSizeMax (current value: 4194304) and restart. Mark stack size: 32736
17-01-2022
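To check which classes dominate such a dump, the "Object:" lines can be tallied by class name. The sketch below assumes the line format shown in the excerpt above, which looks like ad hoc debugging output rather than standard G1 logging; the function name and the sample list are hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
#   [9.125s][info][gc] GC(21) Object: 0xb762f180, jdk.nio.zipfs.ZipFileSystem$IndexNode
# The format is taken from the log excerpt in this report.
OBJECT_LINE = re.compile(r"Object: 0x[0-9a-fA-F]+, (\S+)")

# The four "Object:" lines quoted in this report, plus one unrelated line.
SAMPLE = [
    "[9.001s][info][gc] GC(21) Concurrent Mark Cycle",
    "[9.125s][info][gc] GC(21) Object: 0xb762f180, jdk.nio.zipfs.ZipFileSystem$IndexNode",
    "[9.125s][info][gc] GC(21) Object: 0xb762f120, jdk.nio.zipfs.ZipFileSystem$IndexNode",
    "[9.210s][info][gc] GC(21) Object: 0xba276480, jdk.nio.zipfs.ZipFileSystem$IndexNode",
    "[9.210s][info][gc] GC(21) Object: 0xba276378, jdk.nio.zipfs.ZipFileSystem$IndexNode",
]


def tally_classes(log_lines):
    """Count how often each class name appears in 'Object:' log lines."""
    counts = Counter()
    for line in log_lines:
        match = OBJECT_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for class_name, count in tally_classes(SAMPLE).most_common():
        print(f"{count:8d}  {class_name}")
```

Run against a full log, the same function would show whether IndexNode instances account for nearly all entries or only for the part of the dump quoted here.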

G1 log that precedes the crash is:

[8.508s][info][gc] GC(17) Pause Young (Normal) (G1 Evacuation Pause) 219M->144M(306M) 86.416ms
[8.760s][info][gc] GC(18) Pause Young (Concurrent Start) (G1 Humongous Allocation) 187M->141M(306M) 63.117ms
[8.760s][info][gc] GC(19) Concurrent Undo Cycle
[8.762s][info][gc] GC(19) Concurrent Undo Cycle 2.116ms
[8.846s][info][gc] GC(20) Pause Young (Concurrent Start) (G1 Humongous Allocation) 157M->136M(306M) 19.597ms
[8.846s][info][gc] GC(21) Concurrent Undo Cycle
[8.847s][info][gc] GC(21) Concurrent Undo Cycle 0.767ms
[9.016s][info][gc] GC(22) Pause Young (Concurrent Start) (G1 Humongous Allocation) 169M->145M(306M) 34.107ms
[9.016s][info][gc] GC(23) Concurrent Mark Cycle
<crash>
17-01-2022

A pull request was submitted for review.
URL: https://git.openjdk.java.net/jdk/pull/7109
Date: 2022-01-17 12:24:05 +0000
17-01-2022

It is quite hard to believe, but bisect shows it started with:

commit 09831e7aa47ebe41eab2f3014ebbacf338097ef6
Author: Joe Darcy <darcy@openjdk.org>
Date: Thu Dec 9 17:01:59 2021 +0000

    8273146: Start of release updates for JDK 19
    8277511: Add SourceVersion.RELEASE_19
    8277513: Add source 19 and target 19 to javac

    Reviewed-by: dholmes, alanb, erikj, iris, mikael, ihse
14-01-2022

Bumping to -XX:MarkStackSizeMax=512M (default for x86_64) does not help, which suggests it is not just "running out of stack". Something is really fishy here.
14-01-2022