JDK-8325202 : gc/g1/TestMarkStackOverflow.java intermittently crash: G1CMMarkStack::ChunkAllocator::allocate_new_chunk
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 23
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • OS: linux
  • CPU: x86_64
  • Submitted: 2024-02-05
  • Updated: 2024-07-04
  • Resolved: 2024-02-23
JDK 23: 23 b12 (Fixed)
Description
test command:

export test=test/hotspot/jtreg/gc/g1/TestMarkStackOverflow.java
function runJtreg() { jtreg -ea -esa -timeoutFactor:4 -v:fail,error,time,nopass -nr -w $dir/index-$1 $test &> $dir/$1.log ; if [[ 0 -ne $? ]] ; then echo -n "$1 " ; else rm -rf $dir/index-$1 $dir/$1.log ; fi ; }
export -f runJtreg
export dir="tmp-jtreg-"`basename ${test##* } .java | sed "s|#|_|"`
rm -rf $dir ; mkdir -p $dir
time seq 100000 | xargs -i -n 1 -P `nproc` bash -c "runJtreg {}"
echo total fail number: `ls $dir/*.log 2> /dev/null | wc | awk '{print $1}'`


result:
command: main -XX:ActiveProcessorCount=2 -XX:MarkStackSize=1 -Xmx250m gc.g1.TestMarkStackOverflow
reason: User specified action: run main/othervm -XX:ActiveProcessorCount=2 -XX:MarkStackSize=1 -Xmx250m gc.g1.TestMarkStackOverflow 
started: Sun Feb 04 10:25:52 CST 2024
Mode: othervm [/othervm specified]
finished: Sun Feb 04 10:25:56 CST 2024
elapsed time (seconds): 3.987
configuration:
STDOUT:
Used mem 18.47 MB
Used mem 36.23 MB
Used mem 53.43 MB
Used mem 70.63 MB
Used mem 87.83 MB
Used mem 105.03 MB
Used mem 122.87 MB
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f161b429e51, pid=340136, tid=340141
#
# JRE version: OpenJDK Runtime Environment (23.0) (build 23)
# Java VM: OpenJDK 64-Bit Server VM (23, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x7e9e51]  G1CMMarkStack::ChunkAllocator::allocate_new_chunk()+0xa1
#
# Core dump will be written. Default location: /var/tmp/tone/run/jtreg/jt-work/hotspot_jtreg/gc/g1/TestMarkStackOverflow/core.340136
#
# An error report file with more information is saved as:
# /var/tmp/tone/run/jtreg/jt-work/hotspot_jtreg/gc/g1/TestMarkStackOverflow/hs_err_pid340136.log
#
# If you would like to submit a bug report, please visit:
#   mailto:yansendao.ysd@alibaba-inc.com
#


Reproduction rate: roughly 1 failure in 100,000 runs.
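
To give a feel for how many runs are needed to reproduce a failure this rare, the arithmetic can be sketched as follows. This assumes the stated rate is an independent per-run failure probability of 1/100,000, which the report does not confirm; the function names are hypothetical. Under that assumption, 100,000 runs hit at least one failure about 63% of the time, and roughly 300,000 runs are needed for a 95% chance.

```python
import math


def chance_of_at_least_one_failure(p, runs):
    """Probability that at least one of `runs` independent runs fails."""
    return 1.0 - (1.0 - p) ** runs


def runs_needed(p, confidence):
    """Smallest run count that gives at least `confidence` chance of one failure."""
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-p))


# Assumption: "1/100k" is read as a per-run failure probability of 1e-5.
p = 1e-5
print(f"chance of a failure in 100000 runs: {chance_of_at_least_one_failure(p, 100_000):.3f}")
print(f"runs needed for a 95% chance: {runs_needed(p, 0.95)}")
```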

Failure Mode:
Assume the current stack capacity is 1:

1: Thread 1 obtains cur_idx = 1 and finds that the associated bucket is not allocated, which indicates insufficient capacity. It then attempts to double the capacity to 2.
2: Thread 2 obtains cur_idx = 2 and finds that the bucket associated with index 2 is also not allocated, so it also tries to expand the stack.
3: Thread 1 is delayed, so Thread 2 acquires the lock first and calls expand(), which doubles the capacity from 1 to 2.
4: After expand() returns, the bucket associated with cur_idx = 2 is still unallocated, because a capacity of 2 only covers indices 0 and 1. When Thread 2 then accesses this bucket, it crashes.

The problem is that expand() is called without considering which index the calling thread actually needs. It grows the capacity based only on the current size of the stack, so when several threads attempt to expand concurrently, a single doubling may not reach the index that a thread has already claimed.
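
The interleaving above can be replayed deterministically with a small toy model. This is a sketch under stated assumptions, not HotSpot code: the class and function names (ChunkAllocator, bucket_of, grow_once, grow_to_index, run_interleaving) are hypothetical, and the bucket layout (bucket 0 backs index 0, bucket k backs indices 2^(k-1) up to 2^k - 1) is assumed so that it matches the steps described. grow_once mimics a single doubling from the current capacity; grow_to_index is only one possible remedy and is not claimed to be what the changeset does.

```python
class ChunkAllocator:
    """Toy model of a bucketed chunk allocator (hypothetical, not HotSpot code)."""

    def __init__(self):
        self.capacity = 1            # chunk slots currently backed by a bucket
        self.size = 0                # next index to hand out (fetch-and-add counter)
        self.buckets = {0: [None]}   # bucket 0 backs index 0

    @staticmethod
    def bucket_of(idx):
        # Assumed layout: bucket 0 -> index 0, bucket k -> indices [2**(k-1), 2**k).
        return 0 if idx == 0 else idx.bit_length()

    def claim_index(self):
        idx = self.size
        self.size += 1
        return idx

    def expand_by_doubling(self):
        # Grows from the current capacity only; the caller's index is not consulted.
        self.buckets[self.bucket_of(self.capacity)] = [None] * self.capacity
        self.capacity *= 2

    def place_chunk(self, idx, grow):
        bucket = self.bucket_of(idx)
        if bucket not in self.buckets:   # in the scenario this check runs under the lock
            grow(self, idx)
        if bucket not in self.buckets:
            # Stands in for the SIGSEGV on an unallocated bucket.
            raise LookupError(f"bucket {bucket} for index {idx} is not allocated")
        start = 0 if bucket == 0 else 2 ** (bucket - 1)
        self.buckets[bucket][idx - start] = f"chunk-{idx}"


def grow_once(alloc, idx):
    """Single doubling, regardless of which index triggered it."""
    alloc.expand_by_doubling()


def grow_to_index(alloc, idx):
    """Keep doubling until the bucket for the requested index exists."""
    while alloc.bucket_of(idx) not in alloc.buckets:
        alloc.expand_by_doubling()


def run_interleaving(grow):
    alloc = ChunkAllocator()
    alloc.claim_index()           # index 0 is already in use; capacity is 1
    idx1 = alloc.claim_index()    # Thread 1 obtains 1
    idx2 = alloc.claim_index()    # Thread 2 obtains 2
    try:
        alloc.place_chunk(idx2, grow)   # Thread 2 wins the lock and expands first
        alloc.place_chunk(idx1, grow)   # Thread 1 proceeds afterwards
    except LookupError:
        return "crash"
    return "ok"


print(run_interleaving(grow_once))
print(run_interleaving(grow_to_index))
```

In this model the single-doubling variant fails for Thread 2 because only bucket 1 gets allocated, while growing until the requested index is covered lets both threads place their chunks.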



Comments
This looks like a regression from JDK-8280087, right?
27-02-2024

Changeset: 11fdca06
Author: Ivan Walulya <iwalulya@openjdk.org>
Date: 2024-02-23 10:48:50 +0000
URL: https://git.openjdk.org/jdk/commit/11fdca06345542b8d5e54feb1d16f17c2bcb1a82
23-02-2024

A pull request was submitted for review.
URL: https://git.openjdk.org/jdk/pull/17912
Date: 2024-02-19 11:04:16 +0000
19-02-2024