JDK-8359960 : Regression ~3% on multiple Micros-SystemGC-ZGC only on Mac aarch64
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 25
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: os_x
  • CPU: aarch64
  • Submitted: 2025-06-18
  • Updated: 2025-08-04
JDK 26: Unresolved
Related Reports
Causes :  
Relates :  
Description
Appeared in retriage for 25-b26.
Seems to be related to JDK-8355003.

org.openjdk.bench.vm.gc.systemgc.AllDead.gc -3.68%
org.openjdk.bench.vm.gc.systemgc.DifferentObjectSizesArray.gc -3.49%
Comments
I reran these today with CI 3123 and 3124 (which is JDK-8355003), using the new, improved System.gc JMH benchmarks from JDK-8361520. There are 2 cases with big differences. The others are insignificant or in the 1% range. These were run on M1 Minis.
  "vm.gc.systemgc.AllDead.gc", 4.12, 4.37, -6.041
  "vm.gc.systemgc.NoObjects.gc", 3.87, 4.50, -16.212
  -XX:+UseZGC -Xmx5g -Xms5g -Xmn3g -XX:+AlwaysPreTouch -XX:+ZCollectionIntervalOnly
01-08-2025

[~ecaspole] Could you provide the numbers from your runs in a public comment?
31-07-2025

> [~shade] Did you run your experiments on MacOS AArch64?
Yes, I ran experiments on my Mac M1 with a self-built JDK and basically default (benchmark) options. I did not test with `-XX:+ZCollectionIntervalOnly`, which Axel mentioned in a recent comment. Since you can reproduce it, can you run the AllDead.gc test and show the full JVM options and results for these two pairs of commits?
  a) jdk-25+25 vs jdk-25+26
  b) e3f85c961b4c1e5e01aedf3a0f4e1b0e6ff457fd vs e3f85c961b4c1e5e01aedf3a0f4e1b0e6ff457fd~1
31-07-2025

[~shade] Did you run your experiments on MacOS AArch64? [~ecaspole] ran with your updated version of the benchmark and we still see a clear regression when the AOT code went in.
29-07-2025

Just a note on the initial results w.r.t. JDK-8361520: all performance testing was run with `-XX:+ZCollectionIntervalOnly`, which eliminates the largest source of variance for ZGC (namely, the System.gc call being made while a warmup GC is in flight). After the warmup iterations added in JDK-8361520, that should no longer be necessary.
28-07-2025

ZGC folks, could you please take a look and decide what to do with this? Looks like Alexey fixed the benchmarks.
10-07-2025

Igor asked me to take a look. I built benchmarks.jar from jdk-25+26, so that we are sure we are running the same benchmark code. Looking at the benchmark and its results, I immediately note two things.
First, I see the benchmark executes a single shot per fork. As such, I believe the benchmark really tests the cost of the initial GC, which probably drags a lot of (potentially non-benchmark-related) objects through new (possibly awkwardly wired, despite +AlwaysPreTouch) memory. The first iteration is 80 ms/op for me here, and the second one is -- whoosh -- only 3 ms/op! Even if there is a regression there, I don't think it reflects a performance hit users would see in the wild.
Second, and I think that is related, the benchmark is really, really noisy. That said, it does look like a regression between jdk-25+25 and jdk-25+26:
  # java -jar benchmarks.jar AllDead.gc --jvmArgsAppend "-XX:+UseZGC" -wf 50 -f 200
  jdk-25+24: 75,811 ± 1,228 ms/op
  jdk-25+25: 76,142 ± 1,369 ms/op
  jdk-25+26: 79,721 ± 1,547 ms/op
  jdk-25+27: 79,013 ± 2,127 ms/op
  mainline: 80,337 ± 1,617 ms/op
But I would be hard-pressed to relate it to the AOT work. There are plenty of ZGC changes between these two tags. Bisection is fairly hard given the benchmark noise. But I can see there is no regression right at the AOT Profiling integration; it actually improves a little:
  e3f85c961b4c1e5e01aedf3a0f4e1b0e6ff457fd: 76,183 ± 1,401 ms/op
  e3f85c961b4c1e5e01aedf3a0f4e1b0e6ff457fd~1: 76,688 ± 1,233 ms/op
Do we have concrete evidence that the AOT Profiling integration caused it? If not, I think we should fix the benchmarks at the very least (JDK-8361520), and *maybe* then follow up on ZGC performance.
07-07-2025
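
One rough way to read the scores in the 07-07-2025 comment is to check whether the reported `score ± error` ranges overlap. The sketch below is illustrative only and is not part of the benchmark suite: the `IntervalOverlap` class and `Score` record are hypothetical names, the commas in the reported scores are read as decimal separators, and range overlap is used as a coarse screen, not as a formal significance test.

```java
public class IntervalOverlap {
    /** A benchmark score with its reported error margin, both in ms/op. */
    record Score(String label, double mean, double error) {
        double low()  { return mean - error; }
        double high() { return mean + error; }
    }

    /** True when the two [mean - error, mean + error] ranges share any point. */
    static boolean overlaps(Score a, Score b) {
        return a.low() <= b.high() && b.low() <= a.high();
    }

    public static void main(String[] args) {
        // Values copied from the comment; "76,142" is assumed to mean 76.142.
        Score b25 = new Score("jdk-25+25", 76.142, 1.369);
        Score b26 = new Score("jdk-25+26", 79.721, 1.547);

        // The AOT Profiling integration commit and its parent (commit~1).
        Score aot       = new Score("AOT commit", 76.183, 1.401);
        Score aotParent = new Score("AOT commit~1", 76.688, 1.233);

        System.out.println(b25.label() + " vs " + b26.label()
                + " overlap: " + overlaps(b25, b26));
        System.out.println(aotParent.label() + " vs " + aot.label()
                + " overlap: " + overlaps(aotParent, aot));
    }
}
```

Under these assumptions the jdk-25+25 and jdk-25+26 ranges are disjoint, while the ranges for the AOT Profiling integration commit and its parent overlap, which is consistent with the reading given in that comment.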

These are the normal promo benchmarks, not running AOT or any AOT flags at all, so it seems the introduction of the code somehow affected GC performance.
20-06-2025

[~ecaspole] please clarify whether `AOTClassLinking` or other `AOT` flags are used, or whether this regression appears without AOT.
20-06-2025

Targeting this to JDK 26 for now. [~iveresov], [~kvn] feel free to raise priority and re-target to JDK 25 if appropriate.
20-06-2025

ILW = Small performance regression, GC microbenchmark with ZGC on Mac AArch64, no workaround = MLH = P4
20-06-2025

Igor, could you please have a look?
19-06-2025