JDK-8359960 : Regression ~3% on multiple Micros-SystemGC-ZGC only on Mac aarch64
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 25
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: os_x
  • CPU: aarch64
  • Submitted: 2025-06-18
  • Updated: 2025-07-07
JDK 26
26: Unresolved
Related Reports
Causes :  
Relates :  
Description
Appeared in retriage for 25-b26.
Seems to be related to JDK-8355003

org.openjdk.bench.vm.gc.systemgc.AllDead.gc -3.68%
org.openjdk.bench.vm.gc.systemgc.DifferentObjectSizesArray.gc -3.49%
Comments
Igor asked me to take a look. I built benchmarks.jar from jdk-25+26, so that we are sure we are running the same benchmark code. Looking at the benchmark and its results, I immediately note two things.
First, I see the benchmark executes a single shot per fork. As such, I believe the benchmark really tests the cost of the initial GC, which probably drags a lot of (potentially non-benchmark-related) objects through new (possibly awkwardly wired, despite +AlwaysPreTouch) memory. The first iteration is 80 ms/op for me here, and the second one is -- whoosh -- only 3 ms/op! Even if there is a regression there, I don't think it reflects a performance hit users would see in the wild.
Second, and I think that is related, the benchmark is really, really noisy.
That said, it does look like a regression between jdk-25+25 and jdk-25+26:
  # java -jar benchmarks.jar AllDead.gc --jvmArgsAppend "-XX:+UseZGC" -wf 50 -f 200
  jdk-25+24: 75,811 ± 1,228 ms/op
  jdk-25+25: 76,142 ± 1,369 ms/op
  jdk-25+26: 79,721 ± 1,547 ms/op
  jdk-25+27: 79,013 ± 2,127 ms/op
  mainline:  80,337 ± 1,617 ms/op
But I would be hard-pressed to relate it to the AOT work. There are plenty of ZGC changes between these two tags. Bisection is fairly hard given the benchmark noise. But I can see there is no regression right at the AOT Profiling integration; it actually improves a little:
  e3f85c961b4c1e5e01aedf3a0f4e1b0e6ff457fd:   76,183 ± 1,401 ms/op
  e3f85c961b4c1e5e01aedf3a0f4e1b0e6ff457fd~1: 76,688 ± 1,233 ms/op
Do we have concrete evidence that the AOT Profiling integration caused it? If not, I think we should fix the benchmarks at the very least (JDK-8361520), and *maybe* then follow up on ZGC performance.
07-07-2025
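
Editorial sketch: the arithmetic behind "it does look like a regression" in the comment above can be checked with a few lines of Java. The class and method names (BenchmarkDelta, relativeChangePercent, intervalsOverlap) are hypothetical and not part of the JDK or of the benchmark suite. The scores are copied from the comment, with the comma assumed to be a decimal separator. Treating non-overlapping "mean ± error" intervals as a sign of a real difference is only a rough heuristic, not a proper significance test.

```java
import java.util.Locale;

/** Rough sanity check on two "mean ± error" benchmark scores (a sketch, not a significance test). */
final class BenchmarkDelta {

    /** Percentage change of current relative to baseline; positive means slower for ms/op scores. */
    static double relativeChangePercent(double baseline, double current) {
        return (current - baseline) / baseline * 100.0;
    }

    /** True if [meanA - errA, meanA + errA] and [meanB - errB, meanB + errB] share any point. */
    static boolean intervalsOverlap(double meanA, double errA, double meanB, double errB) {
        return (meanA - errA) <= (meanB + errB) && (meanB - errB) <= (meanA + errA);
    }

    public static void main(String[] args) {
        // Scores copied from the comment above; decimal comma read as a decimal point (assumption).
        double b25 = 76.142, b25Err = 1.369; // jdk-25+25
        double b26 = 79.721, b26Err = 1.547; // jdk-25+26

        System.out.println(String.format(Locale.ROOT, "change: %+.2f%%",
                relativeChangePercent(b25, b26)));
        System.out.println("intervals overlap: " + intervalsOverlap(b25, b25Err, b26, b26Err));
    }
}
```

The same two helpers can be applied to any other pair of rows, for example jdk-25+26 against jdk-25+27, to see which steps fall within the reported noise.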

These are the normal promo benchmarks, not running AOT or any AOT flags at all, so it seems the introduction of the code somehow affected the GC performance.
20-06-2025

[~ecaspole] please clarify whether `AOTClassLinking` or other `AOT` flags are used, or whether this regression occurs without AOT.
20-06-2025

Targeting this to JDK 26 for now. [~iveresov], [~kvn] feel free to raise priority and re-target to JDK 25 if appropriate.
20-06-2025

ILW = Small performance regression, GC microbenchmark with ZGC on Mac AArch64, no workaround = MLH = P4
20-06-2025

Igor, could you please have a look?
19-06-2025