JDK-8332225 : Fix failed for 8328744:Parallel GC throws OOM before heap is fully expanded
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 22.0.2,23
  • Priority: P2
  • Status: Closed
  • Resolution: Not an Issue
  • Submitted: 2024-05-14
  • Updated: 2024-05-20
  • Resolved: 2024-05-20
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 23
23Resolved
Related Reports
Relates :  
Relates :  
Description
Ran attached test case with following command line:
-Xmx2G -XX:+UseParallelGC -XX:NativeMemoryTracking=summary -XX:+UnlockDiagnosticVMOptions -XX:+PrintNMTStatistics gc

After test is terminated with OOM, NMT statistics shows:
- Java Heap (reserved=2147483648, committed=1270874112)
                            (mmap: reserved=2147483648, committed=1270874112, peak=2147483648)

 The heap has not yet fully expanded.


Verification Results from different platforms:

Mac-aarch64
Before fix-22.0.2+3 - FAIL
After fix-22.0.2+4 - PASS

Before fix 23-b15 - FAIL
After fix 23-b16 - PASS

Result from fixed version 23-b16  in Mac-aarch64
 Java Heap (reserved=2147483648, committed=2147483648)
                            (mmap: reserved=2147483648, committed=2147483648, at peak)
Heap is fully expanded.

Linux-aarch64
Fixed version 22.0.2+4 - FAIL
Fixed version 23-b16 - FAIL

linux-x64
Fixed version 22.0.2+4 - FAIL
Fixed version 23-b16 - FAIL

Windows
Fixed version 22.0.2+4 - FAIL
Fixed version 23-b16 - FAIL
When we tried running multiple times in windows platform, it passed once out of 6times. 
Comments
The symptom is a result of parallel GC that shrinks heap too aggressively, it will be addressed by JDK-8332531.
20-05-2024

> to address aggressiveness resizing heap That sounds like an enhancement, while this is or at least believed to be a bug by the reporter. Maybe this can be closed as not-a-issue, and you can create a new ticket on after-gc-resizing. Just my 2c.
20-05-2024

Ah, make sense. Actually, next item on my list is to address aggressiveness resizing heap by Parallel GC. Our application is transaction based. Upon finishing a transaction, it is likely to release quite amount of memory. If a GC happens, it likely shrinks heap. Once a new transaction starts to execute, heap starts to expand again. Shrinking and expanding heap is expensive and unnecessary, also impacts on pause time (I would like to measure in next a few days) While we can now workaround it by setting -Xms = -Xmx after JDK-8332418, I believe we should still fix it. I may repurpose this CR to address frequently resizing of heap. @ayang what you think?
19-05-2024

> After throwing OOM(?), last GC cycle (44) shows that old gen resized to minimum old size. How is this possible? because the heap should still be full - the test case increases live set monotonically. It's the same behavior for some other GCs as well -- little live set in final GC. For example: ## Serial GC(11) Pause Full (Allocation Failure) 832M->32M(989M) 222.664ms ## G1 GC(38) Pause Full (G1 Compaction Pause) 1023M->3M(500M) 9.123ms I believe it's because when an OOM error is thrown, we exit the main method, which means the stack variable `map` is no longer alive. Regarding the statement "The heap has not yet fully expanded" in the bug-report, it is likely that the Parallel GC is more eager to shrink or uncommit the heap after each GC pause. This behavior is more of a feature than a bug, IMO. In summary, one cannot assume the heap is still fully expanded *after* an OOM error is thrown. The correct invariant is that the heap is fully expanded *before* the OOM error is thrown.
18-05-2024

It seems that Parallel GC does something funny. Run the test case: java -Xmx1g -XX:+UseParallelGC -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -Xlog:gc+heap=trace -XX:+PrintNMTStatistics gc [0.001s][trace][gc,heap] Initial heap size 536870912 [0.001s][trace][gc,heap] Minimum heap size 6815736 [0.007s][debug][gc,heap] Minimum heap 8388608 Initial heap 536870912 Maximum heap 1073741824 [0.007s][trace][gc,heap] 1: Minimum young 2621440 Initial young 178782208 Maximum young 357564416 [0.007s][trace][gc,heap] Minimum old 5767168 Initial old 358088704 Maximum old 716177408 .... [38.173s][debug][gc,heap] GC(43) Heap after GC invocations=44 (full 40): [38.173s][debug][gc,heap] GC(43) PSYoungGen total 327680K, used 306172K [0x00000007eab00000, 0x0000000800000000, 0x0000000800000000) [38.173s][debug][gc,heap] GC(43) eden space 306176K, 99% used [0x00000007eab00000,0x00000007fd5ff3e8,0x00000007fd600000) [38.173s][debug][gc,heap] GC(43) from space 21504K, 0% used [0x00000007fd600000,0x00000007fd600000,0x00000007feb00000) [38.173s][debug][gc,heap] GC(43) to space 21504K, 0% used [0x00000007feb00000,0x00000007feb00000,0x0000000800000000) [38.173s][debug][gc,heap] GC(43) ParOldGen total 699392K, used 699102K [0x00000007c0000000, 0x00000007eab00000, 0x00000007eab00000) [38.173s][debug][gc,heap] GC(43) object space 699392K, 99% used [0x00000007c0000000,0x00000007eaab78b0,0x00000007eab00000) [38.173s][debug][gc,heap] GC(43) Metaspace used 191K, committed 384K, reserved 1114112K [38.173s][debug][gc,heap] GC(43) class space used 4K, committed 128K, reserved 1048576K Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded [38.175s][debug][gc,heap] GC(44) Heap before GC invocations=45 (full 41): [38.175s][debug][gc,heap] GC(44) PSYoungGen total 327680K, used 306176K [0x00000007eab00000, 0x0000000800000000, 0x0000000800000000) [38.175s][debug][gc,heap] GC(44) eden space 306176K, 100% used [0x00000007eab00000,0x00000007fd600000,0x00000007fd600000) [38.175s][debug][gc,heap] GC(44) from space 21504K, 0% used [0x00000007fd600000,0x00000007fd600000,0x00000007feb00000) [38.175s][debug][gc,heap] GC(44) to space 21504K, 0% used [0x00000007feb00000,0x00000007feb00000,0x0000000800000000) [38.175s][debug][gc,heap] GC(44) ParOldGen total 699392K, used 699102K [0x00000007c0000000, 0x00000007eab00000, 0x00000007eab00000) [38.175s][debug][gc,heap] GC(44) object space 699392K, 99% used [0x00000007c0000000,0x00000007eaab78b0,0x00000007eab00000) [38.175s][debug][gc,heap] GC(44) Metaspace used 203K, committed 384K, reserved 1114112K [38.175s][debug][gc,heap] GC(44) class space used 6K, committed 128K, reserved 1048576K [38.205s][info ][gc,heap] GC(44) PSYoungGen: 306176K(327680K)->0K(327680K) Eden: 306176K(306176K)->0K(306176K) From: 0K(21504K)->0K(21504K) [38.205s][info ][gc,heap] GC(44) ParOldGen: 699102K(699392K)->1186K(545280K) [38.205s][debug][gc,heap] GC(44) Heap after GC invocations=45 (full 41): [38.205s][debug][gc,heap] GC(44) PSYoungGen total 327680K, used 0K [0x00000007eab00000, 0x0000000800000000, 0x0000000800000000) [38.205s][debug][gc,heap] GC(44) eden space 306176K, 0% used [0x00000007eab00000,0x00000007eab00000,0x00000007fd600000) [38.205s][debug][gc,heap] GC(44) from space 21504K, 0% used [0x00000007feb00000,0x00000007feb00000,0x0000000800000000) [38.205s][debug][gc,heap] GC(44) to space 21504K, 0% used [0x00000007fd600000,0x00000007fd600000,0x00000007feb00000) [38.205s][debug][gc,heap] GC(44) ParOldGen total 545280K, used 1186K [0x00000007c0000000, 0x00000007e1480000, 0x00000007eab00000) [38.205s][debug][gc,heap] GC(44) object space 545280K, 0% used [0x00000007c0000000,0x00000007c0128978,0x00000007e1480000) [38.205s][debug][gc,heap] GC(44) Metaspace used 203K, committed 384K, reserved 1114112K [38.205s][debug][gc,heap] GC(44) class space used 6K, committed 128K, reserved 1048576K at gc.main(gc.java:10) Native Memory Tracking: Total: reserved=2599508604, committed=1052056188 malloc: 33217148 #37238 mmap: reserved=2566291456, committed=1018839040 - Java Heap (reserved=1073741824, committed=915931136) (mmap: reserved=1073741824, committed=915931136, peak=1073741824) After throwing OOM(?), last GC cycle (44) shows that old gen resized to minimum old size. How is this possible? because the heap should still be full - the test case increases live set monotonically.
17-05-2024