JDK-8262465 : Very long compilation times and high memory consumption in C2 debug builds
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 11,16,17
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2021-02-26
  • Updated: 2021-04-08
  • Resolved: 2021-03-23
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 11 JDK 17
11.0.12-oracleFixed 17 b15Fixed
Related Reports
Relates :  
Description
This has been observed with the latest changes to the JTreg test case  test/hotspot/jtreg/compiler/intrinsics/TestRotate.java.
Make sure to use the following version of this test case, as it has been changed with JDK-8256438:
https://github.com/openjdk/jdk/blob/382e38dd246596ec94a1f1ce0e0f9e87f53366c7/test/hotspot/jtreg/compiler/intrinsics/TestRotate.java

I don't believe the problem has been introduced with this change, but the change to the test triggers this behavior.

JIT compilation of the main method of this JTreg test consumes an unusual amount of memory and takes a very long time.
Judging from NMT, C2 consumes up to 2978MB of memory (on x86_64 linux):

Total: reserved=12595MB, committed=3576MB
-                 Java Heap (reserved=7952MB, committed=500MB)
                            (mmap: reserved=7952MB, committed=500MB) 
 
-                    Thread (reserved=30MB, committed=1MB)
                            (thread #30)
                            (stack: reserved=30MB, committed=1MB)
 
-                      Code (reserved=242MB, committed=8MB)
                            (mmap: reserved=242MB, committed=7MB) 
 
-                        GC (reserved=355MB, committed=79MB)
                            (malloc=27MB #851) 
                            (mmap: reserved=328MB, committed=51MB) 
 
-                  Compiler (reserved=2978MB, committed=2978MB)
                            (arena=2978MB #13)
 
-                    Symbol (reserved=2MB, committed=2MB)
                            (malloc=1MB #24825) 
                            (arena=1MB #1)
 
-    Native Memory Tracking (reserved=2MB, committed=2MB)
                            (tracking overhead=1MB)
 
-               Arena Chunk (reserved=1MB, committed=1MB)
                            (malloc=1MB) 
 
-                 Metaspace (reserved=8MB, committed=4MB)
                            (mmap: reserved=8MB, committed=4MB) 
 
-                   Unknown (reserved=1024MB, committed=0MB)
                            (mmap: reserved=1024MB, committed=0MB) 

Judging from the compilation log, compilation of the main method takes a huge amount of time (if compilation is restricted to this main method, using -XX:CompileCommand):
<task compile_id="41" compile_kind="osr" method="compiler.intrinsics.TestRotate main ([Ljava/lang/String;)V" bytes="1568" count="1" backedge_count="101377" iicount="1" osr_bci="774" stamp="18.349">
<task_done success="1" nmsize="719360" count="1" backedge_count="1093588" inlined_bytes="59937" stamp="194.437"/>

You can compile and run the test case directly, without JTreg:
$ javac test/hotspot/jtreg/compiler/intrinsics/TestRotate.java
$ java -cp test/hotspot/jtreg compiler.intrinsics.TestRotate

Compilation can be limited to only the main methoid using -XX:CompileCommand='compileonly compiler.intrinsics.TestRotate::main'

Comments
Fix Request (11u) This improves the pathological cases with debug builds by guarding the costly verification step with a compiler flag. Patch does not apply cleanly to 11u due to different context. 11u RFR (acked by phh): https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2021-April/005595.html
07-04-2021

Changeset: 2da882c0 Author: Nils Eliasson <neliasso@openjdk.org> Date: 2021-03-23 07:45:48 +0000 URL: https://git.openjdk.java.net/jdk/commit/2da882c0
23-03-2021

A small bit of debug code cause a huge overhead Product C2 Compile Time: 17,885 s Parse: 0,046 s Optimize: 5,417 s Escape Analysis: 0,024 s Conn Graph: 0,006 s Macro Eliminate: 0,053 s GVN 1: 0,013 s Incremental Inline: 0,238 s IdealLoop: 0,000 s IGVN: 0,096 s Inline: 0,115 s Prune Useless: 0,026 s Other: 0,000 s Renumber Live: 0,007 s Vector: 0,000 s Box elimination: 0,000 s IGVN: 0,000 s Prune Useless: 0,000 s IdealLoop: 4,728 s IdealLoop Verify: 0,000 s Cond Const Prop: 0,004 s GVN 2: 0,003 s Macro Expand: 0,181 s Barrier Expand: 0,000 s Graph Reshape: 0,006 s Other: 0,214 s Matcher: 0,224 s Post Selection Cleanup: 0,000 s Scheduler: 0,244 s Regalloc: 11,836 s Ctor Chaitin: 0,001 s Build IFG (virt): 0,034 s Build IFG (phys): 0,592 s Compute Liveness: 0,705 s Regalloc Split: 6,572 s Postalloc Copy Rem: 0,189 s Merge multidefs: 0,014 s Fixup Spills: 0,003 s Compact: 0,002 s Coalesce 1: 0,579 s Coalesce 2: 0,106 s Coalesce 3: 0,046 s Cache LRG: 0,006 s Simplify: 1,975 s Select: 0,767 s Other: 0,244 s Block Ordering: 0,015 s Peephole: 0,003 s Code Emission: 0,093 s Insn Scheduling: 0,000 s Build OOP maps: 0,041 s Code Installation: 0,009 s Total compiled methods : 695 methods Standard compilation : 691 methods On stack replacement : 4 methods Total compiled bytecodes : 155130 bytes Standard compilation : 52719 bytes On stack replacement : 102411 bytes Average compilation speed : 8508 bytes/s nmethod code size : 2697456 bytes nmethod total size : 2014496 bytes Debug C2 Compile Time: 294,300 s Parse: 0,161 s Optimize: 9,156 s Escape Analysis: 0,077 s Conn Graph: 0,013 s Macro Eliminate: 0,137 s GVN 1: 0,031 s Incremental Inline: 0,507 s IdealLoop: 0,000 s IGVN: 0,118 s Inline: 0,360 s Prune Useless: 0,029 s Other: 0,000 s Renumber Live: 0,012 s Vector: 0,000 s Box elimination: 0,000 s IGVN: 0,000 s Prune Useless: 0,000 s IdealLoop: 7,458 s IdealLoop Verify: 0,194 s Cond Const Prop: 0,007 s GVN 2: 0,008 s Macro Expand: 0,315 s Barrier Expand: 0,000 s Graph Reshape: 0,014 s Other: 0,532 s Matcher: 0,601 s Post Selection Cleanup: 0,007 s Scheduler: 0,501 s Regalloc: 283,185 s Ctor Chaitin: 0,001 s Build IFG (virt): 0,142 s Build IFG (phys): 6,320 s Compute Liveness: 1,732 s Regalloc Split: 255,514 s Postalloc Copy Rem: 0,478 s Merge multidefs: 0,021 s Fixup Spills: 0,005 s Compact: 0,005 s Coalesce 1: 1,019 s Coalesce 2: 0,185 s Coalesce 3: 0,431 s Cache LRG: 0,014 s Simplify: 10,267 s Select: 5,905 s Other: 1,145 s Block Ordering: 0,015 s Peephole: 0,003 s Code Emission: 0,661 s Insn Scheduling: 0,000 s Build OOP maps: 0,095 s Code Installation: 0,064 s Debug with fix C2 Compile Time: 64,324 s Parse: 0,183 s Optimize: 9,435 s Escape Analysis: 0,084 s Conn Graph: 0,014 s Macro Eliminate: 0,195 s GVN 1: 0,038 s Incremental Inline: 0,568 s IdealLoop: 0,000 s IGVN: 0,137 s Inline: 0,400 s Prune Useless: 0,031 s Other: 0,000 s Renumber Live: 0,014 s Vector: 0,000 s Box elimination: 0,000 s IGVN: 0,000 s Prune Useless: 0,000 s IdealLoop: 7,606 s IdealLoop Verify: 0,182 s Cond Const Prop: 0,007 s GVN 2: 0,008 s Macro Expand: 0,316 s Barrier Expand: 0,000 s Graph Reshape: 0,014 s Other: 0,599 s Matcher: 0,573 s Post Selection Cleanup: 0,007 s Scheduler: 0,495 s Regalloc: 52,948 s Ctor Chaitin: 0,001 s Build IFG (virt): 0,144 s Build IFG (phys): 6,386 s Compute Liveness: 1,697 s Regalloc Split: 25,477 s Postalloc Copy Rem: 0,497 s Merge multidefs: 0,022 s Fixup Spills: 0,005 s Compact: 0,005 s Coalesce 1: 0,992 s Coalesce 2: 0,189 s Coalesce 3: 0,434 s Cache LRG: 0,014 s Simplify: 10,083 s Select: 5,852 s Other: 1,149 s Block Ordering: 0,018 s Peephole: 0,004 s Code Emission: 0,653 s Insn Scheduling: 0,000 s Build OOP maps: 0,096 s Code Installation: 0,059 s
22-03-2021

Increasing the block size improves the IndexSet stats but give no obvious improvement in compile time: Accumulated IndexSet usage statistics: -------------------------------------- Iteration: blocks visited: 915862035 blocks empty: 28,08% bit density (bits/used blocks): 49,73 bit density (bits/all blocks): 35,77 Allocation: blocks allocated: 2071700 blocks used/reused: 2764537
19-03-2021

Accumulated IndexSet usage statistics: -------------------------------------- Iteration: blocks visited: 189405589674 blocks empty: 98,62% bit density (bits/used blocks): 12,57 bit density (bits/all blocks): 0,17 Allocation: blocks allocated: 7952700 blocks used/reused: 10224658
18-03-2021

80% of time is spent in RegAlloc and 50% of RegAlloc is spent in Split. C2 Compile Time: 49,808 s Parse: 0,029 s Optimize: 8,418 s Escape Analysis: 0,023 s Conn Graph: 0,009 s Macro Eliminate: 0,085 s GVN 1: 0,014 s Incremental Inline: 0,354 s IdealLoop: 0,000 s IGVN: 0,125 s Inline: 0,182 s Prune Useless: 0,047 s Other: 0,000 s Renumber Live: 0,013 s Vector: 0,000 s Box elimination: 0,000 s IGVN: 0,000 s Prune Useless: 0,000 s IdealLoop: 7,397 s IdealLoop Verify: 0,000 s Cond Const Prop: 0,005 s GVN 2: 0,003 s Macro Expand: 0,269 s Barrier Expand: 0,000 s Graph Reshape: 0,008 s Other: 0,333 s Matcher: 0,406 s Post Selection Cleanup: 0,000 s Scheduler: 0,336 s Regalloc: 40,482 s Ctor Chaitin: 0,001 s Build IFG (virt): 0,052 s Build IFG (phys): 1,869 s Compute Liveness: 1,532 s Regalloc Split: 21,119 s Postalloc Copy Rem: 0,380 s Merge multidefs: 0,017 s Fixup Spills: 0,003 s Compact: 0,003 s Coalesce 1: 0,836 s Coalesce 2: 0,113 s Coalesce 3: 0,183 s Cache LRG: 0,013 s Simplify: 9,648 s Select: 3,500 s Other: 1,213 s Block Ordering: 0,013 s Peephole: 0,003 s Code Emission: 0,117 s Insn Scheduling: 0,000 s Build OOP maps: 0,055 s Code Installation: 0,006 s
18-03-2021

Note I am fixing the test in JDK-8262950. If anyone wants to follow up on C2 behavior, revert that patch to get the test shape back.
03-03-2021

ILW = Slow compilation and high memory consumption with C2, only observed with single jtreg test, disable C2 compilation of affected method = MMM = P3
02-03-2021