On some machines (x64 and aarch64), compiler/c2/irTests/scalarReplacement/AllocationMergesTests.java fails when run with -XX:-TieredCompilation because some allocations could not be removed.
Playing around with different warm-up counts (e.g. -DWarmup=1000, 2000, 10000), I get a different number of failures. This suggests that on some machines the number of warm-up iterations is enough for the test to pass, while on others it is not. However, even with a very high number of warm-ups (e.g. 10000), I still got 5 failures.
We should find the root cause of why allocations cannot be removed with certain numbers of warm-up iterations, and fix the test or the code accordingly.
Output:
Compilation of Failed Method
----------------------------
1) Compilation of "int compiler.c2.irTests.scalarReplacement.AllocationMergesTests.testNoEscapeWithLoadInLoop_C2(boolean,int,int)":
> Phase "PrintOptoAssembly":
----------------------- MetaData before Compile_id = 374 ------------------------
{method}
- this oop: 0x00007f1a134210d8
- method holder: 'compiler/c2/irTests/scalarReplacement/AllocationMergesTests'
- constants: 0x00007f1a1341c000 constant pool [641] {0x00007f1a1341c000} for 'compiler/c2/irTests/scalarReplacement/AllocationMergesTests' cache=0x00007f1a13424780
- access: 0x0
- flags: 0x5080 queued_for_compilation dont_inline has_loops_flag_init
- name: 'testNoEscapeWithLoadInLoop_C2'
- signature: '(ZII)I'
- max stack: 5
- max locals: 4
- size of params: 4
- method size: 14
- vtable index: 13
- i2i entry: 0x00007f1a98b50a40
- adapters: AHE@0x00007f1aa42061b0: 0xbaaa i2c: 0x00007f1a98bb8600 c2i: 0x00007f1a98bb86f7 c2iUV: 0x00007f1a98bb86c5 c2iNCI: 0x00007f1a98bb8731
- compiled entry 0x00007f1a98bb86f7
- code size: 8
- code start: 0x00007f1a134210c0
- code end (excl): 0x00007f1a134210c8
- method data: 0x00007f1a13499d20
- checked ex length: 0
- linenumber start: 0x00007f1a134210c8
- localvar length: 0
------------------------ OptoAssembly for Compile_id = 374 -----------------------
#
# int ( compiler/c2/irTests/scalarReplacement/AllocationMergesTests:NotNull *, int, int, int )
#
000 N273: # out( B1 ) <- BLOCK HEAD IS JUNK Freq: 1
000 movl rscratch1, [j_rarg0 + oopDesc::klass_offset_in_bytes()] # compressed klass
decode_klass_not_null rscratch1, rscratch1
cmpq rax, rscratch1 # Inline cache check
jne SharedRuntime::_ic_miss_stub
nop # nops to align entry point
nop # 4 bytes pad for loops and calls
020 B1: # out( B12 B2 ) <- BLOCK HEAD IS JUNK Freq: 1
020 # stack bang (304 bytes)
pushq rbp # Save rbp
subq rsp, #64 # Create frame
03a movl [rsp + #12], R8 # spill
03f movl [rsp + #8], RCX # spill
043 movq [rsp + #0], RSI # spill
047 movl [rsp + #16], RDX # spill
04b testl RDX, RDX
04d je B12 P=0.100000 C=-1.000000
053 B2: # out( B13 B3 ) <- in( B1 ) Freq: 0.9
053 # TLS is in R15
053 movq RAX, [R15 + #456 (32-bit)] # ptr
05a movq R10, RAX # spill
05d addq R10, #24 # ptr
061 cmpq R10, [R15 + #472 (32-bit)] # raw ptr
068 jae,u B13 P=0.000100 C=-1.000000
06e B3: # out( B4 ) <- in( B2 ) Freq: 0.89991
06e movq [R15 + #456 (32-bit)], R10 # ptr
075 PREFETCHW [R10 + #192 (32-bit)] # Prefetch allocation into level 1 cache and mark modified
07d movq [RAX], #1 # long
084 movl [RAX + #8 (8-bit)], narrowklass: precise compiler/c2/irTests/scalarReplacement/AllocationMergesTests$Point: 0x00007f19e81bddc0:Constant:exact * # compressed klass ptr
08b movl [RAX + #12 (8-bit)], R12 # int (R12_heapbase==0)
08f movq [RAX + #16 (8-bit)], R12 # long (R12_heapbase==0)
093 B4: # out( B16 B5 ) <- in( B14 B3 ) Freq: 0.9
093
093 MEMBAR-storestore (empty encoding)
093 movq RBP, RAX # spill
096 # checkcastPP of RBP
096 movq RSI, RBP # spill
099 movl RDX, [rsp + #12] # spill
09d movl RCX, [rsp + #8] # spill
nop # 2 bytes pad for loops and calls
0a3 call,static compiler.c2.irTests.scalarReplacement.AllocationMergesTests$Point::<init>
# compiler.c2.irTests.scalarReplacement.AllocationMergesTests::testNoEscapeWithLoadInLoop @ bci:24 (line 949) L[0]=rsp + #0 L[1]=rsp + #16 L[2]=rsp + #8 L[3]=rsp + #12 L[4]=#ScObj0 L[5]=#0 L[6]=_ STK[0]=RBP
# ScObj0 compiler/c2/irTests/scalarReplacement/AllocationMergesTests$Point={ [x :0]=rsp + #8, [y :1]=rsp + #12 }
# compiler.c2.irTests.scalarReplacement.AllocationMergesTests::testNoEscapeWithLoadInLoop_C2 @ bci:4 (line 962) L[0]=rsp + #0 L[1]=rsp + #16 L[2]=rsp + #8 L[3]=rsp + #12
# OopMap {rbp=Oop [0]=Oop off=168/0xa8}
0b0 B5: # out( B6 ) <- in( B4 ) Freq: 0.899982
# Block is sole successor of call
0b0 movl R10, [RBP + #16 (8-bit)] # int ! Field: compiler/c2/irTests/scalarReplacement/AllocationMergesTests$Point.y
0b4 movl RAX, [RBP + #12 (8-bit)] # int ! Field: compiler/c2/irTests/scalarReplacement/AllocationMergesTests$Point.x
0b7 B6: # out( B8 ) <- in( B5 B12 ) Freq: 0.999982
0b7 leal R11, [RAX + R10]
0bb leal R8, [R11 + #3342]
0c2 movl R9, #3343 # int
0c8 jmp,s B8
nop # 6 bytes pad for loops and calls
0d0 B7: # out( B8 ) <- in( B8 ) top-of-loop Freq: 903.513
0d0 movl R9, RCX # spill
0d3 B8: # out( B7 B9 ) <- in( B6 B7 ) Loop( B8-B7 inner main of N10) Freq: 904.513
0d3 leal RBX, [R9 + R11]
0d7 addl R8, RBX # int
0da addl R8, RBX # int
0dd addl R8, RBX # int
0e0 addl R8, RBX # int
0e3 addl R8, RBX # int
0e6 addl R8, RBX # int
0e9 addl R8, RBX # int
0ec addl R8, RBX # int
0ef addl R8, RBX # int
0f2 addl R8, RBX # int
0f5 addl R8, RBX # int
0f8 addl R8, RBX # int
0fb addl R8, RBX # int
0fe addl R8, RBX # int
101 addl R8, RBX # int
104 addl R8, RBX # int
107 addl R8, RBX # int
10a addl R8, RBX # int
10d addl R8, RBX # int
110 addl R8, RBX # int
113 addl R8, RBX # int
116 addl R8, RBX # int
119 addl R8, RBX # int
11c addl R8, RBX # int
11f addl R8, RBX # int
122 addl R8, RBX # int
125 addl R8, RBX # int
128 addl R8, RBX # int
12b addl R8, RBX # int
12e addl R8, RBX # int
131 addl R8, RBX # int
134 addl R8, RBX # int
137 addl R8, #496 # int
13e leal RCX, [R9 + #32]
142 cmpl RCX, #4207
148 jl,s B7 # loop end P=0.998894 C=20781.000000
14a B9: # out( B10 ) <- in( B8 ) Freq: 0.999982
14a # castII of R9
14a addl R9, #32 # int
14e B10: # out( B10 B11 ) <- in( B9 B10 ) Loop( B10-B10 inner post of N309) Freq: 1.99996
14e leal RCX, [R11 + R9]
152 addl R8, RCX # int
155 incl R9 # int
nop # 8 bytes pad for loops and calls
160 cmpl R9, #4234
167 jl,s B10 # loop end P=0.500000 C=20781.000000
169 B11: # out( N273 ) <- in( B10 ) Freq: 0.999982
169 addl RAX, R8 # int
16c addl RAX, R10 # int
16f addq rsp, 64 # Destroy frame
popq rbp
cmpq rsp, poll_offset[r15_thread]
ja #safepoint_stub # Safepoint: poll for GC
181 ret
182 B12: # out( B6 ) <- in( B1 ) Freq: 0.1
182 movl R10, R8 # spill
185 movl RAX, RCX # spill
187 jmp B6
18c B13: # out( B15 B14 ) <- in( B2 ) Freq: 9.00149e-05
18c movq RSI, precise compiler/c2/irTests/scalarReplacement/AllocationMergesTests$Point: 0x00007f19e81bddc0:Constant:exact * # ptr
196 movq RBP, [rsp + #0] # spill
nop # 1 bytes pad for loops and calls
19b call,static wrapper for: _new_instance_Java
# compiler.c2.irTests.scalarReplacement.AllocationMergesTests::testNoEscapeWithLoadInLoop @ bci:18 (line 949) L[0]=RBP L[1]=rsp + #16 L[2]=rsp + #8 L[3]=rsp + #12 L[4]=#ScObj0 L[5]=#0 L[6]=_
# ScObj0 compiler/c2/irTests/scalarReplacement/AllocationMergesTests$Point={ [x :0]=rsp + #8, [y :1]=rsp + #12 }
# compiler.c2.irTests.scalarReplacement.AllocationMergesTests::testNoEscapeWithLoadInLoop_C2 @ bci:4 (line 962) L[0]=RBP L[1]=rsp + #16 L[2]=rsp + #8 L[3]=rsp + #12
# OopMap {rbp=Oop [0]=Oop off=416/0x1a0}
1a8 B14: # out( B4 ) <- in( B13 ) Freq: 9.00131e-05
# Block is sole successor of call
1a8 jmp B4
1ad B15: # out( B17 ) <- in( B13 ) Freq: 9.00149e-10
1ad # exception oop is in rax; no code emitted
1ad movq RSI, RAX # spill
1b0 jmp,s B17
1b2 B16: # out( B17 ) <- in( B4 ) Freq: 9e-06
1b2 # exception oop is in rax; no code emitted
1b2 movq RSI, RAX # spill
1b5 B17: # out( N273 ) <- in( B16 B15 ) Freq: 9.0009e-06
1b5 addq rsp, 64 # Destroy frame
popq rbp
1ba jmp rethrow_stub
--------------------------------------------------------------------------------
STDERR:
Command Line:
/scratch/chagedor/jdk/open/jdk-22/fastdebug/bin/java -DReproduce=true -cp /scratch/chagedor/jdk/open/JTwork/classes/compiler/c2/irTests/scalarReplacement/AllocationMergesTests.d:/scratch/chagedor/jdk/open/test/hotspot/jtreg/compiler/c2/irTests/scalarReplacement:/scratch/chagedor/jdk/open/JTwork/classes/test/lib:/scratch/chagedor/jdk/open/JTwork/classes:/home/chagedor/jtreg/lib/javatest.jar:/home/chagedor/jtreg/lib/jtreg.jar:/home/chagedor/jtreg/lib/junit-platform-console-standalone-1.9.2.jar:/home/chagedor/jtreg/lib/testng-7.3.0.jar:/home/chagedor/jtreg/lib/jcommander-1.78.jar:/home/chagedor/jtreg/lib/guice-4.2.3.jar -Djava.library.path=. -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI -DWarmup=2000 -XX:+CreateCoredumpOnCrash -ea -esa -XX:+UnlockExperimentalVMOptions -server -XX:-TieredCompilation -Dir.framework.server.port=37699 -XX:+UnlockDiagnosticVMOptions -XX:+ReduceAllocationMerges -XX:+TraceReduceAllocationMerges -XX:+DeoptimizeALot -XX:CompileCommand=exclude,*::dummy* -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation -XX:CompilerDirectivesFile=test-vm-compile-commands-pid-13910.log -XX:CompilerDirectivesLimit=421 -XX:-OmitStackTraceInFastThrow -DShouldDoIRVerification=true -XX:-BackgroundCompilation -XX:CompileCommand=quiet compiler.lib.ir_framework.test.TestVM compiler.c2.irTests.scalarReplacement.AllocationMergesTests
One or more @IR rules failed:
Failed IR Rules (1) of Methods (1)
----------------------------------
1) Method "int compiler.c2.irTests.scalarReplacement.AllocationMergesTests.testNoEscapeWithLoadInLoop_C2(boolean,int,int)" - [Failed IR rules: 1]:
* @IR rule 1: "@compiler.lib.ir_framework.IR(applyIfCPUFeatureAnd={}, phase={DEFAULT}, applyIf={}, applyIfCPUFeatureOr={}, applyIfCPUFeature={}, counts={}, applyIfAnd={}, failOn={"_#ALLOC#_"}, applyIfOr={}, applyIfNot={})"
> Phase "PrintOptoAssembly":
- failOn: Graph contains forbidden nodes:
* Constraint 1: "(.*precise .*\R((.*(?i:mov|mv|xorl|nop|spill).*|\s*)\R)*.*(?i:call,static).*wrapper for: _new_instance_Java)"
- Matched forbidden node:
* 18c movq RSI, precise compiler/c2/irTests/scalarReplacement/AllocationMergesTests$Point: 0x00007f19e81bddc0:Constant:exact * # ptr
196 movq RBP, [rsp + #0] # spill
nop # 1 bytes pad for loops and calls
19b call,static wrapper for: _new_instance_Java
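To see why the failOn constraint fires, the check can be reproduced outside the IR framework. The following is only a minimal sketch: the class name AllocRegexDemo, its method, and its constants are hypothetical. Only the regex (Constraint 1) and the matched lines are copied from the output above. It applies the regex to the slow-path allocation in block B13 and to two unrelated lines from block B6.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone helper; not part of the IR framework.
public class AllocRegexDemo {
    // Constraint 1 of the failed @IR rule, copied from the report.
    static final Pattern FORBIDDEN_ALLOC = Pattern.compile(
        "(.*precise .*\\R((.*(?i:mov|mv|xorl|nop|spill).*|\\s*)\\R)*"
        + ".*(?i:call,static).*wrapper for: _new_instance_Java)");

    // The lines reported as "Matched forbidden node" (block B13).
    static final String SLOW_PATH =
        "18c movq RSI, precise compiler/c2/irTests/scalarReplacement/"
        + "AllocationMergesTests$Point: 0x00007f19e81bddc0:Constant:exact * # ptr\n"
        + "196 movq RBP, [rsp + #0] # spill\n"
        + "nop # 1 bytes pad for loops and calls\n"
        + "19b call,static wrapper for: _new_instance_Java\n";

    // Two lines from block B6 that contain no allocation.
    static final String NO_ALLOC =
        "0b7 leal R11, [RAX + R10]\n"
        + "0bb leal R8, [R11 + #3342]\n";

    // Returns true if the text contains a klass constant load followed,
    // across only mov/nop/spill-like lines, by a call to _new_instance_Java.
    static boolean containsForbiddenAlloc(String optoAssembly) {
        return FORBIDDEN_ALLOC.matcher(optoAssembly).find();
    }

    public static void main(String[] args) {
        System.out.println(containsForbiddenAlloc(SLOW_PATH)); // true
        System.out.println(containsForbiddenAlloc(NO_ALLOC));  // false
    }
}
```

The match is on the slow-path call to _new_instance_Java, so the rule reports the allocation of Point that remained in the compiled code of testNoEscapeWithLoadInLoop_C2.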