JDK-8200288 : Compilation fails with "assert(!(is_cti(prev) && is_cti(insn))) failed: CTI-CTI not allowed" on SPARC
  • Type: Bug
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 10,11
  • Priority: P2
  • Status: Closed
  • Resolution: Fixed
  • CPU: sparc
  • Submitted: 2018-03-27
  • Updated: 2019-09-17
  • Resolved: 2018-05-30
JDK 11
11 b17 Fixed
Description
Running java -Xcomp -version on a SPARC64 VII+ crashes with:

#  Internal Error (/opt/mach5/mesos/work_dir/dad84d60-43ed-43c1-9bcf-edc21fbc9d77/workspace/open/src/hotspot/cpu/sparc/assembler_sparc.cpp:52), pid=14181, tid=15
#  assert(!(is_cti(prev) && is_cti(insn))) failed: CTI-CTI not allowed.

Current CompileTask:
C1:    322    1    b  3       java.lang.invoke.MethodHandle::<clinit> (30 bytes)

Stack: [0xffffffff77600000,0xffffffff77700000],  sp=0xffffffff776fe290,  free space=1016k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x1de4ae0]  void VMError::report_and_die(int,const char*,const char*,void*,Thread*,unsigned char*,void*,void*,const char*,int,unsigned long)+0x940
V  [libjvm.so+0x1de412c]  void VMError::report_and_die(Thread*,const char*,int,const char*,const char*,void*)+0x3c
V  [libjvm.so+0xd6d248]  void report_vm_error(const char*,int,const char*,const char*,...)+0x78
V  [libjvm.so+0x8b6a10]  void Assembler::validate_no_pipeline_hazards()+0x340
V  [libjvm.so+0x93fe20]  void Compilation::emit_code_epilog(LIR_Assembler*)+0x110
V  [libjvm.so+0x940088]  int Compilation::emit_code_body()+0x158
V  [libjvm.so+0x940848]  int Compilation::compile_java_method()+0x6d8
V  [libjvm.so+0x940d84]  void Compilation::compile_method()+0x284
V  [libjvm.so+0x9420f0]  Compilation::Compilation #Nvariant 1(AbstractCompiler*,ciEnv*,ciMethod*,int,BufferBlob*,DirectiveSet*)+0x390
V  [libjvm.so+0x94672c]  void Compiler::compile_method(ciEnv*,ciMethod*,int,DirectiveSet*)+0x22c
V  [libjvm.so+0xcde138]  void CompileBroker::invoke_compiler_on_method(CompileTask*)+0x708
V  [libjvm.so+0xcdcf0c]  void CompileBroker::compiler_thread_loop()+0x2ec
V  [libjvm.so+0x1cfe1b4]  void JavaThread::thread_main_inner()+0x2e4
V  [libjvm.so+0x1cfdea4]  void JavaThread::run()+0x374
V  [libjvm.so+0x19b7854]  thread_native_entry+0x2e4

Comments
Then I would suggest suppressing the simplistic pipeline validation on C1 (which does emit some data into the instruction stream) but keeping it on C2 (which emits only limited amounts of data into the instruction stream). However, a simple linear code sequence validation scheme like this will never be 100% accurate, since it neither knows about nor accounts for non-instructions in the instruction stream.

-----8<-----
--- a/src/hotspot/cpu/sparc/assembler_sparc.hpp
+++ b/src/hotspot/cpu/sparc/assembler_sparc.hpp
@@ -783,8 +783,10 @@
   void flush() {
 #ifdef VALIDATE_PIPELINE
     assert(_delay_state == NoDelay, "Ending code with a delay-slot.");
+#ifdef COMPILER2
     validate_no_pipeline_hazards();
 #endif
+#endif
     AbstractAssembler::flush();
   }
25-05-2018
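
The limitation described in the comment above can be illustrated with a small sketch. It is not HotSpot's `validate_no_pipeline_hazards()`: `looks_like_cti` and `first_cti_cti` are hypothetical names, the classifier only looks at the `op` and `op2` fields (it ignores op=10 CTIs such as JMPL), and the two example words are hand-derived from the `bn,pn` / `call` pair quoted elsewhere in this report, assuming the standard SPARC V9 encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for a CTI test (assumption: NOT HotSpot's is_cti()).
// op = bits 31:30; for op=00, op2 = bits 24:22 selects the format-2 kind.
constexpr bool looks_like_cti(uint32_t w) {
  return (w >> 30) == 1                        // op=01: CALL
      || ((w >> 30) == 0 &&                    // op=00: branches (not SETHI/ILLTRAP)
          (((w >> 22) & 0x7) == 1 ||           // BPcc
           ((w >> 22) & 0x7) == 2 ||           // Bicc
           ((w >> 22) & 0x7) == 3 ||           // BPr
           ((w >> 22) & 0x7) == 5 ||           // FBPfcc
           ((w >> 22) & 0x7) == 6));           // FBfcc
}

// Linear scan over raw words: returns the index of the second word of the
// first adjacent pair that both look like CTIs, or -1 if there is none.
// The scan cannot tell whether a word is really an instruction or data.
inline std::ptrdiff_t first_cti_cti(const std::vector<uint32_t>& words) {
  for (std::size_t i = 1; i < words.size(); ++i) {
    if (looks_like_cti(words[i - 1]) && looks_like_cti(words[i])) {
      return static_cast<std::ptrdiff_t>(i);
    }
  }
  return -1;
}

// Illustrative check: a data word whose top byte is 0x00, followed by a call.
inline void self_check() {
  const std::vector<uint32_t> stream = {0x00402020u, 0x40024940u};
  assert(first_cti_cti(stream) == 1);  // data word is mistaken for a branch
}
```

Because the scan sees only bit patterns, any data word with op=00 and a branch-like `op2` immediately before a real CTI will trip it, which is the false positive discussed in this thread.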

It's perhaps too strict and too naïve an assert, in debug builds, which neither tests for annulled instructions nor has any knowledge of the actual control flow. On the other hand, I cannot see that we make any obscure use of the 'annul' property (the general no-CTI-in-a-delay-slot rule is covered by the "delay-state-tracker"). I should perhaps simply remove the assert.
23-04-2018

And it looks like there could be other places where the assert could mistake a sequence for CTI-CTI.
20-04-2018

Interesting. So the assert mistakenly treats this sequence as a CTI. I am more concerned about emitting a non-zero value than about removing the CTI assert, which only checks for performance regressions and not correctness.
20-04-2018

Should we consider this a bit too much of a hack... or just about acceptable?

-----8<-----
diff --git a/src/hotspot/cpu/sparc/c1_CodeStubs_sparc.cpp b/src/hotspot/cpu/sparc/c1_CodeStubs_sparc.cpp
--- a/src/hotspot/cpu/sparc/c1_CodeStubs_sparc.cpp
+++ b/src/hotspot/cpu/sparc/c1_CodeStubs_sparc.cpp
@@ -370,8 +370,19 @@
   // emit the offsets needed to find the code to patch
   int being_initialized_entry_offset = __ offset() - being_initialized_entry + sizeof_patch_record;

-  // Emit the patch record. We need to emit a full word, so emit an extra empty byte
-  __ emit_int8(0);
+  // Emit the patch record. We need to emit a full word, so we emit an extra
+  // byte (which is the most significant byte in what is effectively an insn
+  // in the current instruction stream). For this reason, we will generate a
+  // byte with the two most significant bits set, thus making sure that the
+  // "instruction" generated will never be (mistakenly) interpreted as a
+  // branch or call instruction (i.e. CTI).
+  //
+  // op=00 : SETHI, Branches, ILLTRAP
+  // op=01 : CALL
+  // op=10 : Arithmetic, Logical, Moves, Tcc, Loads, Stores, Prefetch, Misc
+  // op=11 : Arithmetic, Logical, Moves, Tcc, Loads, Stores, Prefetch, Misc
+
+  __ emit_int8(0xc0); // op=0b11 to make sure "instruction" is not a CTI.
   __ emit_int8(being_initialized_entry_offset);
   __ emit_int8(bytes_to_skip);
   __ emit_int8(_bytes_to_copy);
20-04-2018
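
To make the effect of the proposed first byte concrete, here is a minimal sketch of how the four patch-record bytes form one big-endian word and which `op` field a linear validator would then see. `pack_patch_record` and `op_field` are hypothetical helpers, not HotSpot functions, and the byte values 0x40, 0x20 and 0x20 are illustrative assumptions rather than values taken from a real patch record.

```cpp
#include <cassert>
#include <cstdint>

// Pack the four emitted bytes into one word. SPARC is big-endian, so the
// first byte emitted becomes the most significant byte of the "instruction".
constexpr uint32_t pack_patch_record(uint8_t first, uint8_t entry_offset,
                                     uint8_t bytes_to_skip, uint8_t bytes_to_copy) {
  return (uint32_t{first} << 24) | (uint32_t{entry_offset} << 16) |
         (uint32_t{bytes_to_skip} << 8) | uint32_t{bytes_to_copy};
}

// The op field is the top two bits (31:30) of a SPARC instruction word.
constexpr uint32_t op_field(uint32_t word) { return word >> 30; }

// With a zero first byte the word falls into op=00 (SETHI, branches, ILLTRAP);
// with 0xc0 it falls into op=11, which holds neither branches nor CALL.
static_assert(op_field(pack_patch_record(0x00, 0x40, 0x20, 0x20)) == 0, "op=00");
static_assert(op_field(pack_patch_record(0xc0, 0x40, 0x20, 0x20)) == 3, "op=11");

// Runtime form of the same check, for use from a test or debugger session.
inline void self_check() {
  assert(pack_patch_record(0xc0, 0x40, 0x20, 0x20) == 0xc0402020u);
}
```

Only the top two bits of the first byte matter for this purpose; the remaining three bytes keep their original meaning.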

I believe the old T1 SPARC and SPARC64 machines will have a different prefetch pattern, and we are actually missing any such machines in the regular testing.

-----8<-----
0xffffffff680fcd1c: bn,pn %icc, 0xffffffff68104d9c
0xffffffff680fcd20: call 0xffffffff6818f220

UPDATE: I just had to take a closer look (since the above is actually not a proper (i)prefetch insn, in which case 'pn' should read 'pt'). It turns out that this is not an instruction at all, but a patch record. In fact, we don't emit any instruction prefetching at all, not even on the early Niagara chips.
20-04-2018

In short, I am fine if we skip CTI-CTI checks for C1 generated code.
17-04-2018

It complains about C1 generated code:

C1: 322 1 b 3 java.lang.invoke.MethodHandle::<clinit> (30 bytes)

We never optimized C1 for CTI because it is intermediate code which will be replaced by C2 generated code. Even the JDK-8144448 changes touched only the sparc.ad file, which is C2's code. But I am surprised that we did not hit this problem on Solaris SPARC. What is the difference? There is also the issue of testing on T1, where some instructions are not available and the code shape could be different. It would be nice to see the actual instructions in the code buffer for which the assert fired.
17-04-2018

Here's the bug report for the illegal load factor: https://bugs.openjdk.java.net/browse/JDK-8201616 I was unfortunately unable to bisect the problem though.
17-04-2018

Yes, the "Illegal load factor" issue should be a separate one.
16-04-2018

I am seeing this problem on linux-sparc as well. @Tobias: How do I disable HMM/P2 so I can test this?

Currently, on an UltraSparc T1 on Linux, I am getting:

glaubitz@stadler:/srv/openjdk/hs$ ./build/linux-sparcv9-normal-server-fastdebug/jdk/bin/java --version
# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc: SuppressErrorAt=/assembler_sparc.cpp:52
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (/srv/openjdk/hs/src/hotspot/cpu/sparc/assembler_sparc.cpp:52), pid=11095, tid=11120
# assert(!(is_cti(prev) && is_cti(insn))) failed: CTI-CTI not allowed.
#
# JRE version: OpenJDK Runtime Environment (11.0) (fastdebug build 11-internal+0-adhoc.glaubitz.hs)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 11-internal+0-adhoc.glaubitz.hs, mixed mode, tiered, compressed oops, g1 gc, linux-sparc)
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /srv/openjdk/hs/hs_err_pid11095.log
#
# Compiler replay data is saved as:
# /srv/openjdk/hs/replay_pid11095.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
Current thread is 11120
Dumping core ...
Aborted
glaubitz@stadler:/srv/openjdk/hs$

On the SPARC-T5, I get:

glaubitz@deb4g:/srv/glaubitz/hs$ ./build/linux-sparcv9-normal-server-fastdebug/jdk/bin/java --version
Error occurred during initialization of boot layer
java.lang.IllegalArgumentException: Illegal load factor: -1.2197928E-12
glaubitz@deb4g:/srv/glaubitz/hs$
16-04-2018

Ok! I guess I should file a separate bug report for the other crash on linux-sparc?
16-04-2018

There is also another problem on SPARC (see JDK-8200290). I'll take care of that.
27-03-2018

Patric, you've introduced this assert with JDK-8144448. Could you please have a look?
27-03-2018

ILW = Assert during compilation (regression in JDK 10), on old SPARC machines but easy to reproduce, disable compilation = HMM = P2
27-03-2018