JDK-8261552 : s390: MacroAssembler::encode_klass_not_null() may produce wrong results for non-zero values of narrow klass base
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 17
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • CPU: s390x
  • Submitted: 2021-02-11
  • Updated: 2021-03-08
  • Resolved: 2021-03-02
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 17
17 b12Fixed
Related Reports
Relates :  
Description
This is an old bug in the s390 template interpreter which has been there since the day the s390 port was contributed; it has been dormant however and was activated just recently by JDK-8250989 which changed the way class space reservation happens at CDS dump time. 

It surfaced first as crash in runtime/cds/appcds/cacheObject/DifferentHeapSizes.java .

-----

If Compressed class pointer base has a non-zero value it may cause MacroAssembler::encode_klass_not_null() to encode a Klass pointer to a wrong narrow pointer.

This can be reproduced by starting the VM with
```
-Xshare:dump -XX:HeapBaseMinAddress=2g -Xmx128m
```
but CDS is not involved. It is only relevant insofar as this is the only way to get the following combination:
- heap is allocated at 0x800_0000. It is small and ends at 0x8800_0000.
- class space follows at 0x8800_0000
- the narrow klass pointer base points to the start of the class space at 0x8800_0000.

It is not possible with an unmodified VM to get the same combination of values with Xshare=on or Xshare=off.

-----

In MacroAssembler::encode_klass_not_null(), there is the following section:

```
  if (base != NULL) {
    unsigned int base_h = ((unsigned long)base)>>32;
    unsigned int base_l = (unsigned int)((unsigned long)base);
    if ((base_h != 0) && (base_l == 0) && VM_Version::has_HighWordInstr()) {
      lgr_if_needed(dst, current);
      z_aih(dst, -((int)base_h));     // Base has no set bits in lower half.
A  } else if ((base_h == 0) && (base_l != 0)) {  
      lgr_if_needed(dst, current);                
B    z_agfi(dst, -(int)base_l);                   
    } else {
      load_const(Z_R0, base);
      lgr_if_needed(dst, current);
      z_sgr(dst, Z_R0);
    }
    current = dst;
  }
```

We enter the condition at (A) if the narrow klass pointer base is non-zero but fits into 32bit. At (B), we want to substract the base from the Klass pointer; we do this by calculating the 32bit twos-complement of the base and add it with AGFI. AGFI adds a 32bit immediate to a 64bit register. In this case, it produces the wrong result if the base is >0x800_0000:

In the case of the crash, we have:
base: 
 8800_0000
32bit two's complement of base:  
 7800_0000
klass pointer: 			 
 8804_1040
klass pointer+7800_0000:
 1_0004_1040

So the result of the "substraction" is 1_0004_1040, it should be 4_1040, which would be the correct offset of the Klass* pointer within the ccs. This wrong Klass* pointer then gets encoded and written into the Object header. VM crashes the first time the Klass pointer is decoded (which is pretty fast).


Comments
Changeset: fdd10932 Author: Thomas Stuefe <stuefe@openjdk.org> Date: 2021-03-02 04:30:26 +0000 URL: https://git.openjdk.java.net/jdk/commit/fdd10932
02-03-2021

When debugging the *write* of the broken narrow klass pointer, I see this: {{noformat}} A 0x000003fff12c3960: agfi %r3,2013265920 0x000003fff12c3966: srlg %r3,%r3,3 B 0x000003fff12c396c: st %r3,8(%r2) End of assembler dump. (gdb) p/x $r3 $5 = 0x88041040 (gdb) stepi 0x000003fff12c3966 in ?? () (gdb) p/x $r3 $6 = 0x100041040 {{noformat}} At (A), we subtract the narrow class pointer base (0x88000000) from the still correct Klass* pointer (0x88041040). The result is 0x100041040. This looks like agfi does 64bit extension where it should not. Should we use afi instead?
16-02-2021

CDS involvement incidental. We crash when creating the initial thread object. We crash in the template interpreter, when interpreting "return_initialize_finalizer" bytecode: {noformat} Program received signal SIGSEGV, Segmentation fault. 0x000003fff12cc6d8 in ?? () (gdb) disassemble 0x000003fff12cc6b0,0x000003fff12cc6e0 Dump of assembler code from 0x000003fff12cc6b0 to 0x3fff12cc6e0: 0x000003fff12cc6b0: aghi %r7,-8 A 0x000003fff12cc6b4: lg %r3,0(%r12) # TemplateTable::_return (return_register_finalizer) 0x000003fff12cc6ba: llgf %r6,8(%r3) # MA::load_klass B 0x000003fff12cc6c0: sllg %r6,%r6,3 # MA::decode_klass_not_null B 0x000003fff12cc6c6: algfi %r6,2281701376 # MA::decode_klass_not_null C 0x000003fff12cc6cc: tmll %r6,7 # MA::decode_klass_not_null 0x000003fff12cc6d0: je 0x3fff12cc6d8 # MA::decode_klass_not_null 0x000003fff12cc6d4: .long 0x00d100d1 => 0x000003fff12cc6d8: tm 172(%r6),64 # TemplateTable::_return (return_register_finalizer) 0x000003fff12cc6dc: je 0x3fff12cc870 (gdb) info registers r3 0x87f07da0 r6 0x188041040 r12 0x3fffbc7df28 pc 0x3fff12cc6d8 cc 0x0 0 {noformat} The heap goes from 0x80000000..0x88000000. CCS starts at 0x88000000. The crash address is 0x1_8804_1040. This is unmapped memory. However, the lower 32bit would point into committed section of CCS, at the start of a valid Klass*. This pattern is consistent if I change CCS location by specifying a different HeapBaseMinAddress. So it seems the decoded Klass pointer 0x1_8804_1000 is wrong, has garbage in the upper 32bit. The disassembly matches against the s390 version of {{MacroAssembler::decode_klass_not_null}}: https://github.com/openjdk/jdk/blob/34ae7aeb64db90dcb4d2f3d4acb16c714a32824f/src/hotspot/cpu/s390/macroAssembler_s390.cpp#L3693-L3704 followed by an access (tm) where the code tries to dereference the klass pointer to check the klass accessflag bit for "has_finalizer": https://github.com/openjdk/jdk/blob/34ae7aeb64db90dcb4d2f3d4acb16c714a32824f/src/hotspot/cpu/s390/templateTable_s390.cpp#L2314 The decoded klass pointer is in r6. r3 still contains the object address the pointer has been loaded from. The address is a valid heap address. Mark word looks valid. Header at this location: {{noformat}} (gdb) x/8x $r3 0x87f07da0: 0x00000000 0x00000001 0x20008208 0x00000000 0x87f07db0: 0xbaadbabe 0xbaadbabe 0xbaadbabe 0xbaadbabe {{noformat}} So the encoded klass pointer in the object header is 0x20008208. Manually decoding this gives me the same wrong decoded Klass pointer value of 0x1_8804_1040. So, the decoding code seems correct, and the encoded narrow Klass pointer is already wrong. The correct value would have been just 0x8208.
16-02-2021

The CDS involvement is incidental. The crash happens when: - class space is allocated in the lower 32bit - we dont load the initial objects from CDS archive - the CompressedClassPointer base is not NULL normally, all this is only true if Xshare=dump. With Xshare=off, we set the CompressedClassPointer base to zero and the bug disappears. If I modify CompressedClassPointer::initialize to always give us a non-zero base, regardless of CDS, then the error also happens with Xshare=off. This also explains why JDK-8250989 triggers this bug: since the CDS initialization changes, class space is allocated independently from CDS when Xshare=dump. Before, it was allocated as part of the shared cds+ccs allocation.
16-02-2021

Maybe it has something to do with the changes in metaspace.cpp? I don't have access to s390 so I can't test it myself. The crash happens very early during VM bootstrap. Very little CDS initialization were done (except for MetaspaceShared::initialize_for_static_dump()). Maybe the crash is unrelated to CDS? Can you try running: java -Xshare:off -XX:HeapBaseMinAddress=0x80000000 -Xmx128m -version
11-02-2021