JDK-8222838 : Shenandoah: SEGV on accessing cset bitmap for NULL ptr
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 8-shenandoah,11-shenandoah,12,13
  • Priority: P2
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2019-04-23
  • Updated: 2019-07-31
  • Resolved: 2019-04-24
Fix Versions
  • JDK 12: 12.0.2 (Fixed)
  • JDK 13: 13 b18 (Fixed)
Description
This seems to happen more or less reliably with:

CONF=linux-x86_64-server-fastdebug make images run-test TEST=gc/stress/gcbasher/TestGCBasherWithShenandoah.java TEST_VM_OPTS="-XX:-UseCompressedOops"

The full hs_err for such a failure is attached. The brief analysis of the failure:

siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007f783c1ad3a0


 ;; B9: #	B970 B10 <- B8  Freq: 0.500054

  0x00007f783bc1950d: mov    0x18(%rax,%r12,8),%r12
  0x00007f783bc19512: testb  $0x1,0x20(%r15)
  0x00007f783bc19517: jne    0x00007f783bc1d422  <--- enter LRB midpath

...

 ;; B970: #	B10 B971 <- B9  Freq: 0.000500047

  0x00007f783bc1d422: mov    %r12,%r10    <--- LRB midpath starts
  0x00007f783bc1d425: shr    $0x13,%r10
  0x00007f783bc1d429: movabs $0x7f783c1ad3a0,%r11 <--- biased in-cset bitmap
  0x00007f783bc1d433: cmpb   $0x0,(%r11,%r10,1)  <--- in-cset check, SEGV here
  0x00007f783bc1d438: je     0x00007f783bc1951d

R10=0x0 is NULL
R11=0x00007f783c1ad3a0 is an unknown value

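So, this is a SEGV in the LRB midpath, caused by checking the in-cset bit for a NULL pointer: the shifted reference in R10 is zero, so the load goes to the biased bitmap base in R11, which is exactly the faulting si_addr. This means the LRB path is missing a null check.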

I think we missed this during initial LRB work, see the disassembly here:
  https://mail.openjdk.java.net/pipermail/shenandoah-dev/2019-March/008995.html

There are implicit null checks when accessing object fields off the affected reference, but the in-cset check is not covered by them. Most of the time we are lucky and in-cset(NULL) points to readable memory: in that case, whatever the in-cset check returns, the NULL reference is caught by the next implicit null check (NPE). That holds until we are unlucky: in-cset(NULL) points to unmapped memory and the check itself SEGVs.
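
To make the failure mode concrete, here is a minimal sketch of a biased in-cset lookup. It is an illustrative model, not HotSpot code: `CsetModel`, `Lookup`, and all member names are hypothetical, and a `Fault` result stands in for the load that would touch memory outside the table. The region shift of 19 used in the usage note matches the `shr $0x13` in the disassembly; the heap base there is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a biased in-cset table (not the HotSpot class).
// Slot i of the backing store describes heap region (first_region + i);
// the real code biases the base pointer so that (addr >> shift) indexes
// it directly, which this model expresses with explicit index arithmetic.
enum class Lookup { NotInCset, InCset, Fault };

struct CsetModel {
  unsigned shift;              // log2 of the region size
  std::uint64_t first_region;  // heap_base >> shift
  std::vector<bool> in_cset;   // one flag per heap region
  bool null_slot_readable;     // assumption: table laid out to accept NULL

  CsetModel(std::uint64_t heap_base, std::size_t regions, unsigned s,
            bool null_ok)
      : shift(s), first_region(heap_base >> s), in_cset(regions, false),
        null_slot_readable(null_ok) {
    assert(s < 64);  // shifting by 64 or more would be undefined
  }

  // Unchecked lookup, shaped like the faulting midpath. Returns Fault where
  // the real load would land outside the table (possibly unmapped memory).
  Lookup lookup_unchecked(std::uint64_t addr) const {
    std::uint64_t idx = addr >> shift;
    if (idx >= first_region && idx - first_region < in_cset.size()) {
      return in_cset[idx - first_region] ? Lookup::InCset : Lookup::NotInCset;
    }
    if (idx == 0 && null_slot_readable) {
      return Lookup::NotInCset;  // NULL is never in the collection set
    }
    return Lookup::Fault;
  }

  // Guarded lookup: an explicit null check in front of the in-cset check.
  Lookup lookup_null_checked(std::uint64_t addr) const {
    if (addr == 0) return Lookup::NotInCset;
    return lookup_unchecked(addr);
  }
};
```

With a heap base of `0x700000000` and a shift of 19, `lookup_unchecked(0)` yields `Fault` when the NULL slot is not readable, while `lookup_null_checked(0)` yields `NotInCset`. Constructing the model with `null_ok = true` makes the unchecked lookup safe for NULL as well, which is the general shape of making the table itself tolerate NULL instead of adding a check in compiled code.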
Comments
Fix Request (12u) This fix resolves a corner case in Shenandoah that leads to a JVM crash. The patch applies cleanly to 12u and passes hotspot_gc_shenandoah.
25-04-2019

The saner alternative is to allocate the cset bitmap so that it can accept checks for NULL ptrs. This would not require changes in the compilers: http://cr.openjdk.java.net/~shade/8222838/webrev.00/
23-04-2019