JDK-8308766 : TLAB initialization may cause div by zero
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 8,11,17,20,21
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2023-05-24
  • Updated: 2024-12-12
  • Resolved: 2023-06-01
  • JDK 17: 17.0.15-oracle (Fixed)
  • JDK 21: 21 b26 (Fixed)
Description
TLAB initialization (in ThreadLocalAllocBuffer::initialize()) samples how much of the TLAB capacity has actually been filled, for statistics and resizing purposes:

  size_t capacity = Universe::heap()->tlab_capacity(thread()) / HeapWordSize;
  
  // Keep alloc_frac as float and not double to avoid the double to float conversion
  float alloc_frac = desired_size() * target_refills() / (float) capacity;

That capacity can be zero (e.g. if there is no space left for allocation), which makes the division above a floating-point division by zero.
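
To illustrate what the unguarded expression computes when capacity is zero, here is a minimal standalone sketch. It is not HotSpot code: the helper name `unguarded_alloc_fraction` is hypothetical, and the sketch assumes IEEE 754 floats with the default (non-trapping) floating-point environment. Under those assumptions the division does not fault at that point; it produces infinity (or NaN when the numerator is also zero).

```cpp
#include <cmath>
#include <cstddef>
#include <limits>

// Assumption of this sketch: float follows IEEE 754 (IEC 559).
static_assert(std::numeric_limits<float>::is_iec559,
              "sketch assumes IEEE 754 float semantics");

// Hypothetical helper mirroring the expression from the description,
// with plain parameters instead of the HotSpot accessors.
float unguarded_alloc_fraction(std::size_t desired_size,
                               std::size_t target_refills,
                               std::size_t capacity) {
  // volatile keeps the compiler from folding the division at compile time.
  volatile float divisor = static_cast<float>(capacity);
  return desired_size * target_refills / divisor;
}
```

With `capacity == 0`, `std::isinf(unguarded_alloc_fraction(1024, 50, 0))` holds, and `std::isnan(unguarded_alloc_fraction(0, 50, 0))` holds when the numerator is zero as well.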

Adding an assert that checks capacity > 0 here makes the VM crash during shutdown (observed only on OSX, for whatever reason):

Stack: [0x000000016d454000,0x000000016d657000],  sp=0x000000016d656d40,  free space=2059k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.dylib+0x1374050]  VMError::report_and_die(int, char const*, char const*, char*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x4bc  (threadLocalAllocBuffer.cpp:216)
V  [libjvm.dylib+0x13749ec]  VMError::report_and_die(Thread*, void*, char const*, int, char const*, char const*, char*)+0x40
V  [libjvm.dylib+0x6aa82c]  report_vm_error(char const*, int, char const*, char const*, ...)+0x6c
V  [libjvm.dylib+0x12a8754]  ThreadLocalAllocBuffer::initialize()+0x7c
V  [libjvm.dylib+0xb11194]  attach_current_thread(JavaVM_*, void**, void*, bool)+0x1d0
V  [libjvm.dylib+0xb10e0c]  jni_DestroyJavaVM+0x4c
C  [libjli.dylib+0x745c]  JavaMain+0xc3c
C  [libjli.dylib+0x94cc]  ThreadJavaMain+0xc
C  [libsystem_pthread.dylib+0x7240]  _pthread_start+0x94

This seems to be timing dependent: we have only ever noticed an issue on OSX with G1 (probably JDK-8264798), where the FPE_FLTDIV appears to be delivered lazily/late.

The suggested fix is to just not sample in this case.
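
The idea of skipping the sample can be sketched as follows. This is a standalone illustration under stated assumptions, not the actual change from the pull request: `sample_alloc_fraction` is a hypothetical helper that takes plain values instead of calling desired_size(), target_refills() and tlab_capacity(), and it signals "no sample" with an empty std::optional (C++17).

```cpp
#include <cstddef>
#include <optional>

// Hypothetical helper, not HotSpot code: returns the allocation fraction,
// or std::nullopt when capacity is zero and the sample should be skipped.
std::optional<float> sample_alloc_fraction(std::size_t desired_size,
                                           std::size_t target_refills,
                                           std::size_t capacity) {
  if (capacity == 0) {
    // Nothing meaningful to sample; also avoids dividing by zero.
    return std::nullopt;
  }
  // Same arithmetic as the snippet in the description; the result is kept
  // as float to avoid a double-to-float conversion.
  return desired_size * target_refills / static_cast<float>(capacity);
}
```

A caller would feed the value into its statistics only when the optional is non-empty, so a zero capacity simply contributes no sample.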
Comments
A pull request was submitted for review. URL: https://git.openjdk.org/jdk17u-dev/pull/1536 Date: 2023-07-04 12:00:33 +0000
04-07-2023

Fix Request (17u) Fixes a corner case in G1 that can result in rare SIGFPE. Applies cleanly.
04-07-2023

Changeset: 96ed1392 Author: Thomas Schatzl <tschatzl@openjdk.org> Date: 2023-06-01 06:57:45 +0000 URL: https://git.openjdk.org/jdk/commit/96ed1392d1c5062063b1f8b5f1bd30d2d17ce3fe
01-06-2023

Added affects version back to JDK 8, since that code and the `tlab_capacity()` implementation are the same there as they are now. Other circumstances may prevent this from happening, though.
25-05-2023

A pull request was submitted for review. URL: https://git.openjdk.org/jdk/pull/14121 Date: 2023-05-24 11:50:02 +0000
24-05-2023