JDK-8332506 : SIGFPE In ObjectSynchronizer::is_async_deflation_needed()
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 17,21
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2024-05-15
  • Updated: 2025-05-13
  • Resolved: 2025-01-08
The Version table provides details related to the release in which this issue/RFE will be addressed.

Unresolved: Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed: Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

JDK 21: 21.0.8 (Fixed)
JDK 23: 23 (Resolved)
JDK 24: 24 (Fixed)
JDK 25: 25 b05 (Fixed)
Description
ADDITIONAL SYSTEM INFORMATION :
Host: Intel(R) Xeon(R) Silver 4215R CPU @ 3.20GHz, 4 cores, 15G, Red Hat Enterprise Linux release 8.8 (Ootpa)
JRE version: OpenJDK Runtime Environment (Red_Hat-21.0.1.0.12-2) (21.0.1+12) (build 21.0.1+12-LTS)
Java VM: OpenJDK 64-Bit Server VM (Red_Hat-21.0.1.0.12-2) (21.0.1+12-LTS, mixed mode, sharing, tiered, compressed class ptrs, z gc, linux-amd64)

Note this is a VMware VM.

A DESCRIPTION OF THE PROBLEM :
The JVM crashed, seemingly at random, with a SIGFPE error.

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGFPE (0x8) at pc=0x00007fd1ebb1f331, pid=802111, tid=802127
#
# JRE version: OpenJDK Runtime Environment (Red_Hat-21.0.1.0.12-2) (21.0.1+12) (build 21.0.1+12-LTS)
# Java VM: OpenJDK 64-Bit Server VM (Red_Hat-21.0.1.0.12-2) (21.0.1+12-LTS, mixed mode, sharing, tiered, compressed class ptrs, z gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xf46331]  ObjectSynchronizer::is_async_deflation_needed()+0x1e1




Comments
[jdk21u-fix-request] Approval Request from Roman Marchenko: Almost clean backport to 21u. PR checks (tier1) are OK. This fixes the division-by-zero problem in JDK 21, as there are cases where the problem occurs in 21.
02-04-2025

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk21u-dev/pull/1566 Date: 2025-04-02 10:33:08 +0000
02-04-2025

There is a customer who has the issue in 21.0.6:
```
# SIGFPE (0x8) at pc=0x00007fa729e8b551, pid=1031338, tid=1031352
# Problematic frame:
# V [libjvm.so+0xe8b551] ObjectSynchronizer::is_async_deflation_needed()+0x1e1

Stack: [0x00007fa6f2af0000,0x00007fa6f2bf0000], sp=0x00007fa6f2beec80, free space=1019k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xe8b551] ObjectSynchronizer::is_async_deflation_needed()+0x1e1
V [libjvm.so+0xc0da98] MonitorDeflationThread::monitor_deflation_thread_entry(JavaThread*, JavaThread*)+0xc8
V [libjvm.so+0x91cfd0] JavaThread::thread_main_inner()+0x1e0
V [libjvm.so+0xedba38] Thread::call_run()+0xa8
V [libjvm.so+0xc66fea] thread_native_entry(Thread*)+0xda

siginfo: si_signo: 8 (SIGFPE), si_code: 1 (FPE_INTDIV), si_addr: 0x00007fa729e8b551
```
02-04-2025

A pull request was submitted for review. Branch: jdk24 URL: https://git.openjdk.org/jdk/pull/23000 Date: 2025-01-09 10:45:32 +0000
09-01-2025

Changeset: cbabc045 Branch: master Author: Fredrik Bredberg <fbredberg@openjdk.org> Date: 2025-01-08 09:50:35 +0000 URL: https://git.openjdk.org/jdk/commit/cbabc0451505a00dfe77c163190736460c53820f
08-01-2025

The ceiling calculation was last modified by JDK-8226416 AFAICS, though the potential for division-by-zero seems to pre-date that.
19-12-2024

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk/pull/22815 Date: 2024-12-18 15:33:43 +0000
18-12-2024

Thanks [~dholmes]! As I understand it, the reason the ceiling value currently increases even if monitor_usage is below MonitorUsedDeflationThreshold is that it was meant to act as a back-off when there were too many deflations without any progress; the thought was that it would lower any repeated async deflation request pressure. If you ask me, the design flaw is precisely that the ceiling value increases even when monitor_usage is below MonitorUsedDeflationThreshold; it should not.
16-12-2024
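As a rough illustration of the point in the comment above, here is a hedged C++ sketch. The function name, parameters, and doubling policy are hypothetical; this is neither the actual HotSpot code nor the fix that was eventually integrated. It only shows the idea: bump the ceiling as a back-off solely when monitor usage is at or above the threshold, and clamp rather than letting the unsigned value wrap.

```
#include <cinttypes>
#include <cstdint>
#include <cstdio>

// Hypothetical names and policy, for illustration only.
uint64_t maybe_bump_ceiling(uint64_t ceiling, uint64_t monitors_used,
                            uint64_t threshold_percent, bool no_progress) {
  if (ceiling == 0) {
    return 1;  // defensive: a zero ceiling must never reach a division
  }
  uint64_t usage_percent = (monitors_used * 100) / ceiling;
  if (no_progress && usage_percent >= threshold_percent) {
    uint64_t bumped = ceiling * 2;                  // back-off: raise the ceiling
    return bumped > ceiling ? bumped : UINT64_MAX;  // clamp instead of wrapping
  }
  return ceiling;  // usage below the threshold: leave the ceiling alone
}

int main() {
  // Usage below the threshold: the ceiling stays put even without progress.
  printf("%" PRIu64 "\n", maybe_bump_ceiling(1024, 100, 90, true));   // prints 1024
  // Usage at/above the threshold: the ceiling is bumped (doubled here).
  printf("%" PRIu64 "\n", maybe_bump_ceiling(1024, 1000, 90, true));  // prints 2048
  return 0;
}
```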

Great find [~fbredberg]! I don't think it was ever expected that the ceiling would increase without bound! A simple fix would just prevent the overflow, but there seems to be a design flaw here, as we should not keep increasing this way.
16-12-2024

Using objdump to disassemble the code for ObjectSynchronizer::is_async_deflation_needed() on x86, we find that there is only one "div" instruction. Using addr2line, we find the corresponding source line near the end of the monitors_used_above_threshold() function, which seems to have been inlined into is_async_deflation_needed(). The problematic line looks like this:

size_t monitor_usage = (monitors_used * 100LL) / ceiling;

The problem seems to be that the value of "ceiling" has somehow become zero, which is causing the division-by-zero error. A few lines above that code we see how a new, increased ceiling value is created if there have been too many deflations without progress. This will eventually lead to an overflow in the ceiling value, and if we're really unlucky, it will become zero.

This is further confirmed by the last log line in the description of JDK-8343619 (Crash similar to JDK-8332506), which looks like this:

[2024-10-30T16:41:33.915+0000][info][monitorinflation] Too many deflations without progress; bumping in_use_list_ceiling from 17587870461276979200 to 0
04-12-2024
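To make the failure mode described above concrete, here is a minimal, self-contained C++ sketch. It is not the HotSpot code: the starting values and the simple doubling back-off are assumptions. It only demonstrates how an unsigned 64-bit ceiling that keeps being bumped eventually wraps around to zero, at which point a usage calculation of the form (monitors_used * 100) / ceiling is an integer division by zero, which on x86 raises SIGFPE with si_code FPE_INTDIV.

```
#include <cstdint>
#include <cstdio>

int main() {
  // Assumed values, purely for illustration.
  uint64_t ceiling       = 1024;  // stand-in for the in_use_list_ceiling
  uint64_t monitors_used = 900;   // stand-in for the in-use monitor count

  // Keep "bumping" the ceiling as a back-off, as if deflation never made
  // progress. Unsigned arithmetic wraps modulo 2^64, so after enough bumps
  // the value collapses to zero.
  for (int bump = 0; bump < 64 && ceiling != 0; bump++) {
    uint64_t old_ceiling = ceiling;
    ceiling *= 2;  // silently wraps once the top bit is shifted out
    printf("bump %2d: ceiling %llu -> %llu\n", bump,
           (unsigned long long)old_ceiling, (unsigned long long)ceiling);
  }

  // With ceiling == 0, a division like (monitors_used * 100) / ceiling is an
  // integer division by zero and would trap at run time.
  if (ceiling == 0) {
    printf("ceiling wrapped to zero; dividing by it now would raise SIGFPE\n");
  } else {
    printf("monitor usage: %llu%%\n",
           (unsigned long long)((monitors_used * 100) / ceiling));
  }
  return 0;
}
```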

Additional Information from submitter:
=================================================
I've run my application with 21.0.4 for over 72 hours under the same circumstances as before, except dumping threads every 30s instead of once a minute to encourage a recurrence. I am happy to report that I haven't experienced a crash. I'm assuming some combination of the issues below fixed the root cause of the crash, but nothing definitive.
JDK-8318757 - VM_ThreadDump asserts in interleaved ObjectMonitor::deflate_monitor calls
JDK-8273107 - RunThese24H times out with "java.lang.management.ThreadInfo.getLockName()" is null
JDK-8320515 - assert(monitor->object_peek() != nullptr) failed: Owned monitors should not have a dead object
29-07-2024

Additional Information from Submitter:
============================
When reviewing JDK-8290786 (the bug previously filed for this incident), the crash report in that ticket also includes "ThreadDump" entries under 'VM Operations'.
03-06-2024

ILW = HLM = P3
28-05-2024

I still can't see anything that would lead to a SIGFPE. But interesting that the thread dump triggers it.
24-05-2024

Additional Information from submitter to original bug report (JDK-8332506)
=====================================================
I reported this happens randomly. I have since discovered this crash always occurs with a thread dump. The application was periodically (once a minute) performing a thread dump. The application would crash after ~30-38 hours of run time. Looking around for bugs on the JDK, I'm assuming this happens due to JDK-8318757, but have no concrete proof. In the initial report this happened on 21.0.1; I have since had it on 21.0.3. In both reports the application is using ZGC (single-gen).

REGRESSION : Last worked in version 17

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the supplied code for up to 48 hours, at once a minute. Potentially increasing your odds for reproduction if run at a higher rate.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED - No JVM Crash
ACTUAL - JVM Crashed with SIGFPE.

---------- BEGIN SOURCE ----------
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.PrintStream;
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.MonitorInfo;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.core.config.ConfigurationSource;
import org.apache.logging.log4j.core.config.Configurator;

import globalstar.logging.Log;

public class ThreadChecker implements Runnable {

    private final Logger logger;

    public ThreadChecker() {
        logger = LogManager.getLogger("thread-logger");
    }

    private void dumpJVMThreads(PrintStream pw) {
        ThreadMXBean b = ManagementFactory.getThreadMXBean();
        if (b != null) {
            ThreadInfo[] tis = b.dumpAllThreads(b.isObjectMonitorUsageSupported(), b.isSynchronizerUsageSupported());
            if (tis != null && tis.length != 0) {
                pw.println("\nWaiting JVM threads:");
                for (ThreadInfo ti : tis) {
                    Thread.State state = ti.getThreadState();
                    if (state != Thread.State.RUNNABLE && state != Thread.State.TERMINATED) {
                        printThreadInfo(pw, ti);
                    }
                }
            } else {
                pw.println("\nNo Waiting JVM Threads");
            }
        } else {
            pw.println("\nUnable to get ThreadMXBean");
        }
    }

    private boolean findJVMDeadlocks(PrintStream pw) {
        ThreadMXBean b = ManagementFactory.getThreadMXBean();
        if (b != null) {
            long[] ids = b.findDeadlockedThreads();
            if (ids != null && ids.length != 0) {
                ThreadInfo[] tis = b.getThreadInfo(ids, b.isObjectMonitorUsageSupported(), b.isSynchronizerUsageSupported());
                pw.print("\nDeadlocked Java threads found:\n\t");
                List<String> threadNames = Arrays.stream(tis).map(ThreadInfo::getThreadName).collect(Collectors.toList());
                pw.println(String.join(", ", threadNames));
                return true;
            } else {
                pw.println("\nNo Deadlocked JVM Threads");
            }
        } else {
            pw.println("\nUnable to get ThreadMXBean");
        }
        return false;
    }

    private void printThreadInfo(PrintStream pw, ThreadInfo ti) {
        pw.println("\tThread \"" + ti.getThreadName() + "\" (" + hex(ti.getThreadId()) + ") " + ti.getThreadState());
        LockInfo l = ti.getLockInfo();
        if (l != null) {
            pw.println("\t\twaiting for " + format(l) + (ti.getLockOwnerName() == null ?
                    "" : " held by " + ti.getLockOwnerName() + " (" + hex(ti.getLockOwnerId()) + ")"));
        }
        Map<StackTraceElement, MonitorInfo> mlocs = new HashMap<StackTraceElement, MonitorInfo>();
        MonitorInfo[] mis = ti.getLockedMonitors();
        if (mis.length > 0) {
            pw.println("\tMonitors held:");
            for (MonitorInfo mi : mis) {
                mlocs.put(mi.getLockedStackFrame(), mi);
                pw.println("\t\t" + format(mi));
            }
        }
        LockInfo[] lis = ti.getLockedSynchronizers();
        if (lis.length > 0) {
            pw.println("\tSynchronizers held:");
            for (LockInfo li : lis) {
                pw.println("\t\t" + format(li));
            }
        }
        pw.println("\tStack trace:");
        StackTraceElement[] stes = ti.getStackTrace();
        for (StackTraceElement ste : stes) {
            pw.print("\t\t" + ste.getClassName() + "." + ste.getMethodName() + formatLineNumber(":", ste.getLineNumber()));
            if (mlocs.containsKey(ste)) {
                pw.print(" -> locked " + format(mlocs.get(ste)));
            }
            pw.println();
        }
        pw.println();
    }

    private String formatLineNumber(String prefix, int n) {
        if (n < 0) {
            return "";
        } else {
            return prefix + String.valueOf(n);
        }
    }

    private String format(LockInfo l) {
        if (l != null) {
            return l.getClassName() + " (" + hex(l.getIdentityHashCode()) + ")";
        } else {
            return "<unknown>";
        }
    }

    private String hex(long x) {
        return String.format("0x%08x", x);
    }

    @Override
    public void run() {
        try {
            try (ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
                 PrintStream printStream = new PrintStream(byteArrayOutputStream)) {
                findJVMDeadlocks(printStream);
                dumpJVMThreads(printStream);
                String threadReport = byteArrayOutputStream.toString();
                logger.info("\n=====Thread Report=====\n" + threadReport + "\n=======End Report======\n");
            } catch (Exception e) {
                logger.error("Error producing thread report", e);
            }
        } catch (Exception ex) {
            Log.warning("Error in thread checker", ex);
        }
    }
}
---------- END SOURCE ----------
24-05-2024

Additional Information from submitter:
==============================
Since filing this report, I've noticed that this crash coincides with a thread dump. The application in question performs a thread dump once a minute, as we attempted to use it to uncover a deadlock. The crash would happen after ~30-36 hours of runtime. Since disabling these periodic thread dumps, no crash has occurred. I'm assuming it has to do with a combination of JDK-8318757 and JDK-8273107, since this issue did not occur on 17.0.8.
24-05-2024

The comments from JDK-8290786 remain the same - there is nowhere in our code that I can see that could introduce a floating-point division by zero, or an integer division by zero.
23-05-2024