JDK-8332506 : SIGFPE In ObjectSynchronizer::is_async_deflation_needed()
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 17,21
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2024-05-15
  • Updated: 2025-05-13
  • Resolved: 2025-01-08
The Version table provides details related to the release in which this issue/RFE will be addressed.

Unresolved: Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed: Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

JDK 21: 21.0.8 (Fixed)
JDK 23: 23 (Resolved)
JDK 24: 24 (Fixed)
JDK 25: 25 b05 (Fixed)
Description
ADDITIONAL SYSTEM INFORMATION :
Host: Intel(R) Xeon(R) Silver 4215R CPU @ 3.20GHz, 4 cores, 15G, Red Hat Enterprise Linux release 8.8 (Ootpa)
JRE version: OpenJDK Runtime Environment (Red_Hat-21.0.1.0.12-2) (21.0.1+12) (build 21.0.1+12-LTS)
Java VM: OpenJDK 64-Bit Server VM (Red_Hat-21.0.1.0.12-2) (21.0.1+12-LTS, mixed mode, sharing, tiered, compressed class ptrs, z gc, linux-amd64)

Note this is a VMware VM.

A DESCRIPTION OF THE PROBLEM :
The JVM crashed, seemingly at random, with a SIGFPE error.

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGFPE (0x8) at pc=0x00007fd1ebb1f331, pid=802111, tid=802127
#
# JRE version: OpenJDK Runtime Environment (Red_Hat-21.0.1.0.12-2) (21.0.1+12) (build 21.0.1+12-LTS)
# Java VM: OpenJDK 64-Bit Server VM (Red_Hat-21.0.1.0.12-2) (21.0.1+12-LTS, mixed mode, sharing, tiered, compressed class ptrs, z gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xf46331]  ObjectSynchronizer::is_async_deflation_needed()+0x1e1




Comments
[jdk21u-fix-request] Approval Request from Roman Marchenko: Almost clean backport to 21u. PR checks (tier1) are OK. This fixes the division-by-zero problem in JDK 21, as there are cases where the problem occurs in 21.
02-04-2025

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk21u-dev/pull/1566 Date: 2025-04-02 10:33:08 +0000
02-04-2025

There is a customer who has the issue in 21.0.6:
```
# SIGFPE (0x8) at pc=0x00007fa729e8b551, pid=1031338, tid=1031352
# Problematic frame:
# V [libjvm.so+0xe8b551] ObjectSynchronizer::is_async_deflation_needed()+0x1e1

Stack: [0x00007fa6f2af0000,0x00007fa6f2bf0000], sp=0x00007fa6f2beec80, free space=1019k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xe8b551] ObjectSynchronizer::is_async_deflation_needed()+0x1e1
V [libjvm.so+0xc0da98] MonitorDeflationThread::monitor_deflation_thread_entry(JavaThread*, JavaThread*)+0xc8
V [libjvm.so+0x91cfd0] JavaThread::thread_main_inner()+0x1e0
V [libjvm.so+0xedba38] Thread::call_run()+0xa8
V [libjvm.so+0xc66fea] thread_native_entry(Thread*)+0xda

siginfo: si_signo: 8 (SIGFPE), si_code: 1 (FPE_INTDIV), si_addr: 0x00007fa729e8b551
```
02-04-2025

A pull request was submitted for review. Branch: jdk24 URL: https://git.openjdk.org/jdk/pull/23000 Date: 2025-01-09 10:45:32 +0000
09-01-2025

Changeset: cbabc045 Branch: master Author: Fredrik Bredberg <fbredberg@openjdk.org> Date: 2025-01-08 09:50:35 +0000 URL: https://git.openjdk.org/jdk/commit/cbabc0451505a00dfe77c163190736460c53820f
08-01-2025

The ceiling calculation was last modified by JDK-8226416 AFAICS, though the potential for division-by-zero seems to pre-date that.
19-12-2024

A pull request was submitted for review. Branch: master URL: https://git.openjdk.org/jdk/pull/22815 Date: 2024-12-18 15:33:43 +0000
18-12-2024

Thanks [~dholmes]! As I understand it, the reason the ceiling value currently increases even if monitor_usage is below MonitorUsedDeflationThreshold is that it was meant to act as a back-off when there were too many deflations without any progress; the thought was that it would lower any repeated async deflation request pressure. If you ask me, the design flaw is precisely that the ceiling value increases even when monitor_usage is below MonitorUsedDeflationThreshold; it should not.
16-12-2024
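As a rough illustration of the point in the comment above, here is a hedged C++ sketch. The function name, parameters, and doubling policy are hypothetical; this is neither the actual HotSpot code nor the fix that was eventually integrated. It only shows the idea: bump the ceiling as a back-off solely when monitor usage is at or above the threshold, and clamp rather than letting the unsigned value wrap.

```
#include <cinttypes>
#include <cstdint>
#include <cstdio>

// Hypothetical names and policy, for illustration only.
uint64_t maybe_bump_ceiling(uint64_t ceiling, uint64_t monitors_used,
                            uint64_t threshold_percent, bool no_progress) {
  if (ceiling == 0) {
    return 1;  // defensive: a zero ceiling must never reach a division
  }
  uint64_t usage_percent = (monitors_used * 100) / ceiling;
  if (no_progress && usage_percent >= threshold_percent) {
    uint64_t bumped = ceiling * 2;                  // back-off: raise the ceiling
    return bumped > ceiling ? bumped : UINT64_MAX;  // clamp instead of wrapping
  }
  return ceiling;  // usage below the threshold: leave the ceiling alone
}

int main() {
  // Usage below the threshold: the ceiling stays put even without progress.
  printf("%" PRIu64 "\n", maybe_bump_ceiling(1024, 100, 90, true));   // prints 1024
  // Usage at/above the threshold: the ceiling is bumped (doubled here).
  printf("%" PRIu64 "\n", maybe_bump_ceiling(1024, 1000, 90, true));  // prints 2048
  return 0;
}
```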

Great find [~fbredberg]! I don't think it was ever expected that the ceiling would increase without bound! A simple fix would just prevent the overflow, but there seems to be a design flaw here, as we should not keep increasing this way.
16-12-2024

Using objdump to disassemble the code for ObjectSynchronizer::is_async_deflation_needed() on x86, we find that there is only one "div" instruction. Using addr2line, we find the corresponding source line near the end of the monitors_used_above_threshold() function, which seems to have been inlined into is_async_deflation_needed(). The problematic line looks like this:

size_t monitor_usage = (monitors_used * 100LL) / ceiling;

The problem seems to be that the value of "ceiling" has somehow become zero, which is causing the division-by-zero error. A few lines above that code we see how a new, increased ceiling value is created if there have been too many deflations without progress. This will eventually lead to an overflow in the ceiling value, and if we're really unlucky, it will become zero.

This is further confirmed by the last log line in the description of JDK-8343619 (Crash similar to JDK-8332506), which looks like this:

[2024-10-30T16:41:33.915+0000][info][monitorinflation] Too many deflations without progress; bumping in_use_list_ceiling from 17587870461276979200 to 0
04-12-2024
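To make the failure mode described above concrete, here is a minimal, self-contained C++ sketch. It is not the HotSpot code: the starting values and the simple doubling back-off are assumptions. It only demonstrates how an unsigned 64-bit ceiling that keeps being bumped eventually wraps around to zero, at which point a usage calculation of the form (monitors_used * 100) / ceiling is an integer division by zero, which on x86 raises SIGFPE with si_code FPE_INTDIV.

```
#include <cstdint>
#include <cstdio>

int main() {
  // Assumed values, purely for illustration.
  uint64_t ceiling       = 1024;  // stand-in for the in_use_list_ceiling
  uint64_t monitors_used = 900;   // stand-in for the in-use monitor count

  // Keep "bumping" the ceiling as a back-off, as if deflation never made
  // progress. Unsigned arithmetic wraps modulo 2^64, so after enough bumps
  // the value collapses to zero.
  for (int bump = 0; bump < 64 && ceiling != 0; bump++) {
    uint64_t old_ceiling = ceiling;
    ceiling *= 2;  // silently wraps once the top bit is shifted out
    printf("bump %2d: ceiling %llu -> %llu\n", bump,
           (unsigned long long)old_ceiling, (unsigned long long)ceiling);
  }

  // With ceiling == 0, a division like (monitors_used * 100) / ceiling is an
  // integer division by zero and would trap at run time.
  if (ceiling == 0) {
    printf("ceiling wrapped to zero; dividing by it now would raise SIGFPE\n");
  } else {
    printf("monitor usage: %llu%%\n",
           (unsigned long long)((monitors_used * 100) / ceiling));
  }
  return 0;
}
```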

Additional Information from submitter:
=================================================
I've run my application with 21.0.4 for over 72 hours under the same circumstances as before, except dumping threads every 30s instead of once a minute to encourage a recurrence. I am happy to report that I haven't experienced a crash. I'm assuming some combination of the issues below fixed the root cause of the crash, but nothing definitive.
JDK-8318757 - VM_ThreadDump asserts in interleaved ObjectMonitor::deflate_monitor calls
JDK-8273107 - RunThese24H times out with "java.lang.management.ThreadInfo.getLockName()" is null
JDK-8320515 - assert(monitor->object_peek() != nullptr) failed: Owned monitors should not have a dead object
29-07-2024

Additional Information from Submitter:
============================
When reviewing JDK-8290786 (the bug previously filed for this incident), the crash report in that ticket also includes "ThreadDump" entries under 'VM Operations'.
03-06-2024

ILW = HLM = P3
28-05-2024

I still can't see anything that would lead to a SIGFPE. But interesting that the thread dump triggers it.
24-05-2024

Additional Information from submitter to original bug report (JDK-8332506)
=====================================================
I reported this happens randomly. I have since discovered this crash always occurs with a thread dump. The application was periodically (once a minute) performing a thread dump. The application would crash after ~30-38 hours of run time. Looking around for bugs on the JDK, I'm assuming this happens due to JDK-8318757, but have no concrete proof. In the initial report this happened on 21.0.1; I have since had it on 21.0.3. In both reports the application is using ZGC (single-gen).

REGRESSION : Last worked in version 17

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the supplied code for up to 48 hours, at once a minute. Potentially increasing your odds for reproduction if run at a higher rate.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED - No JVM Crash
ACTUAL - JVM Crashed with SIGFPE.

---------- BEGIN SOURCE ----------
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.PrintStream;
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.MonitorInfo;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.core.config.ConfigurationSource;
import org.apache.logging.log4j.core.config.Configurator;

import globalstar.logging.Log;

public class ThreadChecker implements Runnable {

    private final Logger logger;

    public ThreadChecker() {
        logger = LogManager.getLogger("thread-logger");
    }

    private void dumpJVMThreads(PrintStream pw) {
        ThreadMXBean b = ManagementFactory.getThreadMXBean();
        if (b != null) {
            ThreadInfo[] tis = b.dumpAllThreads(b.isObjectMonitorUsageSupported(), b.isSynchronizerUsageSupported());
            if (tis != null && tis.length != 0) {
                pw.println("\nWaiting JVM threads:");
                for (ThreadInfo ti : tis) {
                    Thread.State state = ti.getThreadState();
                    if (state != Thread.State.RUNNABLE && state != Thread.State.TERMINATED) {
                        printThreadInfo(pw, ti);
                    }
                }
            } else {
                pw.println("\nNo Waiting JVM Threads");
            }
        } else {
            pw.println("\nUnable to get ThreadMXBean");
        }
    }

    private boolean findJVMDeadlocks(PrintStream pw) {
        ThreadMXBean b = ManagementFactory.getThreadMXBean();
        if (b != null) {
            long[] ids = b.findDeadlockedThreads();
            if (ids != null && ids.length != 0) {
                ThreadInfo[] tis = b.getThreadInfo(ids, b.isObjectMonitorUsageSupported(), b.isSynchronizerUsageSupported());
                pw.print("\nDeadlocked Java threads found:\n\t");
                List<String> threadNames = Arrays.stream(tis).map(ThreadInfo::getThreadName).collect(Collectors.toList());
                pw.println(String.join(", ", threadNames));
                return true;
            } else {
                pw.println("\nNo Deadlocked JVM Threads");
            }
        } else {
            pw.println("\nUnable to get ThreadMXBean");
        }
        return false;
    }

    private void printThreadInfo(PrintStream pw, ThreadInfo ti) {
        pw.println("\tThread \"" + ti.getThreadName() + "\" (" + hex(ti.getThreadId()) + ") " + ti.getThreadState());
        LockInfo l = ti.getLockInfo();
        if (l != null) {
            pw.println("\t\twaiting for " + format(l) + (ti.getLockOwnerName() == null ?
                    "" : " held by " + ti.getLockOwnerName() + " (" + hex(ti.getLockOwnerId()) + ")"));
        }
        Map<StackTraceElement, MonitorInfo> mlocs = new HashMap<StackTraceElement, MonitorInfo>();
        MonitorInfo[] mis = ti.getLockedMonitors();
        if (mis.length > 0) {
            pw.println("\tMonitors held:");
            for (MonitorInfo mi : mis) {
                mlocs.put(mi.getLockedStackFrame(), mi);
                pw.println("\t\t" + format(mi));
            }
        }
        LockInfo[] lis = ti.getLockedSynchronizers();
        if (lis.length > 0) {
            pw.println("\tSynchronizers held:");
            for (LockInfo li : lis) {
                pw.println("\t\t" + format(li));
            }
        }
        pw.println("\tStack trace:");
        StackTraceElement[] stes = ti.getStackTrace();
        for (StackTraceElement ste : stes) {
            pw.print("\t\t" + ste.getClassName() + "." + ste.getMethodName() + formatLineNumber(":", ste.getLineNumber()));
            if (mlocs.containsKey(ste)) {
                pw.print(" -> locked " + format(mlocs.get(ste)));
            }
            pw.println();
        }
        pw.println();
    }

    private String formatLineNumber(String prefix, int n) {
        if (n < 0) {
            return "";
        } else {
            return prefix + String.valueOf(n);
        }
    }

    private String format(LockInfo l) {
        if (l != null) {
            return l.getClassName() + " (" + hex(l.getIdentityHashCode()) + ")";
        } else {
            return "<unknown>";
        }
    }

    private String hex(long x) {
        return String.format("0x%08x", x);
    }

    @Override
    public void run() {
        try {
            try (ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
                 PrintStream printStream = new PrintStream(byteArrayOutputStream)) {
                findJVMDeadlocks(printStream);
                dumpJVMThreads(printStream);
                String threadReport = byteArrayOutputStream.toString();
                logger.info("\n=====Thread Report=====\n" + threadReport + "\n=======End Report======\n");
            } catch (Exception e) {
                logger.error("Error producing thread report", e);
            }
        } catch (Exception ex) {
            Log.warning("Error in thread checker", ex);
        }
    }
}
---------- END SOURCE ----------
24-05-2024

Additional Information from submitter:
==============================
Since filing this report, I've noticed that this crash coincides with a thread dump. The application in question performs a thread dump once a minute, as we attempted to use it to uncover a deadlock. The crash would happen after ~30-36 hours of runtime. Since disabling these periodic thread dumps, no crash has occurred. I'm assuming it has to do with a combination of JDK-8318757 and JDK-8273107, since this issue did not occur on 17.0.8.
24-05-2024

The comments from JDK-8290786 remain the same - there is nowhere in our code that I can see that could introduce a floating-point division by zero, or an integer division by zero.
23-05-2024