JDK-8130425 : libjvm crash due to stack overflow in executables with 32k tbss/tdata
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 7
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2015-07-03
  • Updated: 2023-09-26
  • Resolved: 2016-02-29
Fixed in:
  • JDK 7: 7u111
  • JDK 8: 8u102
  • JDK 9: 9 b110
Sub Tasks
JDK-8153057 :  
Description
Problem summary: When a large TLS (thread-local storage) size is configured for threads, the JVM throws a stack overflow exception.

Problem Identified:
After investigation and discussion, we concluded that the issue lies not in the JVM but in the way glibc is implemented. When TLS is declared, glibc takes the space for it out of the thread's stack. So if a thread is created with a small stack size and the TLS size is large, the result is a stack overflow. That is exactly the case in this bug, where the reaper thread is allocated a very small stack size of 32768.
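
The arithmetic behind this can be sketched as follows. This is a deliberately simplified model, not glibc's actual accounting: it assumes the static TLS block is subtracted directly from the requested stack size and ignores guard pages and any other per-thread overhead. The class and method names are hypothetical.

```java
public class TlsStackBudget {
    /**
     * Simplified model (an assumption, not glibc's real bookkeeping):
     * bytes of the requested stack that remain for frames once the
     * static TLS block has been carved out of it. Never negative.
     */
    public static long usableStackBytes(long requestedStackBytes, long staticTlsBytes) {
        return Math.max(0L, requestedStackBytes - staticTlsBytes);
    }

    public static void main(String[] args) {
        long reaperStack = 32768; // stack size requested for the reaper thread in this report
        long[] tlsSizes = {4 * 1024, 16 * 1024, 32 * 1024}; // illustrative TLS sizes
        for (long tls : tlsSizes) {
            System.out.printf("TLS %6d bytes -> about %6d bytes of stack left%n",
                    tls, usableStackBytes(reaperStack, tls));
        }
    }
}
```

Under this model, a 32k tbss/tdata section leaves nothing of a 32768-byte stack, which matches the overflow described above.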

Discussion thread: 
http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-December/037558.html

Solution proposed:
This is expected to be fixed in glibc at some point, but for now I propose a workaround: a boolean system property, "processReaperUseDefaultStackSize", which sets the stack size of the reaper thread to the default instead of the fixed 32768. The user can set this property with "-D" or "System.setProperty()".
I have tested this fix; it works well with TLS sizes between 32k and 128k.
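
A minimal, standalone sketch of how such a property could select the stack size is shown here. It is not the JDK code: the class, the helper methods, and the thread name are hypothetical, and the property name is the one proposed in this report (the name in a shipped release may differ). It relies on the documented behavior that a stackSize of 0 makes the four-argument Thread constructor behave like the three-argument one.

```java
import java.util.concurrent.ThreadFactory;

public class ReaperThreadFactoryDemo {
    // Property name as proposed in this report; treat it as an assumption.
    static final String PROP = "processReaperUseDefaultStackSize";
    static final long SMALL_STACK_BYTES = 32768;

    /** Returns 0 (meaning "use the default stack size") when the property is true. */
    public static long requestedStackSize() {
        return Boolean.getBoolean(PROP) ? 0L : SMALL_STACK_BYTES;
    }

    /** Builds a factory for daemon threads; the property is read on each newThread call. */
    public static ThreadFactory newFactory() {
        return task -> {
            Thread t = new Thread(null, task, "process reaper (demo)", requestedStackSize());
            t.setDaemon(true);
            return t;
        };
    }

    public static void main(String[] args) throws InterruptedException {
        // Equivalent to passing -DprocessReaperUseDefaultStackSize=true on the command line.
        System.setProperty(PROP, "true");
        Thread t = newFactory().newThread(
                () -> System.out.println("running on " + Thread.currentThread().getName()));
        t.start();
        t.join();
    }
}
```

Because this sketch reads the property each time a thread is created, a later System.setProperty() call takes effect immediately; code that reads the property only once at initialization would need it set before that point.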

Fix:
diff -r 5c4530bb9ae6 src/java.base/share/classes/java/lang/ProcessHandleImpl.java
--- a/src/java.base/share/classes/java/lang/ProcessHandleImpl.java  Fri Jan 08 13:06:29 2016 +0800
+++ b/src/java.base/share/classes/java/lang/ProcessHandleImpl.java  Tue Jan 12 15:55:50 2016 +0530
@@ -83,9 +83,13 @@
                  ThreadGroup systemThreadGroup = tg;

                  ThreadFactory threadFactory = grimReaper -> {
-                    // Our thread stack requirement is quite modest.
-                    Thread t = new Thread(systemThreadGroup, grimReaper,
-                            "process reaper", 32768);
+                   Thread t = null;
+                   if (Boolean.getBoolean("processReaperUseDefaultStackSize")) {
+                       t = new Thread(systemThreadGroup, grimReaper, "process reaper");
+                    } else {
+                       // Our thread stack requirement is quite modest.
+                       t = new Thread(systemThreadGroup, grimReaper, "process reaper", 32768);
+                    }
                      t.setDaemon(true);
                      // A small attempt (probably futile) to avoid priority inversion
                      t.setPriority(Thread.MAX_PRIORITY);



For the test case, please see the attached file.

Comments
URL: http://hg.openjdk.java.net/jdk9/jdk9/jdk/rev/460323d4a285 User: lana Date: 2016-03-14 15:55:07 +0000
14-03-2016

URL: http://hg.openjdk.java.net/jdk9/hs-rt/jdk/rev/460323d4a285 User: kevinw Date: 2016-02-29 12:16:28 +0000
29-02-2016

No guarantees, but we carry a local patch to compute tls size using glibc internals. Find it here: http://cr.openjdk.java.net/~martin/webrevs/openjdk9/tls-size-guarantee/
26-01-2016

The glibc bug report now mentions our difficulties in Java https://sourceware.org/bugzilla/show_bug.cgi?id=11787#c44
15-01-2016

AFAIK there is no direct way for the VM to know how much stack might be stolen by glibc before creating a thread with a given stack size. Further, if native code later creates its own TLS data structures, that space would also need to be added to any "default" stack sizes calculated during VM startup. Open to practical suggestions.
14-01-2016

I don't think we have consensus yet that we can't or shouldn't fix this in hotspot. Any user-specified stack size should be in addition to any OS overhead, including native thread local storage. This problem has been known for years; not sure that glibc folks will do anything on their side to fix things.
13-01-2016