JDK-8282469 : Allow considered use of C++ thread_local in Hotspot
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 19
  • Priority: P4
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2022-03-01
  • Updated: 2022-03-22
  • Resolved: 2022-03-22
JDK 19
19 master Fixed
Sub Tasks
JDK-8282721 :  
Description
In JDK 9 we looked at replacing library-based thread-local storage (TLS) with C++ thread_local (JDK-8132510), but there were some issues and concerns around its use, so we opted to use the compiler-specific TLS mechanisms provided by gcc/clang/VS.

A significant limitation to the gcc TLS extension is that if an initializer is present for a thread-local variable, it must be a constant-expression. [1] That means that we can't declare a thread-local variable that is a class instance with non-trivial construction and destruction.
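
The restriction can be illustrated with a minimal sketch, assuming a gcc or clang toolchain (the `__thread` keyword is a compiler extension). The names `tls_hits`, `Session` and the two helper functions are hypothetical and exist only for this illustration; they are not HotSpot code.

```cpp
#include <thread>

// gcc/clang extension: the initializer, if present, must be a constant expression.
static __thread int tls_hits = 0;

// Hypothetical class with non-trivial construction and destruction.
struct Session {
  int id;
  Session() : id(42) {}
  ~Session() {}
};

// static __thread Session bad_session;   // rejected by gcc and clang
static thread_local Session tls_session;  // accepted: C++11 thread_local

// Each thread gets its own copy of tls_hits, starting from the constant 0.
int hits_seen_by_new_thread() {
  int seen = -1;
  std::thread t([&seen] { tls_hits++; seen = tls_hits; });
  t.join();
  return seen;
}

// The thread_local object is constructed on first use in the new thread.
int session_id_in_new_thread() {
  int id = -1;
  std::thread t([&id] { id = tls_session.id; });
  t.join();
  return id;
}
```

Calling `hits_seen_by_new_thread()` does not change the caller's own copy of `tls_hits`, which is the per-thread behaviour both mechanisms share; only the `thread_local` form can hold the class instance.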

Project Panama has a use case for TLS that requires a non-trivial destructor for a C++ class, so that threads that attach to the JVM to process Java "upcalls" are automatically detached when the thread terminates (if the thread did not detach explicitly).
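
A rough sketch of that pattern, not the Panama implementation: `AutoDetach`, `g_attached` and `run_upcall_on_native_thread` are hypothetical names, and the atomic counter merely stands in for the JVM's attach/detach bookkeeping.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-in for the VM's count of attached native threads.
static std::atomic<int> g_attached{0};

// Hypothetical per-thread context: attaches lazily, detaches in its destructor.
struct AutoDetach {
  bool attached = false;
  void attach_if_needed() {
    if (!attached) {
      attached = true;
      g_attached.fetch_add(1);
    }
  }
  ~AutoDetach() {
    if (attached) g_attached.fetch_sub(1);  // runs at thread exit
  }
};

static thread_local AutoDetach tl_attachment;

// Simulates a native thread that performs an upcall and never detaches explicitly.
// Returns the attached count observed while the thread was running.
int run_upcall_on_native_thread() {
  int during = 0;
  std::thread t([&during] {
    tl_attachment.attach_if_needed();
    during = g_attached.load();
    // no explicit detach: the thread_local destructor handles it
  });
  t.join();
  return during;
}
```

Once `join()` returns, the thread-local destructor has already run, so the counter is back to its previous value without any cooperation from the thread body.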

A discussion of the pros and cons of using C++ thread_local as the mechanism for TLS in the JVM shows there are still a number of concerns that argue against its wholesale adoption. Some relevant extracts from that discussion:

"[A] reminder that the difference between C++11 thread_local and the gcc's __thread came up in the discussion of JDK-8230877.  

https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-September/039487.html

That's what led us to the current restriction against using thread_local**.  We could revisit that.  thread_local usually requires an extra prologue before an access to ensure the variable has been initialized, while __thread requires the initializer be a constant expression. Also JDK-8230877 was before C++11/14 support and use was in place."

---

** The issue here is a potential performance hit. As the gcc documentation describes it [2]:

"Unfortunately, this [C++ thread_local] support requires a run-time penalty for references to non-function-local thread_local variables defined in a different translation unit even if they don't need dynamic initialization, so users may want to continue to use __thread for TLS variables with static initialization semantics."

Some preliminary benchmarking with gcc __thread converted to C++ thread_local did show significant regressions on a couple of benchmarks on AArch64.
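
The source of that penalty can be modelled by hand. The sketch below is an illustration under stated assumptions, not actual compiler output: `tl_value_ref` plays the role of the check a compiler may place in front of each access to a thread_local variable, and all names are hypothetical. It assumes gcc or clang for `__thread`.

```cpp
#include <atomic>
#include <thread>

// Counts how often the (hypothetical) dynamic initializer runs.
static std::atomic<int> g_init_runs{0};

static int compute_initial_value() {
  g_init_runs.fetch_add(1);
  return 7;
}

// Raw per-thread storage with constant initializers, as __thread requires.
static __thread bool tl_ready = false;
static __thread int tl_value = 0;

// Hand-written model of the access prologue: every access pays for the
// "has this thread initialized the variable yet?" check.
static inline int& tl_value_ref() {
  if (!tl_ready) {
    tl_value = compute_initial_value();
    tl_ready = true;
  }
  return tl_value;
}

// Reads the variable three times in a fresh thread and returns the sum.
int sum_three_reads_in_new_thread() {
  int sum = 0;
  std::thread t([&sum] {
    for (int i = 0; i < 3; i++) sum += tl_value_ref();
  });
  t.join();
  return sum;
}
```

The initializer runs once per thread, but the branch on `tl_ready` is executed on every access; a plain `__thread` variable with a constant initializer needs no such check.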

---

"thread_local has all the same initialization order issues as globals. There's a nicely worked out analysis here:

https://stackoverflow.com/questions/60813372/initialization-order-of-thread-local-vs-global-variables

So I think I'd like us to stick with the limited version that requires a constexpr initialization expression, at least for the most part."

---

"We could relax the prohibition to allow thread_local where really required. I might want a noisy looking macro for that use-case, with bare thread_local remaining forbidden. That makes it clear that someone thought about the question at least a little bit.

I looked for a way to warn about uses of thread_local that could be locally disabled where we intentionally use it, but didn't find such a thing.  Clang (some version) has -Wglobal-constructors, and a patch exists for adding it to gcc, but it's not in gcc11.2 (the latest release).
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-05/msg01860.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71482

But I did stumble over this.  Might this be a problem for you?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61991"

---

For the record, gcc bug 61991 is not an issue for the proposed use case because the TLS variable does get used.

---

"There are two possible maintenance issues:  1. If we don't document our decisions about our choices we won't be able to re-evaluate them later on without full re-analysis, so let's put the info into the JBS entry.  2. Even if the choices are fully documented, there's some cost and risk in applying the documented reasoning correctly in each case, compared to a "one size fits all" design.  But it seems like we have a plan to deal with those possible maintenance issues."

---



So the proposal here is to allow "well considered" uses of C++ thread_local, by providing a suitably "noisy" macro, and adjusting the Hotspot Style Guide [3] section on allowed C++ features to accommodate this.
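
As a sketch of what such a macro might look like: the name `CONSIDERED_CPP_THREAD_LOCAL` is purely illustrative (this entry does not name the macro), and `ScopedResource` is a hypothetical class.

```cpp
#include <thread>

// Hypothetical "noisy" macro: expands to plain thread_local, but each use is
// easy to grep for and signals that the trade-offs were considered.
#define CONSIDERED_CPP_THREAD_LOCAL thread_local

// Hypothetical class that needs a non-trivial destructor.
struct ScopedResource {
  int uses = 0;
  ~ScopedResource() {}
};

// Approved use goes through the macro; bare thread_local stays forbidden by convention.
static CONSIDERED_CPP_THREAD_LOCAL ScopedResource tl_resource;

// Touches the variable once in a fresh thread and returns that thread's use count.
int uses_in_new_thread() {
  int uses = -1;
  std::thread t([&uses] {
    tl_resource.uses++;
    uses = tl_resource.uses;
  });
  t.join();
  return uses;
}
```

The macro adds no behaviour; its value is that reviewers can search for it and that a bare `thread_local` in new code stands out as unreviewed.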

[1] https://gcc.gnu.org/onlinedocs/gcc/Thread-Local.html
[2] https://gcc.gnu.org/gcc-4.8/changes.html
[3] https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md
Comments
Changeset: 81d63734
Author: David Holmes <dholmes@openjdk.org>
Date: 2022-03-22 01:20:31 +0000
URL: https://git.openjdk.java.net/jdk/commit/81d63734bc2e2a18063cb6afbc53f8813a0ba880
22-03-2022

A pull request was submitted for review.
URL: https://git.openjdk.java.net/jdk/pull/7719
Date: 2022-03-07 06:12:03 +0000
07-03-2022

It appears that gcc/g++ can be built such that it uses glibc's __cxa_thread_atexit function, which uses calloc (as it is a C routine). That does not seem to be the case with the version we are using, and since we can't control how gcc is built we have to assume that the global operator new may be called regardless of what the "official" gcc version does. The simple fix is to delete the operator_new.cpp file, which redirects the global new/delete operators to fatal errors, and instead rely on the link-time check to catch use of the global operators within hotspot code.
07-03-2022

The use of thread_local has run into a problem on debug builds. The initialization of the thread-local variable has to register a cleanup object to run the destructor at thread-exit. This requires allocating the cleanup object [1]:

    elt *new_elt = new (std::nothrow) elt;

This is invoking the global operator new, which is prohibited in hotspot (intended to prevent hotspot classes from being allocated that way):

    # Internal Error (/scratch/users/daholme/jdk-dev2.git/open/src/hotspot/share/memory/operator_new.cpp:75), pid=11540, tid=11594
    # fatal error: Should not call global operator new

    (gdb) where
    #0 0x00007f503723d387 in raise () from /lib64/libc.so.6
    #1 0x00007f503723ea78 in abort () from /lib64/libc.so.6
    #2 0x00007f503537333f in os::abort (dump_core=<optimized out>, siginfo=<optimized out>, context=<optimized out>) at /scratch/users/daholme/jdk-dev2.git/open/src/hotspot/os/posix/os_posix.cpp:2024
    #3 0x00007f5036acd57e in VMError::report_and_die (id=id@entry=-536870912, message=message@entry=0x7f5036c12940 "fatal error", detail_fmt=detail_fmt@entry=0x7f5036d5e2f0 "Should not call global operator new", detail_args=detail_args@entry=0x7f4fe0e7fcd8, thread=0x7f4f6c203a40, pc=pc@entry=0x0, siginfo=0x0, context=0x7f50371b2ae0 <g_stored_assertion_context>, filename=0x7f5036d5e318 "/scratch/users/daholme/jdk-dev2.git/open/src/hotspot/share/memory/operator_new.cpp", lineno=75, size=0) at /scratch/users/daholme/jdk-dev2.git/open/src/hotspot/share/utilities/vmError.cpp:1750
    #4 0x00007f5035b9c76b in report_fatal (error_type=error_type@entry=INTERNAL_ERROR, file=file@entry=0x7f5036d5e318 "/scratch/users/daholme/jdk-dev2.git/open/src/hotspot/share/memory/operator_new.cpp", line=line@entry=75, detail_fmt=detail_fmt@entry=0x7f5036d5e2f0 "Should not call global operator new") at /scratch/users/daholme/jdk-dev2.git/open/src/hotspot/share/runtime/thread.hpp:654
    #5 0x00007f50366afc10 in operator new (size=<optimized out>, nothrow_constant=...) at /scratch/users/daholme/jdk-dev2.git/open/src/hotspot/share/memory/operator_new.cpp:75
    #6 0x00007f5036be175e in __cxxabiv1::__cxa_thread_atexit (dtor=0x7f5036a28830 <UpcallContext::~UpcallContext()>, obj=0x7f4f6c160288) at /home/erik/git/jdk/open/build/devkit/src/gcc-10.3.0/libstdc++-v3/libsupc++/atexit_thread.cc:146
    #7 0x00007f5036a2712d in __tls_init () at /scratch/users/daholme/jdk-dev2.git/open/src/hotspot/share/prims/universalUpcallHandler.cpp:66

We will need to permit the use of global operator new - at least in this case.

[1] https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/libsupc%2B%2B/atexit_thread.cc
03-03-2022
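
The interaction described in the comments above can be probed outside HotSpot with a small sketch. It is an assumption-laden illustration: the replaced global operators count calls instead of aborting, `Context` is a hypothetical stand-in for UpcallContext, and the result depends on how the C++ runtime was built, so no particular value is claimed here.

```cpp
#include <atomic>
#include <cstdlib>
#include <new>
#include <thread>

// Counts calls to the replaced global operator new (instead of a fatal error).
static std::atomic<long> g_global_new_calls{0};
static std::atomic<int> g_destroyed{0};

void* operator new(std::size_t size) {
  g_global_new_calls.fetch_add(1);
  void* p = std::malloc(size ? size : 1);
  if (p == nullptr) throw std::bad_alloc();
  return p;
}
void* operator new(std::size_t size, const std::nothrow_t&) noexcept {
  g_global_new_calls.fetch_add(1);
  return std::malloc(size ? size : 1);
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }
void operator delete(void* p, const std::nothrow_t&) noexcept { std::free(p); }

// Hypothetical class with a non-trivial destructor, so that a cleanup must be
// registered with the runtime when the thread-local object is initialized.
struct Context {
  int uses = 0;
  ~Context() { g_destroyed.fetch_add(1); }
};
static thread_local Context tl_context;

// Returns how many global operator new calls happened while a fresh thread
// first touched tl_context (which triggers the destructor registration).
long new_calls_during_tls_init() {
  long delta = -1;
  std::thread t([&delta] {
    long before = g_global_new_calls.load();
    tl_context.uses++;
    delta = g_global_new_calls.load() - before;
  });
  t.join();
  return delta;
}
```

A nonzero result means the runtime's registration path allocates through the global operator new, which is the situation the backtrace shows; a result of zero means it went through a C allocation routine instead. Either way the destructor has run by the time the thread is joined.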