JDK-8307817 : AARCH64: make macOS W^X locking more fine grained
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: compiler
  • Affected Version: 21
  • Priority: P4
  • Status: In Progress
  • Resolution: Unresolved
  • OS: os_x
  • CPU: aarch64
  • Submitted: 2023-05-10
  • Updated: 2024-05-13
Version: Other, tbd (Unresolved)
Description
On Apple Silicon, the Write/Execute (W^X) lock is a new Hardened Runtime capability, see:
https://developer.apple.com/documentation/apple-silicon/porting-just-in-time-compilers-to-apple-silicon

It prevents memory regions from being writable and executable at the same time. Therefore, we need to acquire WXWrite whenever we want to write to the code cache.

At the moment, the write lock is acquired by
```
MACOS_AARCH64_ONLY(ThreadWXEnable __wx(WXWrite, thread));
```

Acquiring the write lock can be expensive, and its current placement is too coarse-grained.
Check whether each write lock can be moved down in the call hierarchy, especially the ones in `interfaceSupport.inline.hpp`.

In https://bugs.openjdk.org/browse/JDK-8302736, WXWrite locks caused a major performance regression, which was resolved by moving one of the locks further down.
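
To illustrate what "moving a lock down" could mean, here is a minimal, portable sketch. It is not HotSpot code: `MockThread`, `ScopedWX`, `patch_code`, `coarse_entry`, `fine_entry` and `count_transitions` are hypothetical stand-ins, and the W^X switch is only simulated by a counter rather than performed on real memory.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not the HotSpot types.
enum WXMode { WXWrite, WXExec };

struct MockThread {
  WXMode wx_state = WXExec;  // assume the thread normally runs in exec mode
  int transitions = 0;       // number of simulated W^X switches

  // Simulated switch: counts a transition only when the mode changes.
  WXMode enable_wx(WXMode new_state) {
    WXMode old = wx_state;
    if (old != new_state) {
      wx_state = new_state;
      ++transitions;
    }
    return old;
  }
};

// Scope-based guard in the spirit of ThreadWXEnable: switch on entry,
// restore the previous mode on exit.
class ScopedWX {
  MockThread* _thread;
  WXMode _old;
 public:
  ScopedWX(WXMode mode, MockThread* thread)
      : _thread(thread), _old(thread->enable_wx(mode)) {}
  ~ScopedWX() { _thread->enable_wx(_old); }
};

// Pretend code-cache write; it is only legal in write mode.
void patch_code(MockThread* t) { assert(t->wx_state == WXWrite); }

// Coarse placement: the guard sits at the entry point, so every call pays
// for two switches even when nothing is written.
void coarse_entry(MockThread* t, bool needs_patch) {
  ScopedWX wx(WXWrite, t);
  if (needs_patch) patch_code(t);
}

// Fine-grained placement: the guard is moved down to the code that writes.
void fine_entry(MockThread* t, bool needs_patch) {
  if (needs_patch) {
    ScopedWX wx(WXWrite, t);
    patch_code(t);
  }
}

// Runs `calls` entries, of which every `patch_every`-th needs a write,
// and returns the number of simulated switches.
int count_transitions(bool fine_grained, int calls, int patch_every) {
  MockThread t;
  for (int i = 0; i < calls; ++i) {
    bool needs_patch = (i % patch_every == 0);
    if (fine_grained) {
      fine_entry(&t, needs_patch);
    } else {
      coarse_entry(&t, needs_patch);
    }
  }
  return t.transitions;
}
```

In this toy model the benefit depends entirely on how rarely the write path is taken. If the guard were pushed down into a path that runs on every call, or inside a loop, the number of switches could go up instead.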


Comments
[~dlong] Sure. I will assign it to you.
13-05-2024

[~tholenstein], as part of JDK-8328306 I plan to implement fine-grained guards, which I believe will allow me to use those guards as either asserts, or as transition points (allowing a full lazy mode). Would you like me to take over this issue, or would you still like to keep it?
12-05-2024

Can't we just come from the other side and enable WXWrite on demand? I've made the prototype (https://github.com/openjdk/jdk/pull/18762), Vladimir and Andrew are critiquing it, but in general can this approach be improved or is it totally doomed?
12-04-2024

Also, there seems to be a trade-off between most secure (WXExec by default, with fine-grained WXWrite), and best performance (lazy switching between Exec, Write, and DontCare).
19-03-2024

I was thinking how we could combine a lazy switching approach with the existing scope-based ThreadWXEnable. I think we would need to introduce a third state, called something like WXAny or WXDontCare. The ThreadWXEnable destructor would remain in the current state if restoring to a previous WXDontCare state.
19-03-2024
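
One possible reading of the third-state idea above, as a portable sketch. All names (`MockThread`, `LazyWXEnable`, `consecutive_write_scopes`, `nested_write_in_exec`) are hypothetical, the hardware state is simulated, and the real ThreadWXEnable may need to differ in detail.

```cpp
#include <cassert>

// Hypothetical three-state mode, as proposed in the comment above.
enum WXMode { WXWrite, WXExec, WXDontCare };

struct MockThread {
  WXMode logical = WXDontCare;  // what the current scope requires
  WXMode hardware = WXExec;     // simulated hardware state: WXWrite or WXExec
  int transitions = 0;          // number of simulated hardware switches

  // Records the requested mode; touches the "hardware" only when a
  // concrete mode is requested and differs from the current one.
  WXMode enable_wx(WXMode requested) {
    WXMode old = logical;
    logical = requested;
    if (requested != WXDontCare && requested != hardware) {
      hardware = requested;
      ++transitions;
    }
    return old;
  }
};

// Scope guard: restoring to WXDontCare leaves the hardware where it is.
class LazyWXEnable {
  MockThread* _thread;
  WXMode _old;
 public:
  LazyWXEnable(WXMode mode, MockThread* thread)
      : _thread(thread), _old(thread->enable_wx(mode)) {}
  ~LazyWXEnable() { _thread->enable_wx(_old); }
};

// `scopes` back-to-back WXWrite scopes, starting from WXDontCare.
int consecutive_write_scopes(int scopes) {
  MockThread t;
  for (int i = 0; i < scopes; ++i) {
    LazyWXEnable wx(WXWrite, &t);
    assert(t.hardware == WXWrite);
  }
  return t.transitions;
}

// A WXWrite scope nested inside a WXExec scope still switches back,
// because the outer scope asked for a concrete mode.
int nested_write_in_exec() {
  MockThread t;
  {
    LazyWXEnable outer(WXExec, &t);
    {
      LazyWXEnable inner(WXWrite, &t);
    }
    assert(t.hardware == WXExec);
  }
  return t.transitions;
}
```

In this model, repeated WXWrite scopes entered from a WXDontCare context cost a single switch in total, while scopes nested inside a concrete WXExec or WXWrite scope behave as they do today.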

"Acquiring a write lock" refers to toggling memory between writable and executable states. The term "lock" does not imply a conventional lock, such as a mutex, which is used for thread synchronization. Rather, it denotes changing the state of memory pages from writable to executable and vice versa, serving as a method of access control, not mutual exclusion. Therefore, the use of "locking" in this context may be somewhat misleading; apologies for any confusion. Regarding the observation, "I saw DMB and ISB barrier instructions in addition to writing to the special register," it's noteworthy that pthread_jit_write_protect_np is proprietary to Apple, and its internal workings are not fully disclosed. However, the presence of DMB (Data Memory Barrier) and ISB (Instruction Synchronization Barrier) instructions suggests an effort to enforce the ordering of memory operations and to make the state change visible throughout the processor. This likely involves synchronizing caches and pipelines with the current W^X memory state, which is probably what makes pthread_jit_write_protect_np an expensive operation.
19-03-2024
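
For orientation, the per-thread toggle discussed in this comment can be wrapped roughly as follows. The helper names (`wx_toggle_supported`, `set_jit_writable`, `jit_writable_requested`) and the bookkeeping flag are hypothetical; only `pthread_jit_write_protect_np` and `pthread_jit_write_protect_supported_np` are Apple's API. On other platforms the sketch compiles to a no-op that only records the request.

```cpp
#include <cassert>
#if defined(__APPLE__) && defined(__aarch64__)
#include <pthread.h>
#endif

// Bookkeeping only: remembers what this thread last asked for.
static thread_local bool g_thread_writable = false;

// True if the platform offers the per-thread W^X toggle.
bool wx_toggle_supported() {
#if defined(__APPLE__) && defined(__aarch64__)
  return pthread_jit_write_protect_supported_np() != 0;
#else
  return false;
#endif
}

// writable == true : MAP_JIT memory becomes writable (not executable)
//                    for the calling thread.
// writable == false: MAP_JIT memory becomes executable (not writable)
//                    for the calling thread.
void set_jit_writable(bool writable) {
#if defined(__APPLE__) && defined(__aarch64__)
  if (wx_toggle_supported()) {
    // Argument 1 enables write protection, 0 disables it.
    pthread_jit_write_protect_np(writable ? 0 : 1);
  }
#endif
  g_thread_writable = writable;
}

bool jit_writable_requested() { return g_thread_writable; }
```

The toggle applies to the calling thread only, which matches the point above that it is access control rather than mutual exclusion between threads.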

I don't think there is a "lock", like a mutex, but just different states enforced by the hardware. When I single-stepped through it in the debugger I saw DMB and ISB barrier instructions in addition to writing to the special register.
19-03-2024

"Acquiring write lock": is pthread_jit_write_protect_np actually locking? Or is it just slow? Do we know what it does internally? Otherwise, what are the locks this issue mentions?
19-03-2024

Note: comments in https://github.com/openjdk/jdk/pull/13606 suggest an alternative approach of handling WXWrite states.
11-05-2023

The high-level placement of these calls was done to stop playing whack-a-mole every time we hit a new failure due to a missing ThreadWXEnable. I'm all for placing these where they are actually needed, but no one seems to be able to clearly state or identify exactly where that is in the code. The trade-off, of course, is that if we push this too far down we may have to execute it far more often and so take a performance hit. So figuring out the optimum placement for these in the call stack seems rather difficult and may need to be evaluated on a case-by-case basis.
11-05-2023