JDK-8254163 : Implementation of Foreign-Memory Access API (Third Incubator)
  • Type: CSR
  • Component: core-libs
  • Priority: P3
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 16
  • Submitted: 2020-10-07
  • Updated: 2020-11-09
  • Resolved: 2020-10-22
Related Reports
CSR :  
Description
Summary
-------

This CSR covers the third iteration of the Foreign Memory Access API (an incubating Java API), originally targeted at Java 14 and re-incubated in Java 15. Its goal is to smooth some of the API's rough edges and to address the feedback received so far from developers.

Problem
-------

Real-world use of the Foreign Memory Access API revealed some remaining usability issues, listed below:

 * The API's bias towards confinement is still too prominent, despite the addition (in 15) of an API point to cooperatively change the thread ownership of a given segment (serial confinement). The main issue is that, even with the new API, it is still not possible to *close* a memory segment from a thread other than the owner thread, which makes memory segments hard to use with certain API idioms.
 * While deterministic deallocation is one of the core foundations of the Foreign Memory Access API, users have expressed interest in having the ability to optionally register segments against a `Cleaner` instance, so that segment deallocation could be guaranteed regardless of whether an explicit call to `MemorySegment::close` is made.
 * The distinction between checked and unchecked memory addresses is too subtle. Checked addresses have a reference to the owning segment and can be safely dereferenced, whereas unchecked addresses can't. This leads to verbosity in clients attempting to dereference a `MemoryAddress` instance, as the client first has to query whether a segment is available, and then act accordingly.
 * Forcing developers to use `VarHandle` exclusively when dereferencing memory is a bit too much. There are cases where the expressive power of var handles comes in handy (e.g. in concurrent contexts, when fencing is required, or when structural access is needed), but in simpler cases it would be nice to have a ready-made collection of dereference API points which can be used in a type-safe way.
 * While a memory address and a memory segment generally represent separate concepts, it is possible to go from the latter to the former (e.g. by calling the `MemorySegment::baseAddress` method). Our experiments with the `jextract` tool revealed that the need for this explicit conversion often has a significant impact on the verbosity of clients interacting with the native bindings generated by jextract.
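
The `Cleaner` request in the second item can be illustrated with the standard `java.lang.ref.Cleaner` class alone. The following is a minimal sketch of the general pattern (explicit, deterministic release, with the cleaner as a safety net), not the incubating API itself; `CleanerSketch`, `TrackedResource`, `Release` and `RELEASED` are hypothetical names introduced only for illustration.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: explicit close() backed by a Cleaner safety net.
// This is NOT the jdk.incubator.foreign API, only the underlying pattern.
final class CleanerSketch {

    // Hypothetical stand-in for a resource that owns off-heap memory.
    static final class TrackedResource implements AutoCloseable {
        private static final Cleaner CLEANER = Cleaner.create();

        // Counts how many times the release action has actually run.
        static final AtomicInteger RELEASED = new AtomicInteger();

        // The cleanup action must not capture the TrackedResource itself,
        // otherwise the resource could never become unreachable.
        private static final class Release implements Runnable {
            @Override
            public void run() {
                RELEASED.incrementAndGet(); // a real implementation would free memory here
            }
        }

        private final Cleaner.Cleanable cleanable;

        TrackedResource() {
            this.cleanable = CLEANER.register(this, new Release());
        }

        // Deterministic path: runs the action now. Cleanable.clean() invokes
        // the action at most once, so repeated calls are harmless.
        @Override
        public void close() {
            cleanable.clean();
        }
    }

    public static void main(String[] args) {
        TrackedResource resource = new TrackedResource();
        resource.close();
        resource.close(); // second call is a no-op
        System.out.println("release actions run: " + TrackedResource.RELEASED.get());
    }
}
```

If `close()` is never called, the cleaner runs the same action some time after the resource becomes unreachable, which is the kind of guarantee users asked for.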

Solution
--------

Here we describe the main ideas behind the API changes brought forward in this CSR:

 * First, the notion of shared memory segment is introduced. A shared memory segment is created by invoking the `MemorySegment::share` method that returns a new memory segment instance, which is backed by the same memory region, and has no owner thread. This means that the segment will be effectively accessible (*and* closeable) from multiple threads (possibly in a concurrent fashion). The safety promises of the Foreign Memory Access API are preserved by a sophisticated lock-free synchronization scheme which relies on thread-local handshakes (see JEP 312).
 * A new method is added to memory segments, `MemorySegment::registerCleaner(Cleaner)`, which returns a new segment instance that is backed by the same memory region and supports implicit deallocation.
 * Memory access var handles now use memory segments as the basic dereference unit. That is, the most basic memory access var handle accepts two coordinates: the segment to be dereferenced and the offset within the segment at which the dereference should occur. All other, structural memory access var handles can be composed from this basic one. As a result, `MemoryAddress` has reverted back to being a dumb carrier for an object/offset unsafe addressing coordinate pair. That is, there is no way to go from an address back to a segment (except *unsafely* via `MemoryAddress::asSegmentRestricted` but that creates a *new* segment).
 * A new class, namely `MemoryAccess`, is introduced. This class contains several dereference methods to get and set values in memory, with different Java carriers. Different access modes are supported: a basic one simply takes a memory segment and dereferences it at its base address; other, more sophisticated modes allow a byte offset or a logical index to be passed as well, to support common array-like access idioms.
 * To better capture the commonality between a memory segment and a memory address, a new interface, namely `Addressable`, has been introduced. This interface describes entities that can be mapped into a memory address. It is implemented, trivially, by `MemoryAddress` and also by `MemorySegment` (a memory segment can always be projected to its base address). We plan to add more implementations of this interface with subsequent JEPs (e.g. JEP 389).
 * The interface hierarchy `MappedMemorySegment <: MemorySegment` poses issues when it comes to `VarHandle` invocations. Since memory access var handles now take a `MemorySegment` access coordinate, calling such a var handle with a `MappedMemorySegment` coordinate results in a non-exact var handle call, which greatly degrades performance. We have decided to drop the `MappedMemorySegment` interface and instead introduce a helper class, namely `MappedMemorySegments`, which contains _static_ methods providing the same functionality as `MappedMemorySegment` did.
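
The non-exact invocation issue in the last item is a general property of `VarHandle` and can be reproduced without the incubating API. The sketch below is an analogy under stated assumptions: an array element var handle with an `Object[]` coordinate stands in for a memory access var handle with a `MemorySegment` coordinate, and `String[]` stands in for the `MappedMemorySegment` subtype. It uses `VarHandle::withInvokeExactBehavior` (available from Java 16) to make the mismatch visible; `ExactInvocationSketch` and its methods are hypothetical names.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.lang.invoke.WrongMethodTypeException;

// Hypothetical sketch: an analogy for the MappedMemorySegment problem,
// built only on standard java.lang.invoke classes (requires Java 16+).
final class ExactInvocationSketch {

    // Coordinates of this handle are (Object[], int). Invoke-exact behavior
    // turns any call whose static argument types differ into an exception.
    static final VarHandle ELEMENT = MethodHandles.arrayElementVarHandle(Object[].class)
            .withInvokeExactBehavior();

    // The static type of the first argument equals the coordinate type,
    // so this is an exact call.
    static Object readExact(Object[] array, int index) {
        return ELEMENT.get(array, index);
    }

    // The static type String[] is only a subtype of Object[]. A plain var
    // handle would adapt the call silently (the non-exact path); the
    // invoke-exact handle rejects it instead.
    static boolean subtypeCallIsRejected(String[] array, int index) {
        try {
            Object ignored = ELEMENT.get(array, index);
            return false;
        } catch (WrongMethodTypeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] names = {"segment"};
        System.out.println(readExact(names, 0));
        System.out.println(subtypeCallIsRejected(names, 0));
    }
}
```

Static helper methods that declare the coordinate with its base type, as `readExact` does here, keep every call site exact; this mirrors the motivation for exposing the mapped-segment functionality through static methods in `MappedMemorySegments`.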

Specification
-------------

Here are some useful links which should help in navigating through the changes in the API.

Javadoc:

http://cr.openjdk.java.net/~mcimadamore/8254162_v5/javadoc

Specdiff:

http://cr.openjdk.java.net/~mcimadamore/8254162_v5/specdiff_out

Pull request:

https://github.com/openjdk/jdk/pull/548

In addition, a specdiff of the changes as of November 9th 2020 has been attached to this CSR.



Comments
Another minor upload, following review comments:

 * in MemoryAccess, removed the endianness-accepting overloads for getByte/setByte
 * also in MemoryAccess, removed getByteAtIndex and setByteAtIndex, which were just aliases for getByteAtOffset and setByteAtOffset (since the carrier size is one byte)

To help with reviewing the changes, I've uploaded a delta specdiff (specdiff_delta_v5).
09-11-2020

I've uploaded some minor changes to the API which were requested as a result of the review process. The changes mostly revolve around support for mapped memory segments:

 * the file descriptor accessor has been replaced with a boolean predicate (isMapped)
 * the factory for mapped segments has been renamed from mapFromPath to mapFile

There were also some minor editorial corrections. Another change was in MemoryLayouts where, after running tests on 32-bit platforms, we realized that the alignment constraints of the long/double layout constants on 32-bit machines were over-promising and incorrect (e.g. the layout constants reported an alignment of 64 bits, whereas in reality a 32-bit VM aligns those at 32 bits). These changes are in the v4 specdiff (I've also attached a delta compared to the previous iteration).
04-11-2020

Moving to Approved.
22-10-2020

Addressed the comments regarding overloads in MemoryAccess. New specdiff uploaded and links updated to latest version. Moving CSR to Finalize.
19-10-2020

Regarding `MemoryAccess`, I spoke with the rest of the team, ran an experiment, and did some performance evaluation. What you suggest seems cleaner API-wise and has the same performance characteristics, so we will go for it. I will upload a new specdiff including these changes when this CSR is finalized.
16-10-2020

I thought another possible design for the trios of methods in MemoryAccess (method, method_LE, method_BE) would be (method, method with byte order parameter), so there would be pairs of methods instead of triples. For the default methods, for an incubator API it isn't critical either way. If the interface ends up being sealed, parties outside of the API implementation won't be impacted, of course. For interfaces intended to be extended or implemented by other parties, the implSpec tag allows the self-use of the default method code to be documented separately from the method's overall spec, which helps subtypes to be correct in their usage.
16-10-2020

Thanks for the review and the comments.

> Is it for performance that one overload of methods on MemoryAccess don't take a byte order parameter?

Uhm - not sure... looking at the code, we should have _LE and _BE for all methods, except for the byte-related ones, for which it makes no difference of course. Which one is missing? Or do you refer to the fact that we could, instead of adding a _LE/_BE suffix, just add a ByteOrder parameter? Yes, here we wanted to have thin wrappers around VarHandle calls without any form of control flow (which most of the time is damaging to escape analysis). The idea was also to reflect what was exposed by the MemoryLayouts constants.

> The default methods in MemoryAddress would be better with some implSpec for the default implementation.

Honestly, I'm not 100% clear as to where the @implSpec tag should be used; note that MemoryAddress is meant to be a non-implementable class for all intents and purposes (we just don't have "sealed" yet...). So all the content in the javadoc really is method specification, rather than specification of the default method implementation. E.g. I used "default" mostly as an implementation convenience. Maybe the problem here is that I'm abusing "default" methods a bit? I can easily turn these into non-defaults, if that's preferred.

> Was there consideration of marking restricted methods in some other way beside their name, such as a @Documented runtime-retention annotation?

The plan here is to file another JEP which addresses solely the restricted method feature; this will have javadoc improvements as well. For now, we don't want to add public types for something that is meant to be a temporary workaround until something better is available.
16-10-2020

Moving to Provisional. Is it for performance that one overload of the methods on MemoryAccess doesn't take a byte order parameter? The default methods in MemoryAddress would be better with some implSpec for the default implementation. Was there consideration of marking restricted methods in some other way beside their name, such as a @Documented runtime-retention annotation?
14-10-2020