Summary
-------
This CSR refers to the latest iteration of the Foreign Function & Memory API originally targeted for Java 17, with the goal of further consolidating the API, as well as addressing the feedback received so far from developers.
Problem
-------
Real-world use of the Foreign Function & Memory APIs revealed some remaining usability issues, listed below:
* There is an asymmetry between the allocation API (`SegmentAllocator`) and the dereference API. More specifically, when allocating a segment from an existing Java value/array, a `SegmentAllocator` also accepts the `ValueLayout` corresponding to the value/array element, so that necessary alignment constraints and endianness can be applied. But the static dereference methods in `MemoryAccess` do not take any layout argument; instead, they optionally accept a `ByteOrder` argument, to perform byte swapping. This asymmetry can lead to subtle mistakes, where a segment is allocated as an array whose element is defined by a given layout, but then the array is accessed in ways that are incompatible with that layout.
* Some useful data types (`boolean` and `MemoryAddress`) are not supported by memory access var handles.
* The API makes excessive use of static methods. There is a class `MemoryAccess` containing several static dereference methods (see above), and the `CLinker` class also contains several static helper functions to e.g. convert a Java string to a C string and back.
* The `MemoryAddress` class is an entity with its own `ResourceScope` object. The reason for this choice is that a client can e.g. request the base address of a memory segment, and expect the address to keep a reference to the segment scope. But making `MemoryAddress` a scoped entity creates confusion in the more common case where an address is returned by a native call, in which case neither spatial nor temporal bounds are available.
* Memory layouts interacting with the `CLinker` API need to be constructed in a special way: they must embed special *layout attributes* that encode additional information, allowing the linker runtime to classify each argument correctly when a new downcall method handle is created. There is also some redundancy in how downcall method handles are created: clients have to pass both a `FunctionDescriptor` *and* a `MethodType`, even though, in most cases, the information in the `MethodType` can be inferred from that in the `FunctionDescriptor`.
* Calling native functions using downcall method handles can be unsafe: consider the case where a segment is passed *by-reference* to a downcall method handle. In this case, the segment address is obtained, and then passed to the native call. If the segment is backed by a shared scope, a client in another thread could close the segment scope concurrently, which might cause the native call to malfunction.
* The way in which dependencies between scopes are set up, using `ResourceScope::acquire/release`, is too low-level. There is no way to explicitly set up a temporal dependency between two scopes without resorting to complex uses of `ResourceScope::addCloseAction`.
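
The first item above (layout-aware allocation, layout-unaware dereference) can be illustrated with a minimal sketch. It does not use the Foreign Function & Memory API itself: a `java.nio.ByteBuffer` stands in for a memory segment, and the class and method names (`LayoutMismatchDemo`, `allocateInts`, `readInt`) are hypothetical, chosen only for this illustration. The point is that nothing ties the byte order used when reading to the one used when allocating.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in: a ByteBuffer plays the role of a memory segment.
final class LayoutMismatchDemo {

    // "Allocation" side: the element layout (reduced here to its byte order) is applied.
    static ByteBuffer allocateInts(ByteOrder elementOrder, int... values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Integer.BYTES).order(elementOrder);
        for (int v : values) {
            buf.putInt(v);
        }
        buf.flip();
        return buf;
    }

    // "Dereference" side: the caller picks a byte order independently of the
    // layout used at allocation time, so a mismatch goes unnoticed.
    static int readInt(ByteBuffer segment, int index, ByteOrder accessOrder) {
        return segment.duplicate().order(accessOrder).getInt(index * Integer.BYTES);
    }

    public static void main(String[] args) {
        ByteBuffer segment = allocateInts(ByteOrder.BIG_ENDIAN, 1, 2, 3);
        System.out.println(readInt(segment, 0, ByteOrder.BIG_ENDIAN));    // prints 1
        System.out.println(readInt(segment, 0, ByteOrder.LITTLE_ENDIAN)); // prints 16777216
    }
}
```

Reading with the wrong byte order raises no error; it silently yields a byte-swapped value, which is the kind of subtle mistake the first item describes.
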
Solution
-------
Here we describe the main ideas behind the API changes brought forward in this CSR:
* The main change in this iteration of the API is that `ValueLayout` is now always associated with a Java carrier type. For this reason, the API features specialized subclasses, like `ValueLayout.OfInt`, `ValueLayout.OfLong` etc. The relationship between `ValueLayout` and a Java carrier simplifies the API in a number of ways:
    - We can define a set of dereference methods accepting a (specialized) value layout subclass; for instance, instead of `getInt()` we can have a method like `get(ValueLayout.OfInt)`. This allows us to fix the asymmetry between the dereference API and the allocation API.
    - We can use the carrier information attached to value layouts to decide how to classify parameters to downcall method handles. This effectively removes the need to accept a (now redundant) `MethodType` parameter in `CLinker::downcallHandle`. It also makes the *layout attributes* machinery redundant, so that machinery is removed in this iteration.
    - We can attach constant var handles to value layouts, which means that obtaining a memory access var handle from a value layout can be far more efficient than before.
* Support for `boolean` and `MemoryAddress` has been added to memory access var handles. These carriers are considered *secondary* carriers (as opposed to *primary carriers*, such as `byte`, `short`, `char`, `int`, `float`, `long`, `double`). The reason for this distinction is that secondary carriers cannot be copied in bulk to and from memory segments, as each element requires some adjustment (e.g. a `MemoryAddress` has to be *lowered* to a `long` value, while `boolean` has to be normalized to either `1` or `0`).
* The API has been significantly simplified, and some classes have been removed:
    - The `MemoryAccess` class is no longer present. Instead, *instance* dereference methods are present in both `MemorySegment` and `MemoryAddress` (the latter are *restricted*, as an address has no bounds).
    - The `MemoryLayouts` class is also removed. Value layout constants (`JAVA_INT` etc.) have been moved inside `ValueLayout` (while other layout constants have been dropped).
    - Most of the static methods in `CLinker` (e.g. to convert from Java strings to C strings and back) have been moved to `MemorySegment`, `MemoryAddress` and `SegmentAllocator`. The platform-dependent layout constants in `CLinker` (`C_INT` etc.) have been dropped. It is the role of extraction tools to generate layouts for basic C types that are compatible with a given target platform.
    - The `CLinker.TypeKind` enum has been removed (as it is no longer attached to layouts for classification purposes).
    - The `VaList` class is now a top-level class.
* `MemoryAddress` no longer features a `ResourceScope` accessor. That is, `MemoryAddress` denotes a raw machine address, and has no notion of spatial and temporal bounds associated with it. Clients can no longer obtain the base address associated with heap segments (i.e. `MemoryAddress` is for off-heap access only). When parameters are passed by-reference to a downcall method handle, the method handle now takes an `Addressable` parameter, not a `MemoryAddress` one. This change allows memory segments to be passed to downcall method handles more directly; the linker runtime will try to keep such arguments alive for the entire duration of a native call. This greatly enhances the safety of the `CLinker` API, and reduces the number of conversions required in user code.
* Since `MemoryAddress` no longer has a `ResourceScope`, a new entity named `NativeSymbol` has been added, which represents a symbol in a library (either a function or a global variable). A `NativeSymbol` has a scope and a name, and is accepted by `CLinker::downcallHandle` when creating downcall method handles. Also, `CLinker::upcallStub` returns a new (anonymous) `NativeSymbol`, which points to the native function generated by the VM which calls back to the target Java method handle provided at creation. The scope attached to a native symbol can be closed at any time, and will cause the symbol to be unloaded. Again, `CLinker` will make sure that a native symbol scope cannot be closed *while* in the middle of performing a native call.
* The `ResourceScope` class contains some simplifications. First, there is no longer a distinction between implicit and explicit scopes: all scopes (except the global scope) are explicit and can be closed, and some scopes are additionally associated with a `Cleaner` instance. Second, a new method `ResourceScope::keepAlive(ResourceScope)` has been added to replace the pair of `ResourceScope::acquire/release` as well as the `ResourceScope.Handle` class.
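
The central idea of this iteration, a value layout tied to a Java carrier and accepted by both allocation and dereference methods, can be sketched as follows. This is a simplified model rather than the actual API: the nested classes `OfInt` and `Segment` and the method `allocateArray` are hypothetical analogues (named after `ValueLayout.OfInt`, `MemorySegment` and the `SegmentAllocator` methods mentioned above), and a heap `ByteBuffer` is assumed as backing storage.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Simplified model of carrier-typed layouts; none of these classes are the real API.
final class CarrierLayoutSketch {

    // Hypothetical analogue of ValueLayout.OfInt: a layout statically tied to the int carrier.
    static final class OfInt {
        final ByteOrder order;

        OfInt(ByteOrder order) {
            this.order = order;
        }
    }

    // Hypothetical analogue of a memory segment, backed by a heap ByteBuffer.
    static final class Segment {
        private final ByteBuffer bytes;

        Segment(int byteSize) {
            this.bytes = ByteBuffer.allocate(byteSize);
        }

        // Dereference methods take the layout, so the carrier type and the byte
        // order come from the same object that was used at allocation time.
        int getAtIndex(OfInt layout, int index) {
            return bytes.duplicate().order(layout.order).getInt(index * Integer.BYTES);
        }

        void setAtIndex(OfInt layout, int index, int value) {
            bytes.duplicate().order(layout.order).putInt(index * Integer.BYTES, value);
        }
    }

    // Allocation side: accepts the element layout and uses it to store each value.
    static Segment allocateArray(OfInt elementLayout, int... values) {
        Segment segment = new Segment(values.length * Integer.BYTES);
        for (int i = 0; i < values.length; i++) {
            segment.setAtIndex(elementLayout, i, values[i]);
        }
        return segment;
    }

    public static void main(String[] args) {
        OfInt bigEndianInt = new OfInt(ByteOrder.BIG_ENDIAN);
        Segment segment = allocateArray(bigEndianInt, 10, 20, 30);
        // The same layout drives both allocation and access, so values round-trip.
        System.out.println(segment.getAtIndex(bigEndianInt, 1)); // prints 20
    }
}
```

Because the accessor's parameter type is the `int`-specific layout, the compiler selects the `int`-returning overload, and the byte order can no longer drift between the allocation site and the access site.
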
Specification
-------------
A specdiff of the changes as of November 11th, 2021 has been attached to this CSR (v3).
A link to the latest javadoc (as of November 11th, 2021) is included below:
http://cr.openjdk.java.net/~mcimadamore/JEP-419/v3/javadoc/jdk/incubator/foreign/package-summary.html
A link to the latest specdiff (as of November 11th, 2021) is included below:
http://cr.openjdk.java.net/~mcimadamore/JEP-419/v3/specdiff_out/overview-summary.html