JDK-8312523 : Implementation of Foreign Function & Memory API
  • Type: CSR
  • Component: core-libs
  • Sub-Component: java.lang.foreign
  • Priority: P3
  • Status: Closed
  • Resolution: Approved
  • Fix Versions: 22
  • Submitted: 2023-07-21
  • Updated: 2023-10-06
  • Resolved: 2023-10-06
Description
Summary
-------

This CSR covers the finalization of the Foreign Function & Memory (FFM) API, which first previewed in Java 19.


Problem
-------

Feedback and careful examination of the FFM API revealed the following issues:

 1. The FFM API offers ways to easily translate Java strings into native strings. However, the native strings produced are currently limited to the UTF-8 encoding.
 2. The MemoryLayout::sequenceLayout(MemoryLayout) factory method, which produces a sequence layout with the maximum possible element count (derived from Long.MAX_VALUE), is a pitfall. If a user simply forgets to specify the size of the sequence, issues can occur later down the line, such as a failure to allocate such a sequence, or a failure to link a function using a struct layout containing one of these sequences.
 3. While jextract can be used to automatically derive the layouts of native types, clients that don't use jextract are left figuring out what the layout of a native type is on their own.
 4. There are cases where the VarHandles and MethodHandles derived from memory layouts fall short, because a certain case cannot be fully represented using memory layouts, for instance because it involves an array whose size is not statically known.
 5. The MemorySegment::segmentOffset method is no longer needed. The same thing can be trivially done by taking the difference between the base addresses of the two segments.
 6. Calls to SegmentAllocator::allocateArray can be ambiguous. If this method is called with a single `long` argument `x`, which is intended to allocate an array containing the single `long` value `x`, instead clients will get an array of `x` uninitialized `long`s.
 7. The documentation of variadic functions mentions prototype-less functions, which are not variadic according to the C spec. The documentation is also not as clear as it could be.
 8. The Linker API is currently optional, which makes it impossible for the JDK to use it itself to implement access to native functions.
 9. The FFM API is currently in preview. We want to move it out of preview.
 10. The name of the isTrivial linker option is not descriptive enough of what the linker option does.
 11. It is currently possible to combine the isTrivial linker option with the captureCallState linker option, which constrains the implementation of future features.
 12. The Arena::allocate method is currently a default method. However, it is an important method that implementations of Arena should override.
 13. When using an unsupported access mode and accessing a mis-aligned address using a memory access var handle, we currently throw an IllegalArgumentException, instead of an UnsupportedOperationException.
 14. There are several SegmentAllocator::allocateFrom methods that allocate and initialize the allocated memory. For those methods, we can skip zeroing the allocated memory, since we know we'll be overwriting it right away anyway. However, the current API only allows initializing the allocated memory from a primitive value or a Java array.
 15. Executable jar files are allowed to specify Add-Opens and Add-Exports in their manifests, which function equivalently to the --add-opens and --add-exports command line flags of the java launcher. However, there is no equivalent manifest attribute for --enable-native-access.

Solution
-------

The proposed solutions to each of these issues, in order, are:

 1. Expand the set of supported encodings to all encodings found in the `java.nio.charset.StandardCharsets` class.
 2. Remove the MemoryLayout::sequenceLayout(MemoryLayout) factory method. Clients should instead use the MemoryLayout::sequenceLayout(long, MemoryLayout) factory method, and specify the element count that they want explicitly.
 3. A new API, Linker::canonicalLayouts(), is added. For native linkers, this API can be used to find the memory layouts of the most common primitive types of the C language.
 4. Change the var and method handles derived from memory layouts to accept an additional 'base offset' coordinate. When a handle is used, the value of this coordinate is added to the offset computed by the handle. This effectively allows these handles to be composed with other offset computation code, which can help address cases where a particular memory access cannot be fully represented using the memory layout and layout path APIs.
 5. Remove the MemorySegment::segmentOffset method.
 6. Rename the methods in SegmentAllocator that allocate and initialize a memory segment to 'allocateFrom'. This avoids the aforementioned ambiguity.
 7. Clarify the documentation of variadic functions, and drop the references to prototype-less functions.
 8. Make the linker a required API, by dropping the exception specification on the Linker::nativeLinker method stating that it throws an UnsupportedOperationException on unsupported platforms.
 9. Move the FFM API out of preview, by removing `@PreviewFeature` annotations, and updating `@since` tags in javadoc.
 10. Rename the Linker.Option.isTrivial API to 'critical'. The new name leans on the established meaning of 'critical' in the context of JNI. We have to keep the name somewhat vague to avoid making too many promises about what it does, as not every linker implementation is required to implement it.
 11. Disallow the isTrivial option (renamed to 'critical' per item 10) from being combined with the Linker.Option.captureCallState option. This keeps the door open for future enhancements.
 12. Make the Arena::allocate method abstract.
 13. Update the implementation to throw an UnsupportedOperationException for unsupported access modes, even in the case of a mis-aligned access.
 14. Add a new SegmentAllocator::allocateFrom overload that allows initializing the allocated memory from an arbitrary memory segment.
 15. Add an `Enable-Native-Access` jar manifest attribute that functions equivalently to the --enable-native-access command line flag of the java launcher.
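
As an illustration of items 1, 6 and 14, the following is a minimal sketch of how the renamed `allocateFrom` methods might be used, assuming the API as finalized in JDK 22. The class name `AllocateFromSketch` and its helper methods are hypothetical and exist only for this example.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, not part of the FFM API.
final class AllocateFromSketch {

    // Item 1: encode a Java string using any charset from StandardCharsets,
    // then decode it again from the native segment.
    static String roundTrip(String text, Charset charset) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment nativeString = arena.allocateFrom(text, charset);
            return nativeString.getString(0, charset);
        }
    }

    // Item 6: allocateFrom(layout, value) stores the single value it is given...
    static long sizeOfSingleLong(long value) {
        try (Arena arena = Arena.ofConfined()) {
            return arena.allocateFrom(ValueLayout.JAVA_LONG, value).byteSize();
        }
    }

    // ...whereas allocate(layout, count) reserves room for `count` elements.
    static long sizeOfLongArray(long count) {
        try (Arena arena = Arena.ofConfined()) {
            return arena.allocate(ValueLayout.JAVA_LONG, count).byteSize();
        }
    }

    // Item 14: initialize a new segment from a range of another segment.
    // Note that the source offset is expressed in bytes, not in elements.
    static int[] copyRange(int[] values, int fromIndex, int count) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment source = MemorySegment.ofArray(values);
            MemorySegment copy = arena.allocateFrom(
                    ValueLayout.JAVA_INT, source, ValueLayout.JAVA_INT,
                    fromIndex * ValueLayout.JAVA_INT.byteSize(), count);
            return copy.toArray(ValueLayout.JAVA_INT);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hi", StandardCharsets.UTF_16LE));
        System.out.println(sizeOfSingleLong(3L));
        System.out.println(sizeOfLongArray(3L));
        System.out.println(Arrays.toString(copyRange(new int[] {1, 2, 3, 4}, 1, 2)));
    }
}
```

With the old `allocateArray` name, the two `long` cases above were easy to confuse; after the rename, the method name itself says whether the argument is a value to store or an element count.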

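Items 3, 8 and 10 can be sketched together as follows. This is only an illustration under some assumptions: it uses the `Linker.Option.critical(boolean)` signature as shipped in JDK 22 (the CSR text itself only names the option), it assumes `strlen` is available through the default lookup of the platform, and the class `LinkerSketch` with its helpers is hypothetical.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

// Hypothetical helper class, not part of the FFM API.
final class LinkerSketch {

    // Item 3: look up the layout of a common C type by its name.
    static long canonicalSize(String cTypeName) {
        return Linker.nativeLinker().canonicalLayouts().get(cTypeName).byteSize();
    }

    // Items 8 and 10: nativeLinker() is always available, and a short-running
    // function that never calls back into Java can be linked as 'critical'.
    static long strlen(String text) {
        Linker linker = Linker.nativeLinker();
        // size_t differs between platforms, so ask the linker for its layout.
        MemoryLayout sizeT = linker.canonicalLayouts().get("size_t");
        MethodHandle handle = linker.downcallHandle(
                linker.defaultLookup().find("strlen").orElseThrow(),
                FunctionDescriptor.of(sizeT, ValueLayout.ADDRESS),
                Linker.Option.critical(false)); // false: no heap segment access
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cString = arena.allocateFrom(text); // UTF-8, NUL-terminated
            // invoke (not invokeExact) adapts an int or long result to long.
            return (long) handle.invoke(cString);
        } catch (Throwable t) {
            throw new IllegalStateException("strlen downcall failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(canonicalSize("int"));
        System.out.println(strlen("Hello"));
    }
}
```

Because `downcallHandle` is a restricted method, running this sketch may print a native-access warning unless native access is enabled for the calling module.
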
Specification
-------------

A specdiff of the changes is available below:

 * https://cr.openjdk.org/~jvernee/jep22_specdiff/v3/ (2023/08/11. Commit hash 141096b)
 * https://cr.openjdk.org/~jvernee/jep22_specdiff/v4_inc/ (2023/09/11. Commit hash 0e702f0. Incremental)
 * https://cr.openjdk.org/~jvernee/jep22_specdiff/v5_inc/ (2023/09/28. Commit hash 72650c4. Incremental)

Note that item #4 in particular has a widespread impact on the javadoc of the API.
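
To make the 'base offset' coordinate of item #4 concrete, here is a minimal sketch, assuming the JDK 22 form of `MemoryLayout::varHandle`. The `POINT` layout and the class `BaseOffsetSketch` are hypothetical. The handle is derived from a single struct layout, and the caller supplies the offset of each struct, so the number of structs does not need to be known statically.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;

// Hypothetical helper class, not part of the FFM API.
final class BaseOffsetSketch {

    // Hypothetical layout for: struct Point { int x; int y; };
    static final StructLayout POINT = MemoryLayout.structLayout(
            ValueLayout.JAVA_INT.withName("x"),
            ValueLayout.JAVA_INT.withName("y"));

    // Coordinates of this handle: (MemorySegment segment, long baseOffset).
    static final VarHandle Y = POINT.varHandle(PathElement.groupElement("y"));

    // Writes y = 10 * i into `count` consecutive points and sums them up again.
    static int sumOfY(int count) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment points = arena.allocate(POINT, count);
            for (int i = 0; i < count; i++) {
                // The base offset selects the i-th struct; the handle adds
                // the offset of field 'y' within that struct.
                Y.set(points, i * POINT.byteSize(), i * 10);
            }
            int sum = 0;
            for (int i = 0; i < count; i++) {
                sum += (int) Y.get(points, i * POINT.byteSize());
            }
            return sum;
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfY(4));
    }
}
```

Since every derived handle now carries this extra `long` coordinate, call sites that previously passed only a segment must pass an offset as well (often `0L`), which is why the change touches so much of the javadoc.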

Item #15 is not reflected in any javadoc changes, so I'll describe the feature here:

The value of the `Enable-Native-Access` manifest attribute is, at least for the time being, restricted to 'ALL-UNNAMED', indicating that native access is enabled for unnamed modules. This is similar to how the Add-Opens/Add-Exports attributes only grant access to unnamed modules. When a value other than 'ALL-UNNAMED' is specified, the JVM will abort during launch with an error message.
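
As a rough sketch of what such a manifest could look like, the following uses the standard `java.util.jar.Manifest` API to produce one. The class `NativeAccessManifestSketch` and the main class name `com.example.Main` are hypothetical placeholders.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper class, used only for this illustration.
final class NativeAccessManifestSketch {

    // Builds the text of a manifest for an executable jar whose code
    // (running in the unnamed module) should be granted native access.
    static String manifestText(String mainClass) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest.write only emits the main section if a version is present.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.MAIN_CLASS, mainClass);
        // 'ALL-UNNAMED' is the only value accepted for the time being.
        main.put(new Attributes.Name("Enable-Native-Access"), "ALL-UNNAMED");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            manifest.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "com.example.Main" is a placeholder main class name.
        System.out.print(manifestText("com.example.Main"));
    }
}
```

A jar launched with `java -jar` and carrying this attribute should then behave as if `--enable-native-access=ALL-UNNAMED` had been passed on the command line.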

References
-------------
 1. https://github.com/openjdk/panama-foreign/pull/836
 2. https://github.com/openjdk/panama-foreign/pull/838
 3. https://github.com/openjdk/panama-foreign/pull/839
 4. https://github.com/openjdk/panama-foreign/pull/840
 5. https://github.com/openjdk/panama-foreign/pull/841
 6. https://github.com/openjdk/panama-foreign/pull/845
 7. https://github.com/openjdk/panama-foreign/pull/846
 8. https://github.com/openjdk/panama-foreign/pull/850
 9. https://github.com/openjdk/panama-foreign/pull/853
 10. https://github.com/openjdk/panama-foreign/pull/859
 11. https://github.com/openjdk/panama-foreign/pull/856
 12. no patch in panama-foreign
 13. https://github.com/openjdk/panama-foreign/pull/876
 14. https://github.com/openjdk/panama-foreign/pull/878
 15. https://github.com/openjdk/panama-foreign/pull/843


Comments
Thanks for the update [~jvernee], moving to Approved.
06-10-2023

Maurizio has added himself as a reviewer, and I've attached a .tar file with the specdiff. I've also linked related implementation issues from the implementation ticket ([1]). Based on those things, I've gone ahead and finalized the request. [1]: https://bugs.openjdk.org/browse/JDK-8312522
05-10-2023

Moving back to Draft. [~jvernee], please have one or more engineers add themselves as reviewers of the CSR before Proposing or Finalizing it. For both archival and review purposes, some stand-alone representation of the API needs to be associated with the CSR, such as a tar/zip of the specdiff.
03-10-2023

I've updated the CSR for the final version of the JEP PR and finalized. A parallel effort has been started by Maurizio to add an `@Restricted` annotation, and to start leveraging it in javac: https://github.com/openjdk/jdk/pull/15947
28-09-2023

I've updated the CSR according to the latest state of the JEP PR. I've added two new items, 13 and 14, for last minute API additions that we've added to the JEP. I've also uploaded an incremental spec diff with all the new changes since the last review. (please let me know if you prefer a full diff).
11-09-2023

[~darcy] Thanks for the review. I think Maurizio has covered most of the reply I wanted to give already.

I looked at adding a link to the JNI spec about critical regions, but unfortunately there doesn't seem to be a top level description. The term is just briefly mentioned in the documentation of GetPrimitiveArrayCritical & ReleasePrimitiveArrayCritical: https://docs.oracle.com/en/java/javase/20/docs/specs/jni/functions.html#getprimitivearraycritical-releaseprimitivearraycritical

> After calling GetPrimitiveArrayCritical, the native code should not
> run for an extended period of time before it calls
> ReleasePrimitiveArrayCritical. We must treat the code inside this pair
> of functions as running in a "critical region." Inside a critical
> region, native code must not call other JNI functions, or any system
> call that may cause the current thread to block and wait for another
> Java thread. (For example, the current thread must not call read on a
> stream being written by another Java thread.)

But, note that most of this text does not apply to FFM since we can not interact with the Java world from native code, other than doing an upcall back into Java. i.e. there doesn't seem to be a well-established concept that we can lean on. The doc we currently have says this:

    * A critical function is a function that has an extremely short running time in all cases
    * (similar to calling an empty function), and does not call back into Java (e.g. using an upcall stub).

Does that seem good enough?

We wouldn't mind suggestions for better names. The tricky part is trying to avoid a name that promises too much about what this option does, since that is ultimately an implementation detail. The critical option is meant as a hint that an implementation _may_ do something with, but doesn't have to. What we've done now is essentially pick a term ("critical") that has _some_ intuitive meaning (due to JNI), but then we mostly specify the meaning we want the term to have.
06-09-2023

[~darcy] - yes, the word "critical" is meant in the JNI sense (in fact it could be thought of as a compatibility option). Mentioning the JNI spec is a very good idea!

I understand the suggestion - unfortunately we could not come up with an O(1) solution. That said, we have hope that Valhalla will provide a path to become O(1) in the future (e.g. ValueLayout<X>, or some subtype of ValueLayout that has a specializable type parameter). So that's what we've been trying to keep in mind as we designed the API.

I agree that having a better story for marking restricted methods would be better. Not sure if we can deliver it for 22. But we can try to raise the priority on that one. At the very least the javac and annotation work should not be very controversial (while I can see us spending a lot of time designing the "perfect" javadoc rendering).
31-08-2023

[~mcimadamore], if the word "critical" is being used in the JNI sense, I think it would be reasonable to include some kind of link/cross-reference into the JNI docs.

Conceptually, taken together ValueLayout.OfDouble, ValueLayout.OfInt, etc. are kind of like an enum: "here are N instances of a type that range over some fixed set." If they could be declared as an enum there would be just the enum class (and perhaps an interface implemented by the enum class, a pattern we use in several places in the JDK) rather than the sealed interface and N nested classes. I understand the sealed interface + nested types is used to get more precise typing information. The request/suggestion is to make sure there isn't an O(1) in the set of types solution here rather than an O(n) solution.

I think it would be preferable for auditability, etc. to have a technique for marking restricted methods settled on before this work is finalized.
30-08-2023

[~darcy], some replies to your questions:

 * additional C types: we initially included sized C types, but then backed off. Not because it can't be done (we can add as many as we want), but simply because the main use case we wanted to support was non-sized types, for which layouts are difficult to get right as they are platform specific.
 * yes, we changed from "trivial" to "critical" mostly to evoke the JNI equivalent. What we found is that not many folks understood what "trivial" was about. There's also the possibility we will add support for heap pinning as a boolean flag when constructing the "critical" option (which then will bring it fully in sync with its JNI equivalent). We're aware that "critical" means a different thing in different contexts - but we believe the description the JNI spec provides is quite accurate: "After calling GetPrimitiveArrayCritical, the native code should not run for an extended period of time before it calls ReleasePrimitiveArrayCritical. We must treat the code inside this pair of functions as running in a "critical region." Inside a critical region, native code must not call other JNI functions, or any system call that may cause the current thread to block and wait for another Java thread. (For example, the current thread must not call read on a stream being written by another Java thread.)"
 * What do you mean by "reducing footprint of layout constants"? If you mean remove them completely, I don't think that would be a good idea, since _every_ memory segment access needs a layout constant to work (so asking users to create the layout first won't work). Unaligned constants are also very useful, as in most cases we found in panama-dev, developers did not care about alignment (esp. when it comes to on-heap access) and didn't want to be bothered by exceptions. I've been thinking of moving constants like JAVA_INT into the Integer class (e.g. Integer.LAYOUT) - but while that works for aligned constants, we still need a place for unaligned constants. And for ADDRESS as well. If by footprint you mean having different subclasses (ValueLayout.OfInt vs. ValueLayout.OfLong) I think that is pretty much how the API goes about disambiguating memory access calls (see memory segment getters).
 * Re. restricted methods, yes, we do have plans to add an internal annotation, javac lint warnings and javadoc support (but not as part of this JEP/CSR/PR) - see: https://mail.openjdk.org/pipermail/jdk-dev/2023-August/008075.html.
30-08-2023

[~jvernee], thank you for submitting the CSR early in the release cycle. Moving to Provisional, not Approved. A few questions/comments:

Are there any additional types from C99 and later's `<stdint.h>` that should be included in linker support? A type like `int32_t`, a 32-bit integer, would seem useful for Java <-> C interop.

The CSR but not the javadoc discusses the origin of the name "critical" for Linker.Option.critical. I recommend another round of naming consideration, as "critical" as a word outside of JNI has meanings that are not necessarily helpful here.

Revisiting an old topic, was any consideration given to reducing the API footprint of the ValueLayout constants for primitive types?

Another revisit: are there plans for a program-discoverable marker of "restricted" methods, such as an annotation? https://bugs.openjdk.org/browse/JDK-8282192?focusedCommentId=14493196&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14493196
29-08-2023

Hey [~darcy]. No worries, I'm off for the coming two weeks as well.
18-08-2023

[~jvernee], with summer activities, I'll need some more time before taking a look at this request. Thank you for filing it early in the release cycle.
18-08-2023