JDK-6509032 : (bf) Monomorphic implementations for Direct and Heap versions of X-Buffer
  • Type: Enhancement
  • Component: core-libs
  • Sub-Component: java.nio
  • Affected Version: 6
  • Priority: P3
  • Status: Open
  • Resolution: Unresolved
  • OS: generic
  • CPU: generic
  • Submitted: 2007-01-03
  • Updated: 2023-12-02
Description
NIO is currently split into two implementations, one for Java-heap-resident storage and one for other storage (C-heap or static or mapped).

The two implementations are parallel in structure, and do essentially the same thing, but for two different kinds of references:
	type	ref	store
	Heap	a[n]	a[n] = x
	Direct	addr	unsafe.putInt(addr, x)

(There are other degrees of freedom also, according to element type and endian-ness.)

To handle bulk copies between the two kinds of buffers, there are JNI functions (native methods) which perform heterogeneous array copies.

It would probably be better to use a unified form of reference (for both Java- and C-heap storage).
This would let us have one set of methods for both Heap and Direct buffers.

Details:

The usual problem with mixing heap and non-heap pointers this way is that the GC can *move* a live object, which would leave the direct address dangling.
This can be avoided by the technique Unsafe uses for working generically with heap and non-heap pointers.
Each address is represented not by a byte[] and int offset, or a long address, but by an unsafe pairing of an Object reference and a long offset.

For Java-heap storage, the Object reference points (always) to the header of the heap object, and the long offset is a small number (e.g., 12, 28, etc.).

For C-heap storage (or static storage), the Object reference is (always) null, and the long offset is the absolute address of the storage.

This way, there is always a GC-able reference (or null), and an offset.  (Note that null + address must be the same as 0 + address in this scheme.)
Machine instructions which want to load or store the addressed element can blindly do a double-indexed load at *(oopref + offset), and it will always go to the right place.
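
As a rough illustration of the Object/offset scheme, the sketch below uses the (unsupported) sun.misc.Unsafe class to run one store routine against both an int[] on the Java heap and a block of C-heap memory. The class name UnifiedRefSketch and the helpers store, load, roundTripHeap and roundTripNative are hypothetical, chosen only for this example; obtaining Unsafe reflectively through its theUnsafe field is an assumption that holds on current HotSpot JDKs but is not a supported API, and newer JDKs may warn about or restrict these memory-access methods.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnifiedRefSketch {
    // Assumption: Unsafe is reachable via reflection (unsupported, JDK-internal).
    static final Unsafe U = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    // One store/load pair for both kinds of storage:
    // (base, offset) is either (heap object, small offset) or (null, absolute address).
    static void store(Object base, long offset, int x) {
        U.putInt(base, offset, x);
    }

    static int load(Object base, long offset) {
        return U.getInt(base, offset);
    }

    // Heap case: base is the array, offset is header size plus scaled index.
    static int roundTripHeap(int x) {
        int[] a = new int[4];
        long offset = U.arrayBaseOffset(int[].class)
                + 2L * U.arrayIndexScale(int[].class);
        store(a, offset, x);
        return a[2]; // read back through the ordinary array view
    }

    // Direct case: base is null, offset is the absolute native address.
    static int roundTripNative(int x) {
        long addr = U.allocateMemory(16);
        try {
            store(null, addr + 8, x);
            return load(null, addr + 8);
        } finally {
            U.freeMemory(addr);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripHeap(42));
        System.out.println(roundTripNative(42));
    }
}
```

The point of the sketch is that store and load never branch on the kind of storage; only the caller decides what the base and offset mean.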

Recode NIO buffers to use one concrete implementation with the unified Object/offset format.
Then common (unsafe) code can implement both Direct and Heap cases.
This should improve performance, since the compiler can inline monomorphic calls much more readily than the current bimorphic calls.

For bulk copies, an upgraded Unsafe.copyMemory operation is probably a good-enough way to bridge the gap between heap and non-heap.
It would use the above scheme of locating source and destination memory slices each by an Object/offset pair.
It can replace the private native methods in NIO, and (as a single well-known intrinsic) can be open-coded by the compiler.
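
A bulk copy along these lines might look like the sketch below, which uses the five-argument sun.misc.Unsafe.copyMemory to move data from an int[] into C-heap memory and back, naming each side by an Object/offset pair. The class CopyMemorySketch and its methods copy and heapToNativeAndBack are hypothetical names for this example only, and the reflective lookup of Unsafe is again an unsupported assumption rather than what NIO itself does.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class CopyMemorySketch {
    // Assumption: Unsafe is reachable via reflection (unsupported, JDK-internal).
    static final Unsafe U = loadUnsafe();
    static final long INT_BASE = U.arrayBaseOffset(int[].class);

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    // Source and destination are each an Object/offset pair;
    // a null base means the offset is an absolute native address.
    static void copy(Object srcBase, long srcOffset,
                     Object dstBase, long dstOffset, long bytes) {
        U.copyMemory(srcBase, srcOffset, dstBase, dstOffset, bytes);
    }

    // Copies a heap array to native memory and then into a fresh heap array.
    static int[] heapToNativeAndBack(int[] src) {
        long bytes = (long) src.length * Integer.BYTES;
        long addr = U.allocateMemory(Math.max(bytes, 1));
        try {
            copy(src, INT_BASE, null, addr, bytes);   // heap -> C heap
            int[] dst = new int[src.length];
            copy(null, addr, dst, INT_BASE, bytes);   // C heap -> heap
            return dst;
        } finally {
            U.freeMemory(addr);
        }
    }

    public static void main(String[] args) {
        int[] out = heapToNativeAndBack(new int[] {1, 2, 3});
        System.out.println(java.util.Arrays.toString(out));
    }
}
```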

One missing piece is byte-swapping logic, but we have a separate scalar byte-swap operation which might be acceptable.
(I'm open to suggestions about a byte-swapping, array-copying intrinsic, but I hope it's not necessary.)
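
For reference, the scalar fallback could be as simple as the sketch below, which copies an int[] while reversing the byte order of each element with Integer.reverseBytes. The class SwapCopySketch and the method copySwapped are hypothetical names; this only illustrates the per-element approach and makes no claim about how well a compiler would optimize the loop.

```java
final class SwapCopySketch {
    // Copies src into dst, byte-swapping each element with the scalar operation.
    static void copySwapped(int[] src, int[] dst) {
        if (dst.length < src.length) {
            throw new IllegalArgumentException("destination too small");
        }
        for (int i = 0; i < src.length; i++) {
            dst[i] = Integer.reverseBytes(src[i]);
        }
    }

    public static void main(String[] args) {
        int[] src = {0x11223344, 0x00000001};
        int[] dst = new int[src.length];
        copySwapped(src, dst);
        System.out.println(Integer.toHexString(dst[0]));
    }
}
```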

Comments
Atomic access to ByteBuffers may be performed using a VarHandle. The VarHandle implementation uses the double addressing mechanism, so the same code works for both on-heap and off-heap access. This requires some minor enhancements to the fields of Buffer/ByteBuffer so that the base and address can be used consistently (see JDK-8149469). It should be possible to leverage the same technique to reduce the number of internal buffer implementations. Careful analysis will be required to ensure that off-heap access performance does not regress: currently one field read is required, but a unified approach will require two field reads (the base, which will be null, and the address). Where possible, buffer fields should be marked final or @Stable.
03-03-2016
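
A minimal sketch of the VarHandle route mentioned in the comment above, assuming Java 9 or later: a single handle from MethodHandles.byteBufferViewVarHandle is applied unchanged to a heap buffer and a direct buffer. The class BufferVarHandleSketch and the method writeThenRead are hypothetical names for this example, and only the plain get and set access modes are used, since the atomic modes have additional alignment requirements.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class BufferVarHandleSketch {
    // Views the bytes of any ByteBuffer as big-endian ints, indexed by byte position.
    static final VarHandle INT_VIEW =
            MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.BIG_ENDIAN);

    // Same code path for heap and direct buffers.
    static int writeThenRead(ByteBuffer bb, int byteIndex, int x) {
        INT_VIEW.set(bb, byteIndex, x);
        return (int) INT_VIEW.get(bb, byteIndex);
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(16);
        ByteBuffer direct = ByteBuffer.allocateDirect(16);
        System.out.println(writeThenRead(heap, 4, 0x01020304));
        System.out.println(writeThenRead(direct, 4, 0x01020304));
    }
}
```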