JDK-4616656 : Need a more portable way to unsafely access fields in JDK 1.4
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 1.4.0
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2001-12-20
  • Updated: 2012-10-08
  • Resolved: 2002-05-02
Other
1.4.1 hopper: Fixed
Description

Name: pa48320			Date: 12/20/2001


[This problem should be assigned to John Rose ###@###.###,
 he knows about it]
There is currently no way to implement a JDK 1.4 IA-64 compatible VM
because of some issues (described below) with the Unsafe implementation.

The following is a copy of the email thread describing the problem and
the solution (the email thread was an exchange between John Rose from
Sun & Herbert Czymontek from Oracle):

> 
> John,
> 
> Good point. Well, having 'long' offsets instead of 'int' offsets and a null
> base address for static fields would indeed solve the problem. Of all the
> solutions mentioned so far this is definitely my favourite.
> 
> Sorry, I didn't reply to your earlier e-mails, but I returned from vacation
> just today.
> 
> > Moreover, with static fields, it usually is the case that there is some
> > internal VM data structure within 2**31 bytes of the static field.
> > In the case your engineer suggests ("array of the particular field
> > types"), the array itself could serve as a base address to the unsafe
> > reference.  Show me a present application with 2**20 static variables,
> > and then I will begin to worry that the 2**31 limit is an issue.
> 
> Consider the following case running on a 64-bit architecture:
> Static fields are divided into two groups, references and non-references.
> The references are kept in a Java array allocated on the Java object heap,
> the non-reference fields are kept in a dynamically allocated (malloc'ed)
> array. Let us also assume that we are dealing with a very big continuous Java
> heap (gigabyte range) which is mapped into high address space (0x100000000 and
> higher). Now if the malloc heap manager happens to allocate the non-reference
> field array at a low address (0x7FFFFFF or lower) then a 32-bit offset to
> access static fields becomes insufficient.
> 
> > Yes, the GC may need help with these base addresses, if they are not
> > already in the GC's heap.  (Many newer VMs tend to allocate internal
> > data structures on the same heap as Java objects, reducing
> > fragmentation and providing other advantages.)  If the GC does not
> > recognize the base address of the static-holding structures, perhaps
> > it can be made to do so.  I understand this is not always desirable.
> 
> Well, you've already said it...not desirable.
> 
> > (By the way, the documentation states that a value produced by
> > staticFieldBase should not be used in any way other than as an argument
> > to get and put routines.  This could be clearer, but since the value is
> > a reference which may go on the JVM stack, it can also be stored in Java
> > variable.  We will clarify this in a future rev. of the documentation.)
> 
> Excellent point (about the JVM stack) which is yet another argument together
> with the previous one against the current implementation because it will
> require some support from the GC to be tolerant when it encounters these
> 'cookie' objects.
> 
> > There are at least two intermediate options between the fall-back
> > position stated above and having the GC be totally aware of base
> > addresses of static-holding structures.  One is to reserve a range
> > of 32-bit int values not in use as object offsets, and use them
> > to encode static fields.  Suppose the VM never uses negative offsets.
> > Then fieldOffset of a static field would return a negative cookie,
> > and Unsafe.getInt could compile to the equivalent of:
> >
> > if (offset >= 0)
> >   return *(int*)( (char*)o + offset );
> > else
> >   return ((Class*)o)->getStaticInt(offset);
> >
> > ...where the getStaticInt function is an appropriate VM internal thing.
> > That is, getInt expands internally to a variant getStaticInt which only
> > works for static fields.
> >
> > A second option would be to make the VM's implementation of fieldOffset
> > return INVALID_FIELD_OFFSET for static fields, and require the Java code
> > using Unsafe to watch for such occurrences.  This is probably the
> > simplest fix, in conjunction with Ken's suggested change to the
> > reflection code.
> 
> You cannot assume that a VM never uses negative field offsets. As a matter
> of fact it is very desirable to use them because this allows for the
> co-location of reference fields in objects. The rule is, e.g., all reference
> fields have negative offsets, all non-reference fields have positive
> offsets. Example:
> 
> class foo
> {
>     int a;
>     Object b;
> };
> 
> class bar extends foo
> {
>     int c;
>     Object d;
> };
> 
> would result in the following object layout (offsets are just examples):
> 
> foo:        a    offset    8
>             b    offset   -8
> 
> bar:        a    offset    8
>             c    offset   12
>             b    offset   -8
>             d    offset  -16
> 
> This results in good locality for references as well as easy handling of
> references in the GC, etc.
> 
> Thanks,
> Herbert.
> 
> ----- Original Message -----
> From: "John Rose" <###@###.###>
> To: "Herbert Czymontek" <###@###.###>
> Cc: "Peter J. Allenbach (SUN)" <###@###.###>; "Kenneth
> Russell (SUN)" <###@###.###>; "John Rose (SUN)"
> <###@###.###>; "Michel Trudeau"
> <###@###.###>
> Sent: Tuesday, December 11, 2001 6:29 PM
> Subject: Re: Unsafe Implementation
> 
> >    This would be 100% portable and use already existing functions, e.g. something like
> >
> >    jlong JNICALL  Unsafe_StaticFieldAddr(JNIEnv *env, jclass clazz, jobject field);
> >
> > No, it would not port to VMs in which static variables change their
> > addresses from time to time (e.g., due to GC or object swapping).
> > Anything that moves must be accessed via a base pointer which moves
> > the same way.
> >
> > I'm curious what your responses are to the several options and
> > workarounds I sent in my previous message.  Just to keep things
> > interesting, let's consider changing most "int" values in the API to
> > "long".  This would directly remove the 64-bit problems you mentioned.
> > (We are already performing the necessary long-arithmetic optimizations,
> > because of the existing "long" address parameters.)  Allowing offsets
> > to be long would open the door to non-varying addresses of statics.
> >
> > In VMs with "malloced" statics, staticFieldBase would return null,
> > and fieldOffset would return a pointer value.  The addressing mode
> > [register + offset] would have the effect of [0 + address], so that
> > any fixed virtual address can be probed by supplying a null heap offset.
> >
> > In fact, this could make half of the get/set functions be redundant;
> > everything could be expressed as [base + offset] / [null + address].
> >
> > Comments?
> >
> > -- John
> >
(Review ID: 137588) 
======================================================================
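
The 64-bit scenario raised in the thread (a Java heap mapped at 0x100000000 or higher, with the non-reference statics malloc'ed at a low address) can be checked with plain arithmetic. The sketch below is only an illustration: the class name OffsetRange and the helper fitsInInt are made up for this note, and the two addresses are the example values from the email, not measurements from any VM.

```java
final class OffsetRange {
    // Example addresses from the email thread (assumed, not measured).
    static final long HEAP_BASE = 0x100000000L; // heap-resident base object
    static final long MALLOC_ADDR = 0x7FFFFFFL; // malloc'ed array of static fields

    /** Returns true if (target - base) can be stored in a signed 32-bit offset. */
    static boolean fitsInInt(long base, long target) {
        long delta = target - base;
        return delta >= Integer.MIN_VALUE && delta <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long delta = MALLOC_ADDR - HEAP_BASE;
        System.out.println("required offset: " + delta);
        System.out.println("fits in int:     " + fitsInInt(HEAP_BASE, MALLOC_ADDR));
        // A narrowing cast silently drops the high bits, so base + (int) delta
        // no longer reaches the intended address.
        System.out.println("after (int) cast: " + (int) delta);
    }
}
```

With a long offset and a null base, the same static could instead be addressed as [null + address], which is the approach the thread converges on.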

Comments
CONVERTED DATA
BugTraq+ Release Management Values
COMMIT TO FIX: hopper
FIXED IN: hopper
INTEGRATED IN: hopper
14-06-2004

EVALUATION

Root problem is 32-bit offset values in Unsafe API. It should be upgraded to work uniformly with 64-bit offsets. These offsets occur as arguments to all three-argument putXXX, all two-argument getXXX, and the values returned from fieldOffset and arrayBaseOffset. (Also, pageSize should be made long.)

Advantages:
- fieldOffset can be used uniformly to access static fields, on either heap. (staticFieldBase returns null if the static is at an absolute address.)
- Unsafe can access Java arrays up to 2**31-1 elements.
- Can create APIs which efficiently manipulate very large (>4Gb) Java objects.

Making such offsets long is no hardship to 32-bit systems, since the methods will just ignore the high 32 bits. Because of support for NIO, etc., the optimizer already knows how to strength-reduce 32-bit address calculations expressed as long arithmetic, so this would not be new functionality.

###@###.### 2002-01-11

Here are representatives of the new methods, with doc strings:

/**
 * Fetches a value from a given Java variable.
 * More specifically, fetches a field or array element within the given
 * object <code>o</code> at the given offset, or (if <code>o</code> is null)
 * from the memory address whose numerical value is the given offset.
 * <p>
 * The results are undefined unless one of the following cases is true:
 * <ul>
 * <li>The offset was obtained from {@link #objectFieldOffset} on
 * the {@link java/lang/reflect/Field} of some Java field and the object
 * referred to by <code>o</code> is of a class compatible with that
 * field's class.
 *
 * <li>The offset and object reference <code>o</code> (either null or
 * non-null) were both obtained via {@link #staticFieldOffset}
 * and {@link #staticFieldBase} (respectively) from the
 * reflective {@link Field} representation of some Java field.
 *
 * <li>The object referred to by <code>o</code> is an array, and the offset
 * is an integer of the form <code>B+N*S</code>, where <code>N</code> is
 * a valid index into the array, and <code>B</code> and <code>S</code> are
 * the values obtained by {@link arrayBaseOffset} and {@link arrayIndexScale}
 * (respectively) from the array's class.  The value referred to is the
 * <code>N</code><em>th</em> element of the array.
 *
 * </ul>
 * <p>
 * If one of the above cases is true, the call references a specific Java
 * variable (field or array element).  However, the results are undefined
 * if that variable is not in fact of the type returned by this method.
 * <p>
 * This method refers to a variable by means of two parameters, and so
 * it provides (in effect) a <em>double-register</em> addressing mode
 * for Java variables.  When the object reference is null, this method
 * uses its offset as an absolute address.  This is similar in operation
 * to methods such as {@link #getInt(long)}, which provide (in effect) a
 * <em>single-register</em> addressing mode for non-Java variables.
 * However, because Java variables may have a different layout in memory
 * from non-Java variables, programmers should not assume that these
 * two addressing modes are ever equivalent.  Also, programmers should
 * remember that offsets from the double-register addressing mode cannot
 * be portably confused with longs used in the single-register addressing mode.
 *
 * @param o Java heap object in which the variable resides, if any, else null
 * @param offset indication of where the variable resides in a Java heap object,
 *        if any, else a memory address locating the variable statically
 * @return the value fetched from the indicated Java variable
 * @exceptions No defined exceptions are thrown, not even {@link NullPointerException}.
 */
public native int getInt(Object o, long offset);

/**
 * Stores a value into a given Java variable.
 * <p>
 * The first two parameters are interpreted exactly as with
 * {@link #getInt(Object, long)} to refer to a specific
 * Java variable (field or array element).  The given value
 * is stored into that variable.
 * <p>
 * The variable must be of the same type as the method
 * parameter <code>x</code>.
 *
 * @param o Java heap object in which the variable resides, if any, else null
 * @param offset indication of where the variable resides in a Java heap object,
 *        if any, else a memory address locating the variable statically
 * @param x the value to store into the indicated Java variable
 * @exceptions No defined exceptions are thrown, not even {@link NullPointerException}.
 */
public native void putInt(Object o, long offset, int x);

/**
 * Report the location of a given static field, in conjunction with {@link #staticFieldBase}.
 * <p>Do not expect to perform any sort of arithmetic on this offset;
 * it is just a cookie which is passed to the unsafe heap memory accessors.
 *
 * <p>Any given field will always have the same offset, and no two distinct
 * fields of the same class will ever have the same offset.
 *
 * @see #getInt(Object, long)
 */
public native long objectFieldOffset(Field f);

/**
 * Report the location of a given field in the storage allocation of its
 * class.  Do not expect to perform any sort of arithmetic on this offset;
 * it is just a cookie which is passed to the unsafe heap memory accessors.
 *
 * <p>Any given field will always have the same offset, and no two distinct
 * fields of the same class will ever have the same offset.
 *
 * @see #getInt(Object, long)
 */
public native long staticFieldOffset(Field f);

/**
 * Report the location of a given static field, in conjunction with {@link #staticFieldOffset}.
 * <p>Fetch the base "Object", if any, with which static fields of the given
 * class can be accessed via methods like {@link #getInt(Object, long)}.
 * This value may be null.
 * This value may refer to an object which is a "cookie", not
 * guaranteed to be a real Object, and it should not be used in
 * any way except as argument to the get and put routines in this
 * class.
 */
public native Object staticFieldBase(Field f);

###@###.### 2002-04-22
22-04-2002
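
For readers who want to see the three documented cases side by side (instance field, static field via base plus offset, and array element at B+N*S), here is a minimal sketch against sun.misc.Unsafe. It rests on several assumptions: the Point class and the helper names are invented for illustration, the reflective lookup of the private "theUnsafe" field is the customary idiom on HotSpot-derived JDKs rather than a supported API, and newer JDK releases deprecate these methods and may warn or refuse at run time. Treat it as a sketch, not as the implementation delivered by this fix.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafeOffsets {
    /** Hypothetical class used only to have one instance and one static field. */
    static class Point {
        int x = 7;
        static int count = 3;
    }

    /** Obtains the Unsafe singleton reflectively (unsupported; assumes a "theUnsafe" field). */
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unsafe is not available", e);
        }
    }

    /** Case 1: offset from objectFieldOffset, base is the object itself. */
    static int readInstanceField(Unsafe u, Point p) {
        try {
            long offset = u.objectFieldOffset(Point.class.getDeclaredField("x"));
            return u.getInt(p, offset);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Case 2: base and offset both come from the Field; the base is an opaque cookie. */
    static int readStaticField(Unsafe u) {
        try {
            // Force class initialization so the static initializer has run.
            Class.forName(Point.class.getName(), true, Point.class.getClassLoader());
            Field f = Point.class.getDeclaredField("count");
            Object base = u.staticFieldBase(f); // may be null on some VMs
            long offset = u.staticFieldOffset(f);
            return u.getInt(base, offset);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Case 3: array element N lives at offset B + N*S. */
    static int readArrayElement(Unsafe u, int[] a, int n) {
        long b = u.arrayBaseOffset(int[].class);
        long s = u.arrayIndexScale(int[].class);
        return u.getInt(a, b + n * s);
    }

    public static void main(String[] args) {
        Unsafe u = unsafe();
        System.out.println(readInstanceField(u, new Point()));
        System.out.println(readStaticField(u));
        System.out.println(readArrayElement(u, new int[] {10, 20, 30}, 2));
    }
}
```

Note that the sketch never does arithmetic on the field offsets themselves, in line with the "cookie" wording in the doc strings; only the array case computes an offset, and only from the documented B and S values.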