Duplicate :
|
|
Duplicate :
|
|
Duplicate :
|
|
Duplicate :
|
|
Relates :
|
================================= Name: vi73552 Date: 02/19/99 A Proposal for User-Defined Lightweight Value-Semantics Objects Introduction: ------------- This is a proposal to allow a Java programmer to define new Lightweight Value-Semantics Objects [LVSOs] to the Java language. "Lightweight" means that manipulating the objects is approximately as efficient as manipulating primitive objects. "Value-Semantics" means that the "objects" have value semantics as opposed to the reference semantics of ordinary Java objects. "Objects" means that the declarations whose addition I propose be added adding to the language are similar to class declarations, and you get all of the class options that are reasonable given value semantics. Please note that I propose to allow the user to declare new classes of LVSOs, not to create the notion of LVSOs ab initio. Java already has LVSOs; int, boolean, etc. We should not debate whether we need LVSOs, only whether we should let the user add new LVSO classes. I will develop the proposal in three stages: the basic stage, in which you essentially get to define your primitive types the subtypable primitives stage, in which we explore the implications of inheritance for LVSOs [whose method calls would not be polymorphic, since there are no pointers in Java and no references to LVSOs]. the overloaded operators stage. I won't devote too much attention to this because the Java developers do not plan to introduce operator overloading in the near future. See the resolution of bug #4087427. Some of the ideas for type safety language features come from Ada. Motivation: ----------- Consider the Apples and Oranges classes of Appendix A. These classes have simple behaviors and few instance variables; the motivation for declaring the classes is type safety; the compiler won't let you compare Apples and Oranges. In this case the programmer's intent was to create a new primitive type with int's semantics but with type safety. Consider the Money class and its subclasses of Appendix B. In this case we have a class that implements a non-integral fixed point representation, and two subclasses that are, again, provided for type safety. Lastly, consider the Complex class of Appendix C. The Complex class is a canonical example of a type that is not a primitive type in many languages, but which you would like to be implemented efficiently. The Complex class differs from the other two in the fact that the instance variable set is more than a single instance of one primitive type. These classes could be treated similarly to complicated classes for which the overhead of object creation is amortized over a long object lifetime, but there are several reasons why the programmer might prefer a different treatment for these tiny objects: 1: The objects are small. They don't represent the fruits of much computation, making the high cost of object creation a problem. 2: You probably want value semantics rather than reference semantics. In particular, equals() is likely to do the right thing in more cases than == Because all objects are manipulated with references in Java, and storage for non-primitive objects are managed by the garbage collector and never stack allocated, we must allocate and free objects to get type safety or to get the effects of structures. You would like the compiler to prevent you from comparing Apples and Oranges, but it cost 980 units of time[1] to create an Apple or an Orange, where a simple assignment costs one time unit. In the real world people will be inclined to declare integer variables for discrete units such as Apples and Oranges, and float variables for physical units such as meters or seconds, and from time to time Java programs will have bugs that trace to type mistakes. In the case of complex numbers and other small aggregates of primitive types with simple methods, programmers will have three unappetizing choices: 1: Create the objects and pay the performance penalty 2: Create the objects and reuse them, managing your own storage and having the same dangling reference problem you get in languages like C++ [although there would be no memory leaks] 3: Maintain clusters of local variables; in this example you might have double ac_signal_real; double ac_signal_imag; where you would like to have written Complex ac_signal; this is what you would like the compiler to do for you. One problem with simply creating the objects is that we are then faced with a second choice. We can make the small objects immutable, so for example the sole Complex addition method looks like this Complex add(Complex addend) { return new Complex(real + addend.real, imag + addend.imag); } but efficiency considerations would impel us to at least provide the option of modifying an existing Complex and including the method Complex add_d(Complex addend) { real = real + addend.real; imag = imag + addend.imag; return this; } and the programmer must always keep in mind whether a Complex reference has more than one name at a given point in his program in order to be able to use add_d and avoid the cost of an allocation every time you add. The Basic Proposal; extensions of Primitive ------------------------------------------- When a programmer declares an int in Java [or, for that matter, in almost any other programming language], he allocates a cell able to contain a basic integer in whatever frame he's declaring in. For example, if he declares an int in a method he gets an int in the local variables area of the stack frame of every invocation of that method. If the programmer keeps modifying that int, the same cell is reused. On the other hand, when a programmer declares an object, including a boxed instance of a primitive type such as Integer, he allocates a cell able to hold a reference to an instance. The class designer has to decide whether to make the object updatable, which might reduce the creation of new objects but which will lead to unexpected behavior if there are multiple pointers to the same object, or immutable, in which case the new operator with its performance penalty needs to be performed repeatedly. The boxed primitive types in java.lang.* are immutable. I propose that the programmer be able to declare that a class extends java.lang.Primitive instead of the usual java.lang.Object or one of its subclasses. If he does this, then the declared memory contains the object itself instead of a reference to the object. If a class is derived from Primitive then there is no way to declare a reference to instances of that class. I further propose the following conventions: 1: You may use any visible constructor to create a value of a Primitive extension. You can't apply new to the class name, since Primitive extensions are not allocated on the heap. 2: If you declare a variable of a Primitive extension but provide no initial value the instance variables are built with the default constructor. There must be one, or an uninitialized declaration is forbidden. However, a consequence of rule 1 is that if you elect to initialize your variable you could write something like Complex unit_real_vector = Complex(1.0, 0.0); Of course the constructor must exist and must be visible per its access modifier. 3: The equals method inherited from Primitive recursively applies equal to corresponding instance variables. This can, of course, be overridden but there is no polymorphism because all value types can be determined at compile time. A == B is not allowed if A and B are different Primitive subtypes of if one operand is a Primitive subtype and the other isn't, and is determined by applying A.equals(B) if they are of the same Primitive subtype. 4: Primitive subtypes cannot be declared to implement an interface. A Primitive subtype cannot contain itself as an instance variable. There can be no sequence of Primitive subtypes S1, S2, ... Sn such that S1 contains an S2, S2 contains an S3, ..., S(n-1) contains an Sn and Sn contains an S1. proposed implementation ----------------------- Types come in three flavors: primitive, Primitive subtype, or object. Every Primitive subtype reduces to a sequence of types of the form <Array>^i<plain Java type> where i is a non-negative integer. A <plain Java type> is a primitive type or an Object. The reduction is as follows: 1: a non-array Object or a primitive type reduces to itself 2: a Primitive subtype reduces to the concatenation of the reductions of the types of its instance variables 3: If Type reduces to [ <array>^i1<t1>, <array>^i2<t2>, ... ] then Type[] reduces to [ <array>^i1+1<t1>, <array>^i2+1<t2>, ... ] This is guaranteed to terminate by noncircularity of the instance variable type definitions and the fact that there is a finite number of Primitive subtypes. In practice I do not expect the reduction of a primitive subtype to reduce to more than a handful of types. We call the result of this process the "decomposition" of the Primitive subtype. If a Primitive subtype P decomposes to a single type T, then the compiler shall generate code that allocates a cell that can hold a value of type T for each variable of type P. The compiler shall arrange for the methods to be called on implicit arguments of type T. Byte code for static methods is the way to accomplish this since there is no this for an instance method call. Therefore, the "methods" of a Primitive subtype compile as static methods. If a Primitive subtype P decomposes to a sequence of types T1, T2, ... Tn, then the compiler shall generate code that allocates n cells, one each to hold a value of each of the Ti types, for each variable of type P. The compiler shall arrange for the methods to be called on a sequence of implicit arguments of types Ti. For each of the explicit arguments of a Primitive subtype several formal parameters, one for each of that subtype's Ti's, shall be passed instead. For example, double magnitude(Complex C){ .. real .. imag .. } compiles into byte code appropriate for static double magnitude(double real, double imag){ .. real .. imag .. } Do note that an array of Complex decomposes to two arrays, each of double. Because the byte code fundamentally can't return multiple values [as in the LISP style], return values of Primitive subtypes which decompose to multiple types must be returned in some other manner. I propose hidden formal parameters to methods returning such a value. The compiler shall generate one mutable boxed primitive type instance for each primitive value returned as a Primitive subtype, except for the first type T1 which can be returned as the return value. [Owen, help me word this?] Although these boxed types must be allocated, they only need to be allocated once per calling method invocation at worst, and in some cases a method would be able to pass some of the hidden boxed primitive types to its callees and therefore avoid extra allocations. Another possibility is to have instance variables on Thread that maintain per-Thread lists of mutable boxed primitive type instances for this purpose. So Complex add(Complex addend){ ... } compiles as if it were static double add(boxed_double imag_ret, double real, double imag, double addend_real, double addend_imag) { ... } There is one other small detail. The == operator must generate a call to equals() when the operands are compatible LVSOs. Extending Primitive subtypes: ----------------------------- I do not propose that type information be kept with LVSOs. You may therefore not assign a DollarsAndCents to a Money. However, inheritance is useful even without polymorphism. I propose the Ada convention that an operation on a supertype be extended to the subtype by replacing the supertype by the subtype everywhere in the signature, including the result type. An operation will not be extended if any part of the signature is already of the subtype, or if the result is of the supertype and the subtype adds instance variables. Overloaded Operators: --------------------- It doesn't look like operator overloading will be added to Java any time soon. However, if it ever is added, say with C++ syntax, you might want to be able to engage an operation on its right-hand operand. The C++ convention of declaring a variant of a method applicable to a different syntax by adding a bogus int formal parameter may serve us well here. For example, scalar multiplication of a Complex might look like this: Complex operation*(double scalar) { return Complex(real * scalar, imag * scalar); } Complex operation*(double scalar, int i) { return Complex(real * scalar, imag * scalar); } ... Complex C(1.0, 2.0) ... C * 3.0 // engages Complex operation*(double scalar) ... 3.0 * C // engages Complex operation*(double scalar, int i) Appendix A public final class Apples { private int number; public Apples(int i) { number = i; } public Apples plus(Apples addend) { return Apples(number + addend.number); } // you get this one for free; it's placed here for emphasis public boolean equals(Apples comparand) { return number == comparand.number; } ... } public final class Oranges { private int number; public Oranges(int i) { number = i; } public Oranges plus(Oranges addend) { return Oranges(number + addend.number); } // you get this one for free; it's placed here for emphasis public boolean equals(Oranges comparand) { return number == comparand.number; } ... } Appendix B public class Money { private long cents; public Money(double amount) { cents = amount * 100 + 0.1; } private Money(long amount) { cents = amount; } public Money add(Money addend) { return Money(cents + addend.cents); } public boolean equals(Money comparand) { return cents = addend.cents; } ... } public final class DollarsAndCents extends Money ... public final class Euros extends Money ... Appendix C public final class Complex { private double real; private double imag; public Complex(double _real, double _imag) { real = _real; imag = _imag; } public Complex add(Complex addend) { return Complex(real + addend.real, imag + addend.imag); } public boolean equals(Complex comparand) { return real == comparand.real && imag = comparand.imag; } ... } notes [1] Bruce Eckels, "Thinking in Java", Prentice Hall, 1998, Pp. 1060-1061 These timings were performed on Sun's JDK1.1.4 on a Pentium Pro 200MHz. (Review ID: 48962) =====================================
|