Bug ID: JDK-4222792 Please allow Multiple Return Values

Type: Enhancement
Component: specification
Sub-Component: language
Affected Version: 1.2.0,1.3.0,5.0,6

Priority: P5
Status: Closed
Resolution: Won't Fix
OS: generic,windows_xp
CPU: generic,x86

Submitted: 1999-03-22
Updated: 2005-05-10
Resolved: 2005-05-06

Related Reports

Duplicate :	JDK-4639379 - support multiple return types in the Java language
Relates :	JDK-6573237 - Functions that return multiple values

Description


Name: dbT83986			Date: 03/22/99


Java's second biggest problem, in my opinion, is the lack of support for multiple return values.  I've personally written over 30,000 lines of
Java code so far and have been heading up a large Java development project, so I do have some experience with Java programming.

This lack of multiple return values has been a major pain.  Whenever I want to return more than one result from a method, I am forced to allocate
an object.

The simple
  (mv-let (a b) (foo obj 17)
     ... )

becomes

  class TwoIntValues { int firstValue; int secondValue; }
  ...
  TwoIntValues ab = new TwoIntValues();
  obj.foo(17,ab);
  a = ab.firstValue;
  b = ab.secondValue;
   
or, if you don't want terrible memory waste,

  static TwoIntValues myTwoIntValuesBuffer = new TwoIntValues();
  ...
  synchronized(myTwoIntValuesBuffer)
  {
    obj.foo(17,myTwoIntValuesBuffer);
    a = myTwoIntValuesBuffer.firstValue;
    b = myTwoIntValuesBuffer.secondValue;
  }

This leads to code bloat, and it becomes quite tricky to declare your buffer objects at appropriate scoping levels and such.  Obviously you would like to eliminate the synchronized() above because of the delay it introduces.  But any attempt to do so usually results in all sorts of crazy invariants about who owns which buffers, and what to do with them.

Perhaps worse, you end up defining meaningless classes like the above TwoIntValues class.  I can understand geometric methods returning a Point, but why should a specialized divide operation have to return a QuotientAndRemainderIntInt?  And should callers have to cache QuotientAndRemainderIntInt buffers here and there for fast access?  This seems like a bad plan, seeing as all current processors have enough registers to return multiple values.


I can understand why backwards languages like C++ haven't implemented multiple value returns:  You can simulate it by passing a pointer to each return value.  However, this leads to substantial inefficiency.

Compare optimized implementations of these two calls:
  1. obj.foo(17,&a,&b)    // No multiple value return
  2. (a,b) = obj.foo(17)  // Multiple value return

each followed by "return a+b":

1. "obj.foo(17,&a,&b); return a+b;"

    sub sp,8
    mov r1,17
    lea r2,sp[0]
    lea r3,sp[4]
    call class$foo
    mov r1,sp[0]
    mov r2,sp[4]
    add r1,r2
    add sp,8
    ret
  class$foo:
    ... compute r1 & r2 ...
    mov [r4],r1
    mov [r5],r2
    ret

2. "(a,b)=foo(17); return a+b;"
    mov r1,17
    call class$foo
    add r1,r2
    ret
  class$foo:
    ... compute r1 & r2 ...
    ret

Not only is the second expression more readable, but it also allows the generation of much better code!

In C++ we can do ugly Java-like tricks and come up with the more efficient code:

   struct TwoIntValues { int firstValue; int secondValue; };
   ...
   TwoIntValues ab = foo(17);
   a = ab.firstValue;
   b = ab.secondValue;

But this is syntactically ugly and requires the definition of struct TwoIntValues to be in some public place.  How much nicer if we just had an extended Java like this:

  void easy()
  {
    int a;
    int b;
    (a,b)=foo(17);  // Call
    ...
  }
  (int,int) foo(int arg)  // Declaration
  {
    int x=...;
    int y=...;
    return (x,y);   // Return value
  }

This extension seems nice and clean to me and it integrates easily into the existing language.  The only drawback I can see here is that C++ programmers may think "return (x,y);" returns the value of "y".

An alternative syntax would be:

  ...
  Values(a,b)=foo(17);
  ...
  Values(int,int) foo(int arg)
  {
    ...
    return Values(x,y);
  }

In this circumstance I think it would be smart to press an unused Unicode character into service as a synonym for the "Values" operator, since it really shouldn't look like a class name.

I've seen the complicated proposals here to declare some objects to be a "primitive" type so that they can be returned in registers, but these proposals introduce new, complex syntaxes and introduce a fundamental change to the Java object model that I consider inappropriate.  They are also unsuitable for the task.  They also fail to simplify the syntax or remove the need to declare "QuotientAndRemainderIntInt"-type classes.

One other bit of syntactic convenience I recommend is to allow declarations inside the "Values" clause:

   void easy()
   {
     Values(int a, int b) = foo(17);
     ...
   }

Conclusion:

Multiple return values are quite useful in any language, and with Java's object model they are essential to generating efficient and readable code.  No other construct I know of can take their place.

They would be easy to implement.  Even without suitable VM changes, the extra values could be stored at fixed TLS offsets.  And future versions of the VM and Java native compilers could take advantage of register return values (either by recognizing the use of the "special" TLS slots in the byte code, or by having the compiler generate code based on the 1.3 VM spec, which has this ability built in).
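
As a rough source-level picture of the TLS idea, and only a sketch under stated assumptions: the callee returns its first value normally and parks the second in a per-thread slot, which the caller reads right after the call. Here java.lang.ThreadLocal stands in for the "fixed TLS offset"; the names TlsReturnSketch, EXTRA, and easy, and the arithmetic inside foo, are hypothetical. A real compiler or VM would presumably do this at a lower level.

```java
import java.util.function.Supplier;

final class TlsReturnSketch {
    // One int slot per thread; stands in for a "fixed TLS offset" (assumption).
    private static final Supplier<int[]> NEW_SLOT = () -> new int[1];
    private static final ThreadLocal<int[]> EXTRA = ThreadLocal.withInitial(NEW_SLOT);

    // Conceptually "(int,int) foo(int arg)": the first value is returned
    // normally, the second is written to the per-thread slot.
    static int foo(int arg) {
        EXTRA.get()[0] = arg * 2;   // second result (arbitrary illustrative value)
        return arg + 1;             // first result
    }

    // Conceptually "(a,b) = foo(17); return a+b;"
    static int easy() {
        int a = foo(17);
        int b = EXTRA.get()[0];     // read before any other call can reuse the slot
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(easy()); // prints 52 (18 + 34)
    }
}
```

Because the slot is per-thread, no synchronization is needed, but the caller has to read it before making any other call that might overwrite it.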

Think of it this way:  Why should a method be allowed to receive 3 arguments but not allowed to return 3 results?  To me, this seems to be a fundamental inconsistency in the language.
(Review ID: 55623)
======================================================================

Comments

EVALUATION

I believe this is best dealt with by good implementations. If an object is really being used in a stack discipline, it may be possible to recognize this and allocate it on the stack. I found the example rather spurious. Typically, one can use arrays or generic classes like pairs for this sort of thing, and not worry about code bloat and the like. So the main issue is performance. I don't expect us to change Java to support multiple return values.
gilad.bracha@eng 1999-04-09

Adding multiple return values to Java would most likely require extensive incompatible changes to the JVM. Before undertaking such an endeavor, the benefits should be compared to the costs. This feature doesn't seem to be worth the additional complexity. Compare this feature to the features added in JDK 5: generic types, enums, annotations, etc. All of these features are used extensively throughout the entire JDK, whereas this feature would most likely rarely, if ever, be used in the JDK since there are better ways to accomplish the same thing: using real classes to represent (and encapsulate) results of computations. Java is a statically checked language and verbose declarations are preferred over implicit declarations. This is good for at least two reasons: verbosity makes the language more accessible for new developers and hopefully makes programs easier to maintain in the long run. The idea is that you only type it once, but will have to read it multiple times. In this sense, Point is preferred over (int,int).
###@###.### 2005-05-06 09:40:08 GMT

Some clarification in response to recent SDN comments: This feature is interpreted as a request for multiple return values for performance reasons (claiming to generate more efficient code) or laziness (avoid declaring classes). This request is not interpreted as a request for "stack allocated objects", tuple types, record types, or "anonymous classes". All such proposals are more general than multiple return values. Some similar proposals already exist and we always welcome new proposals. Changing the JVM to support multiple return values on the stack (the JVM is a stack machine, without registers) would most likely require incompatible changes to calling conventions and require complex changes to the byte code verifier which could have security implications. As a matter of programming style, this idea is not appealing in an object-oriented programming language. Returning objects to represent computation results is *the* idiom for returning multiple values. Some suggest that you should not have to declare classes for unrelated values, but neither should unrelated values be returned from a single method.
###@###.### 2005-05-08 02:52:55 GMT

08-05-2005
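
The "arrays or generic classes like pairs" approach mentioned in the evaluation might look roughly like the following Java sketch. The names Pair, DivisionDemo, divide, and divideToArray are hypothetical and chosen for illustration; nothing here is taken from the JDK or from the evaluators' own code.

```java
// Hypothetical helper: a minimal immutable generic pair.
final class Pair<A, B> {
    final A first;
    final B second;

    Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }
}

final class DivisionDemo {
    // Returns quotient and remainder in a freshly allocated Pair.
    static Pair<Integer, Integer> divide(int dividend, int divisor) {
        return new Pair<>(dividend / divisor, dividend % divisor);
    }

    // Same result in a two-element array:
    // index 0 is the quotient, index 1 is the remainder.
    static int[] divideToArray(int dividend, int divisor) {
        return new int[] { dividend / divisor, dividend % divisor };
    }

    public static void main(String[] args) {
        Pair<Integer, Integer> qr = divide(17, 5);
        System.out.println(qr.first + " remainder " + qr.second); // 3 remainder 2

        int[] arr = divideToArray(17, 5);
        System.out.println(arr[0] + " remainder " + arr[1]);      // 3 remainder 2
    }
}
```

Both variants still allocate an object on every call unless the implementation can eliminate the allocation, which is the cost the submitter objects to. The Pair version also boxes its int values.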

WORK AROUND

Name: dbT83986 Date: 03/22/99

These can get the job done, but they're a pain:

1. Construct a new "return buffer" object in every caller of the method. (Very inefficient)
2. Construct a new "return buffer" object in the callee. (Even more inefficient, since every call will create a new object)
3. Resort to complex "return buffer" caching. (Results in code bloat and inefficiency because each thread must acquire an appropriate "return buffer" from some common place.)
4. Return data in global variables. (Access to global variables must be synchronized, terrible interface)
5. Caller must call the callee twice, with a flag set indicating which result should be returned. (Inefficient because the method executes twice)

For the most common workarounds, if a mistake is made and access to the buffer is not properly synchronized, one thread may overwrite another thread's data before it is picked up, resulting in nearly undetectable (but often serious) data errors.
======================================================================

30-09-2004