United StatesChange Country, Oracle Worldwide Web Sites Communities I am a... I want to...
JDK-4474171 : RFE: LTP: Persistence Delegation: design flaws, side effects, perhaps specification flaw

Details
Type:
Enhancement
Submit Date:
2001-06-25
Status:
Open
Updated Date:
2007-10-23
Project Name:
JDK
Resolved Date:
Component:
client-libs
OS:
generic
Sub-Component:
java.beans
CPU:
generic
Priority:
P4
Resolution:
Unresolved
Affected Versions:
1.4.0
Targeted Versions:

Related Reports

Sub Tasks

Description
Name: krC82822			Date: 06/25/2001


java version "1.4.0-beta"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-beta-b65)
Java HotSpot(TM) Client VM (build 1.4.0-beta-b65, mixed mode)

The design of the classes related to the new persistence delegation
seems counterproductive and does not match specification at
some points. Some methods have undocumented side effects.

The Encoder class is intended to be a base class for different
Encoder implementations but is mostly designed as if it should never
be subclassed and implemented as if XMLEncoder was the only possible
subclass and if it were a singleton design.

Most important points are:

- Encoder should be abstract
- Encoder should provide abstract methods for the implementors
  e.g., a method which is called after writeStatement has
  processed it and has created the nessessary expressions
- The logic of maintaining a statement list and value to expression
  mapping as well as looking up which expressions are really needed
  by the written statements should be provided by the Encoder not
  implemented in XMLEncoder and re-implemented by all other Encoder
  implementations
  90% of the XMLEncoder code is maintaining these informations instead
  of dealing with XML encoding what the class actually is for
- getPersistenceDelegate and setPersistenceDelegate are instance
  methods but have static effect. These should be either static methods
  or have an effect per instance
- getValue(Expression) should be protected not package-private
- clear() should be protected not package-private
- XMLEncoder should not call any package-private methods.
  Then it would better match it's role of a reference
  implementation of an Encoder and the design flaws would
  be detected earlier
- The constructor of XMLEncoder has global side-effects.
  BTW the called static method setCaching(boolean) is replacing the cache
  by a new HashMap every time it is set to true, even if it was true before
- The flush() method of XMLEncoder has global side-effects
- The toString() method of Statement has global side-effects
  The name assigned to the target of the statement is stored in a global
  map. This map is cleared in the flush() method of XMLEncoder. This means:
  - calling flush() on a XMLEncoder causes other Encoders to output
    inconsistent names if Statement.toString() is used.
  - if other Encoder implementations are using the toString() method of
    Statement the target objects are never garbage-collected
- There might be more side-effects

It seems as if the whole code should be better completly rewritten.

The following code can be used to indicate some of the above points.


import java.lang.reflect.*;
import java.beans.*;
import java.io.ByteArrayOutputStream;
import java.util.HashMap;

class TestCase {
  static void main(String[] arg) {
    try {
      System.out.print("Is getPersistenceDelegate DECLARED static ? ");
      Class[] argType=new Class[] {Class.class};
      Method m=Encoder.class.getMethod("getPersistenceDelegate",argType);
      System.out.println(Modifier.isStatic(m.getModifiers()));

      System.out.print("Is the EFFECT of set/getPersistenceDelegate static ? ");
      Encoder e1=new XMLEncoder(new ByteArrayOutputStream()),
              e2=new XMLEncoder(new ByteArrayOutputStream());
      PersistenceDelegate pd=new DefaultPersistenceDelegate();
      e1.setPersistenceDelegate(TestCase.class, pd);
      System.out.println(e2.getPersistenceDelegate(TestCase.class)==pd);

      System.out.println("Writing an object to a debug Encoder");
      e1=new Encoder() {
        public void writeExpression(Expression ex) {
          super.writeExpression(ex);
          System.out.println("  Expression: "+ex);
        }
      };
      Object testObj=new java.awt.Point(4,2);
      e1.getPersistenceDelegate(testObj.getClass()).writeObject(testObj,e1);

      System.out.print("now, are there DANGLING REFERENCES avoiding gc ? ");
      Class nameCl=Class.forName("java.beans.NameGenerator");
      Field nameFd=nameCl.getDeclaredField("valueToName");
      nameFd.setAccessible(true);
      HashMap names=(HashMap)nameFd.get(null);
      System.out.println(!names.isEmpty());
      System.out.println("  "+names.size()+" objects");

      if(!names.isEmpty()) {
        System.out.print("Does XMLEncoder.flush() have a GLOBAL SIDE-EFFECT"
          +" to these references ? ");
        ((XMLEncoder)e2).flush();
        HashMap names2=(HashMap)nameFd.get(null);
        System.out.println(!names2.equals(names));
        System.out.println("  "+(names2.isEmpty()?"now":"still")
          +" contains "+names2.size()+" objects");
      }
    }
    catch(Exception ex) {
      System.out.println("\nException occured: "+ex);
      System.out.println("Note that this TestCase is specific to version "
        +"1.4.0-beta-b65");
    }
  }
}
(Review ID: 127273) 
======================================================================

                                    

Comments
EVALUATION

Response from Philip Milne follows:

> The Encoder class is intended to be a base class for different
> Encoder implementations but is mostly designed as if it should never
> be subclassed and implemented as if XMLEncoder was the only possible
> subclass and if it were a singleton design.
>

This is a little harsh, the Encoder can in fact be trivially subclassed
to produce a textual (Java-like) output. Granted however, some package
private methods make it very difficult to override some of its features.
Where the problems are a question of exposing methods that are currently
hidden this will probably happen, though it is too late for changes
that are not backward compatible. Its a shame we didn't have you on the
expert group when you could have raised these issues while the design was
taking place last year.

Anyway, to answer your points:

> Most important points are:
>
> - Encoder should be abstract

I'm actually not sure this is better, but its too late for
this anyway as this would break people.

> - Encoder should provide abstract methods for the implementors
>   e.g., a method which is called after writeStatement has
>   processed it and has created the nessessary expressions

As above.

> - The logic of maintaining a statement list and value to expression
>   mapping as well as looking up which expressions are really needed
>   by the written statements should be provided by the Encoder not
>   implemented in XMLEncoder and re-implemented by all other Encoder
>   implementations
>   90% of the XMLEncoder code is maintaining these informations instead
>   of dealing with XML encoding what the class actually is for

Well, although I wouldn't claim the design is perfect, there is
some method to this madness. The Encoder does actually have all the
state it needs to emit a stream of expressions and statements which
will produce the object graph. The XMLEncoder is performing an inherently
more complicated task. It is the structure of the XML and the fact that
expressions are nested in lexical blocks with all operations on an
instance done in one place that requires the extra data structures.
In particular, deciding whether am object needs an explicit ID is
quite difficult to calculate from the graph and requires access to
the all the state which is found in the XMLEncoder.

> - getPersistenceDelegate and setPersistenceDelegate are instance
>   methods but have static effect. These should be either
> static methods
>   or have an effect per instance

Yes, this has been reported as a problem a few times already.
There is a fundamental mismatch here, between the idioms which
the Encoder/Decoder take from the IO package and from the rest
of the beans specification. Basically, a the IO package introduces
the idea of streams which have state - and so we attached the
delegates that we use for controlling persistence to the stream so
that we would be able to support different configurations of encoders
at the same time at some point in the future. Had we made them static
this would never have been possible.

The beans spec and, in particular, the Introspector attach BeanInfo
to classes and this is an inherently static idea. After worrying about
this for some time I made the methods instance methods because
that left us greatest room for expansion in the future and allowed
them to be subclassed. As you have rightly pointed out they currently
affect static data structures in the MetaData class and this is
very confusing. The structures could be made instance specific though
there is always going to be a mismatch between this and BeanInfo
which is, and always will be, a static concept.

> - getValue(Expression) should be protected not package-private
> - clear() should be protected not package-private

Yes this is a good point though once methods are exposed they
can never be deleted; even deprecating methods is pretty difficult
to do in organisations which create public libraries. Given this,
I erred on the side of caution with the API and kept everything hidden
that could be hidden on the grounds that we could expose things if
people asked us to and not the other way around. You're right that
these methods should be made protected at some point at that was
always the intention.

> - XMLEncoder should not call any package-private methods.
>   Then it would better match it's role of a reference
>   implementation of an Encoder and the design flaws would
>   be detected earlier

The XMLEncoder is not really intended to be a reference implementation
for the Encoder. The real purpose of the Encoder is to provide an
API which the persistence delegates can call - thereby isolating
the persistence delegates from the encoding. That it can be subclassed
to make other encoders is (or would be with the changes you suggest)
a useful thing but it is not the main reason for the class. FYI: There
were thoughts of making this class an interface for exactly the reason
above but interfaces cannot be extended over time - and that is the main
reason it is a class. Its API is important, not its implementation here.
That said, it does actually work if you instantiate it and implement the
write() methods using toString(). There's actually very little
(a couple of pages I think) in the Encoder implementation and so a
start-from-scratch implementation might be your best bet in the short
term.

> - The constructor of XMLEncoder has global side-effects.

Yes, though, once again, writing BeanInfo for a class has a global
effect.

>   BTW the called static method setCaching(boolean) is replacing the
> cache
>   by a new HashMap every time it is set to true, even if it was true
> before

Yes, though this happens rarely enough that it did not show up
at all in our performance tests.

> - The flush() method of XMLEncoder has global side-effects

True, in the NameGenerator. This should be fixed though it only
affects the toString() method which is there for debugging
purposes.

> - The toString() method of Statement has global side-effects

As above.

>   The name assigned to the target of the statement is stored
> in a global
>   map. This map is cleared in the flush() method of XMLEncoder. This
> means:
>   - calling flush() on a XMLEncoder causes other Encoders to output
>     inconsistent names if Statement.toString() is used.

Good point. I must admit I hadn't paid enough attention to the scenario
in which multiple streams are being used to write the same object graph
to different sinks. I think all of this can be fixed by making the
NameGenerator and co. instance specific thispart of the implementation
is all private.

>   - if other Encoder implementations are using the toString()
> method of
>     Statement the target objects are never garbage-collected

You mean because the flush method is package private? Yes, this
is also true.

> - There might be more side-effects
>
> It seems as if the whole code should be better completly rewritten.
>

As I say there is very little to the Encoder, most of the work is
in the XMLEncoder and that is there to support a format that you
obviously don't need.  I would just do what you say, subclass the
Encoder override all the public methods and replace the two pages
or so of code there with your own implementation.

Its early days for this API and you are the first person I know
of to be trying to produce an entirely new format. If you'd
like to work with Mark to help get all of the features you need
exposed, so that others people would be able to use the API to
produce other formats more easily, then let him know. I expect, if
he has time, he'll try to get them in for FCS.

mark.davidson@Eng 2001-07-25

Many of the issues raised about multi threaded access to the static caches maintained in NameGenerator and Statement has been resolved as part of the fix for 4880633. 

###@###.### 2003-09-22
                                     
2003-09-22
WORK AROUND



Name: krC82822			Date: 06/25/2001


Do not use multiple XMLEncoder instances at the same time.
Do not create direct instances of Encoder as it makes no sense anyway.
Avoid using get/setPersistenceDelegate until their specification and
 implementation match.
If you are implementing an Encoder I suggest using an intermediate class
 in the inheritance hierarchy between Encoder and your implementation
 which implements the features Encoder is missing and that are not really
 specific to your implementation.
 This way you can easily adapt your implementation to a future version of
 Encoder which has the missing features.
Do not use Statement.toString() for other than debugging purposes. If
 you are using custom Encoder implementations and want to make sure
 that object references are cleared afterwards, create a XMLEncoder
 instance and call it's flush() method.
======================================================================
                                     
2004-06-11



Hardware and Software, Engineered to Work Together