JDK-4474171 : RFE: LTP: Persistence Delegation: design flaws, side effects, perhaps specification flaw
  • Type: Enhancement
  • Component: client-libs
  • Sub-Component: java.beans
  • Affected Version: 1.4.0
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: generic
  • CPU: generic
  • Submitted: 2001-06-25
  • Updated: 2007-10-23
Description
Name: krC82822			Date: 06/25/2001


java version "1.4.0-beta"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-beta-b65)
Java HotSpot(TM) Client VM (build 1.4.0-beta-b65, mixed mode)

The design of the classes related to the new persistence delegation
seems counterproductive and does not match specification at
some points. Some methods have undocumented side effects.

The Encoder class is intended to be a base class for different
Encoder implementations but is mostly designed as if it should never
be subclassed and implemented as if XMLEncoder was the only possible
subclass and if it were a singleton design.

Most important points are:

- Encoder should be abstract
- Encoder should provide abstract methods for the implementors
  e.g., a method which is called after writeStatement has
  processed it and has created the nessessary expressions
- The logic of maintaining a statement list and value to expression
  mapping as well as looking up which expressions are really needed
  by the written statements should be provided by the Encoder not
  implemented in XMLEncoder and re-implemented by all other Encoder
  implementations
  90% of the XMLEncoder code is maintaining these informations instead
  of dealing with XML encoding what the class actually is for
- getPersistenceDelegate and setPersistenceDelegate are instance
  methods but have static effect. These should be either static methods
  or have an effect per instance
- getValue(Expression) should be protected not package-private
- clear() should be protected not package-private
- XMLEncoder should not call any package-private methods.
  Then it would better match it's role of a reference
  implementation of an Encoder and the design flaws would
  be detected earlier
- The constructor of XMLEncoder has global side-effects.
  BTW the called static method setCaching(boolean) is replacing the cache
  by a new HashMap every time it is set to true, even if it was true before
- The flush() method of XMLEncoder has global side-effects
- The toString() method of Statement has global side-effects
  The name assigned to the target of the statement is stored in a global
  map. This map is cleared in the flush() method of XMLEncoder. This means:
  - calling flush() on a XMLEncoder causes other Encoders to output
    inconsistent names if Statement.toString() is used.
  - if other Encoder implementations are using the toString() method of
    Statement the target objects are never garbage-collected
- There might be more side-effects

It seems as if the whole code should be better completly rewritten.

The following code can be used to indicate some of the above points.


import java.lang.reflect.*;
import java.beans.*;
import java.io.ByteArrayOutputStream;
import java.util.HashMap;

class TestCase {
  static void main(String[] arg) {
    try {
      System.out.print("Is getPersistenceDelegate DECLARED static ? ");
      Class[] argType=new Class[] {Class.class};
      Method m=Encoder.class.getMethod("getPersistenceDelegate",argType);
      System.out.println(Modifier.isStatic(m.getModifiers()));

      System.out.print("Is the EFFECT of set/getPersistenceDelegate static ? ");
      Encoder e1=new XMLEncoder(new ByteArrayOutputStream()),
              e2=new XMLEncoder(new ByteArrayOutputStream());
      PersistenceDelegate pd=new DefaultPersistenceDelegate();
      e1.setPersistenceDelegate(TestCase.class, pd);
      System.out.println(e2.getPersistenceDelegate(TestCase.class)==pd);

      System.out.println("Writing an object to a debug Encoder");
      e1=new Encoder() {
        public void writeExpression(Expression ex) {
          super.writeExpression(ex);
          System.out.println("  Expression: "+ex);
        }
      };
      Object testObj=new java.awt.Point(4,2);
      e1.getPersistenceDelegate(testObj.getClass()).writeObject(testObj,e1);

      System.out.print("now, are there DANGLING REFERENCES avoiding gc ? ");
      Class nameCl=Class.forName("java.beans.NameGenerator");
      Field nameFd=nameCl.getDeclaredField("valueToName");
      nameFd.setAccessible(true);
      HashMap names=(HashMap)nameFd.get(null);
      System.out.println(!names.isEmpty());
      System.out.println("  "+names.size()+" objects");

      if(!names.isEmpty()) {
        System.out.print("Does XMLEncoder.flush() have a GLOBAL SIDE-EFFECT"
          +" to these references ? ");
        ((XMLEncoder)e2).flush();
        HashMap names2=(HashMap)nameFd.get(null);
        System.out.println(!names2.equals(names));
        System.out.println("  "+(names2.isEmpty()?"now":"still")
          +" contains "+names2.size()+" objects");
      }
    }
    catch(Exception ex) {
      System.out.println("\nException occured: "+ex);
      System.out.println("Note that this TestCase is specific to version "
        +"1.4.0-beta-b65");
    }
  }
}
(Review ID: 127273) 
======================================================================

Comments
WORK AROUND Name: krC82822 Date: 06/25/2001 Do not use multiple XMLEncoder instances at the same time. Do not create direct instances of Encoder as it makes no sense anyway. Avoid using get/setPersistenceDelegate until their specification and implementation match. If you are implementing an Encoder I suggest using an intermediate class in the inheritance hierarchy between Encoder and your implementation which implements the features Encoder is missing and that are not really specific to your implementation. This way you can easily adapt your implementation to a future version of Encoder which has the missing features. Do not use Statement.toString() for other than debugging purposes. If you are using custom Encoder implementations and want to make sure that object references are cleared afterwards, create a XMLEncoder instance and call it's flush() method. ======================================================================
2004-06-11

EVALUATION Response from Philip Milne follows: > The Encoder class is intended to be a base class for different > Encoder implementations but is mostly designed as if it should never > be subclassed and implemented as if XMLEncoder was the only possible > subclass and if it were a singleton design. > This is a little harsh, the Encoder can in fact be trivially subclassed to produce a textual (Java-like) output. Granted however, some package private methods make it very difficult to override some of its features. Where the problems are a question of exposing methods that are currently hidden this will probably happen, though it is too late for changes that are not backward compatible. Its a shame we didn't have you on the expert group when you could have raised these issues while the design was taking place last year. Anyway, to answer your points: > Most important points are: > > - Encoder should be abstract I'm actually not sure this is better, but its too late for this anyway as this would break people. > - Encoder should provide abstract methods for the implementors > e.g., a method which is called after writeStatement has > processed it and has created the nessessary expressions As above. > - The logic of maintaining a statement list and value to expression > mapping as well as looking up which expressions are really needed > by the written statements should be provided by the Encoder not > implemented in XMLEncoder and re-implemented by all other Encoder > implementations > 90% of the XMLEncoder code is maintaining these informations instead > of dealing with XML encoding what the class actually is for Well, although I wouldn't claim the design is perfect, there is some method to this madness. The Encoder does actually have all the state it needs to emit a stream of expressions and statements which will produce the object graph. The XMLEncoder is performing an inherently more complicated task. It is the structure of the XML and the fact that expressions are nested in lexical blocks with all operations on an instance done in one place that requires the extra data structures. In particular, deciding whether am object needs an explicit ID is quite difficult to calculate from the graph and requires access to the all the state which is found in the XMLEncoder. > - getPersistenceDelegate and setPersistenceDelegate are instance > methods but have static effect. These should be either > static methods > or have an effect per instance Yes, this has been reported as a problem a few times already. There is a fundamental mismatch here, between the idioms which the Encoder/Decoder take from the IO package and from the rest of the beans specification. Basically, a the IO package introduces the idea of streams which have state - and so we attached the delegates that we use for controlling persistence to the stream so that we would be able to support different configurations of encoders at the same time at some point in the future. Had we made them static this would never have been possible. The beans spec and, in particular, the Introspector attach BeanInfo to classes and this is an inherently static idea. After worrying about this for some time I made the methods instance methods because that left us greatest room for expansion in the future and allowed them to be subclassed. As you have rightly pointed out they currently affect static data structures in the MetaData class and this is very confusing. The structures could be made instance specific though there is always going to be a mismatch between this and BeanInfo which is, and always will be, a static concept. > - getValue(Expression) should be protected not package-private > - clear() should be protected not package-private Yes this is a good point though once methods are exposed they can never be deleted; even deprecating methods is pretty difficult to do in organisations which create public libraries. Given this, I erred on the side of caution with the API and kept everything hidden that could be hidden on the grounds that we could expose things if people asked us to and not the other way around. You're right that these methods should be made protected at some point at that was always the intention. > - XMLEncoder should not call any package-private methods. > Then it would better match it's role of a reference > implementation of an Encoder and the design flaws would > be detected earlier The XMLEncoder is not really intended to be a reference implementation for the Encoder. The real purpose of the Encoder is to provide an API which the persistence delegates can call - thereby isolating the persistence delegates from the encoding. That it can be subclassed to make other encoders is (or would be with the changes you suggest) a useful thing but it is not the main reason for the class. FYI: There were thoughts of making this class an interface for exactly the reason above but interfaces cannot be extended over time - and that is the main reason it is a class. Its API is important, not its implementation here. That said, it does actually work if you instantiate it and implement the write() methods using toString(). There's actually very little (a couple of pages I think) in the Encoder implementation and so a start-from-scratch implementation might be your best bet in the short term. > - The constructor of XMLEncoder has global side-effects. Yes, though, once again, writing BeanInfo for a class has a global effect. > BTW the called static method setCaching(boolean) is replacing the > cache > by a new HashMap every time it is set to true, even if it was true > before Yes, though this happens rarely enough that it did not show up at all in our performance tests. > - The flush() method of XMLEncoder has global side-effects True, in the NameGenerator. This should be fixed though it only affects the toString() method which is there for debugging purposes. > - The toString() method of Statement has global side-effects As above. > The name assigned to the target of the statement is stored > in a global > map. This map is cleared in the flush() method of XMLEncoder. This > means: > - calling flush() on a XMLEncoder causes other Encoders to output > inconsistent names if Statement.toString() is used. Good point. I must admit I hadn't paid enough attention to the scenario in which multiple streams are being used to write the same object graph to different sinks. I think all of this can be fixed by making the NameGenerator and co. instance specific thispart of the implementation is all private. > - if other Encoder implementations are using the toString() > method of > Statement the target objects are never garbage-collected You mean because the flush method is package private? Yes, this is also true. > - There might be more side-effects > > It seems as if the whole code should be better completly rewritten. > As I say there is very little to the Encoder, most of the work is in the XMLEncoder and that is there to support a format that you obviously don't need. I would just do what you say, subclass the Encoder override all the public methods and replace the two pages or so of code there with your own implementation. Its early days for this API and you are the first person I know of to be trying to produce an entirely new format. If you'd like to work with Mark to help get all of the features you need exposed, so that others people would be able to use the API to produce other formats more easily, then let him know. I expect, if he has time, he'll try to get them in for FCS. mark.davidson@Eng 2001-07-25 Many of the issues raised about multi threaded access to the static caches maintained in NameGenerator and Statement has been resolved as part of the fix for 4880633. ###@###.### 2003-09-22
2003-09-22