JDK-4945726 : rmid transient state may become out of sync with persistent state
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.rmi
  • Affected Version: 1.2.1
  • Priority: P4
  • Status: Open
  • Resolution: Unresolved
  • OS: solaris_8
  • CPU: sparc
  • Submitted: 2003-10-29
  • Updated: 2013-04-12
Description
An update to rmid's persistent state may fail (because of an I/O problem, for
example) when rmid either writes a log record or records a snapshot of its
current transient state.  If that happens, the update should not be reflected
in the transient state.  Currently it is reflected, because the transient state
is updated before the log record is written, so the update will be present in
the next snapshot even if writing the log record failed.

Instead, rmid could write the log record first and make the update to the
transient state only if the log update succeeds.  With that ordering, it is
important that any snapshot be taken before the log record is written.  In the
reverse order (log record written, then snapshot taken, then transient state
updated), the snapshot will not contain the update, and the log record will be
discarded when the snapshot is taken, so all information about the update is
lost.
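
The ordering suggested above can be sketched as follows.  This is only an
illustration under assumptions: UpdateLog and LogFirstState are hypothetical
names, not rmid's actual classes, and the state is reduced to a single
String-to-String table.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for rmid's persistent log; not the real log API. */
interface UpdateLog {
    void append(String key, String value) throws IOException;
}

/** Hypothetical transient state that is changed only after the log write succeeds. */
class LogFirstState {
    private final Map<String, String> table = new HashMap<>();
    private final UpdateLog log;

    LogFirstState(UpdateLog log) {
        this.log = log;
    }

    /** Persist first; the in-memory table is touched only if append() did not throw. */
    synchronized void put(String key, String value) throws IOException {
        log.append(key, value);   // on IOException the transient state stays unchanged
        table.put(key, value);
    }

    synchronized String get(String key) {
        return table.get(key);
    }
}
```

If append() throws, put() exits before table.put() runs, so a later snapshot of
the table cannot contain an update that was never logged.  Holding one monitor
across both steps also keeps a snapshot taken under the same monitor from
falling between the log write and the transient update.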

Comments
EVALUATION Concurrency is another potential cause of log inconsistency. An Activation object contains data structures such as groupTable and idTable. As of the fix for 6896297 these are ConcurrentHashMaps, so each is individually thread-safe, but there appears to be no locking that makes an update spanning several of them atomic. A log snapshot may therefore serialize the Activation object while its component data structures are in the middle of being updated, so the log may end up containing an intermediate and possibly inconsistent state. The Jini "Phoenix" version of Activation may have addressed this issue as well.
21-04-2011
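
One possible way to keep snapshots from observing a half-applied update is to
guard multi-table updates and snapshots with a shared lock.  The sketch below
is an assumption, not a description of rmid or Phoenix: GuardedTables and its
methods are hypothetical, and the table value types are simplified to String.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical, simplified stand-in for the tables held by an Activation object. */
class GuardedTables {
    // Concurrent maps protect single operations; the lock adds atomicity across both tables.
    private final Map<String, String> groupTable = new ConcurrentHashMap<>();
    private final Map<String, String> idTable = new ConcurrentHashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Copy of both tables taken at a single point in time. */
    static final class Snapshot {
        final Map<String, String> groups;
        final Map<String, String> ids;

        Snapshot(Map<String, String> groups, Map<String, String> ids) {
            this.groups = groups;
            this.ids = ids;
        }
    }

    /** Updates both tables as one unit; no snapshot can run between the two puts. */
    void register(String id, String group) {
        lock.writeLock().lock();
        try {
            groupTable.put(group, "desc:" + group);
            idTable.put(id, group);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Copies both tables while no multi-table update is in progress. */
    Snapshot snapshot() {
        lock.readLock().lock();
        try {
            return new Snapshot(new HashMap<>(groupTable), new HashMap<>(idTable));
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Taking the write lock for updates serializes them, which is the simplest
correct choice here; whether that cost is acceptable for rmid is not something
this report establishes.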

EVALUATION Yes, this should be fixed, although probably not for Tiger.
###@###.### 2003-10-31

Note that the exposure is not quite what the Description implies. If a log update fails, rmid immediately attempts a snapshot; if that also fails, rmid initiates its own shutdown, and the method that originated the state change throws an ActivationException. The exposure therefore seems limited to the momentary visibility of new transient state that is not persisted before the shutdown (which could also occur simply because of a machine crash). See the com.sun.jini.phoenix.Activation implementation for an example of the suggested alternative, which attempts to update the transient state only after the persistent state has been updated (although it does not follow that order in all cases; the exceptions are LogUnregisterGroup and LogGroupIncarnation).
###@###.### 2005-06-21 22:32:33 GMT
21-06-2005
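
The failure path described in the note above (log update fails, snapshot is
attempted, shutdown and exception if the snapshot fails too) might look roughly
like this.  It is one reading of that description, not rmid's source: the
Persistence interface, FallbackLogger, and StateChangeException (standing in
for ActivationException) are all hypothetical.

```java
import java.io.IOException;

/** Hypothetical stand-in for ActivationException. */
class StateChangeException extends Exception {
    StateChangeException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical persistence operations; not rmid's actual log interface. */
interface Persistence {
    void appendRecord(String record) throws IOException;
    void writeSnapshot() throws IOException;
}

class FallbackLogger {
    private final Persistence store;
    private volatile boolean shuttingDown = false;

    FallbackLogger(Persistence store) {
        this.store = store;
    }

    boolean isShuttingDown() {
        return shuttingDown;
    }

    /** Tries the log first, falls back to a full snapshot, and gives up only if both fail. */
    synchronized void record(String record) throws StateChangeException {
        try {
            store.appendRecord(record);
            return;                       // normal case: the log record was written
        } catch (IOException logFailure) {
            // Fall through: a snapshot of the current state can replace the lost record.
        }
        try {
            store.writeSnapshot();
        } catch (IOException snapshotFailure) {
            shuttingDown = true;          // stand-in for initiating rmid's shutdown
            throw new StateChangeException("log update and snapshot both failed", snapshotFailure);
        }
    }
}
```

In this reading, a caller sees an exception only when neither the log record
nor the snapshot could be written, which matches the narrow exposure the note
describes.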