JDK-6553303 : Corba application fails w/ org.omg.CORBA.COMM_FAILURE: vmcid: SUN minor code: 203 completed: No
  • Type: Bug
  • Component: other-libs
  • Sub-Component: corba:orb
  • Affected Version: 5.0u11
  • Priority: P1
  • Status: Resolved
  • Resolution: Fixed
  • OS: solaris_9
  • CPU: sparc
  • Submitted: 2007-05-03
  • Updated: 2010-07-29
  • Resolved: 2007-09-26
Fixed in
  • Other: 5.0u14 b02 (Fixed)
  • JDK 6: 6u4 (Fixed)
  • JDK 7: 7 (Fixed)
Description
Beginning with Java 1.5.0, customers report CORBA exceptions of the
following kind:

Mar 30, 2007 1:12:12 PM com.sun.corba.se.impl.transport.SocketOrChannelConnectionImpl <init>
WARNING: "IOP00410201: (COMM_FAILURE) Connection failure: socketType: IIOP_CLEAR_TEXT; hostname: 47.142.120.39; port: 37220"
org.omg.CORBA.COMM_FAILURE:   vmcid: SUN  minor code: 201  completed: No
        at com.sun.corba.se.impl.logging.ORBUtilSystemException.connectFailure(ORBUtilSystemException.java:2172)
        at com.sun.corba.se.impl.logging.ORBUtilSystemException.connectFailure(ORBUtilSystemException.java:2193)
        at com.sun.corba.se.impl.transport.SocketOrChannelConnectionImpl.<init>(SocketOrChannelConnectionImpl.java:205)
        at com.sun.corba.se.impl.transport.SocketOrChannelConnectionImpl.<init>(SocketOrChannelConnectionImpl.java:218)
        at com.sun.corba.se.impl.transport.SocketOrChannelContactInfoImpl.createConnection(SocketOrChannelContactInfoImpl.java:101)
        at com.sun.corba.se.impl.protocol.CorbaClientRequestDispatcherImpl.beginRequest(CorbaClientRequestDispatcherImpl.java:152)
        at com.sun.corba.se.impl.protocol.CorbaClientDelegateImpl.request(CorbaClientDelegateImpl.java:118)
        at org.omg.CORBA.portable.ObjectImpl._request(ObjectImpl.java:431)
        at com.nortelnetworks.fwcomp_if.common.coseventcomm._PushConsumerStub.push(_PushConsumerStub.java:18)
        at com.nortelnetworks.fwcomp.server.corbabase.es.ProxyPushSupplierImpl.forwardEvent(ProxyPushSupplierImpl.java:231)
        at com.nortelnetworks.fwcomp.common.events.notification.EFQ.run(EFQ.java:181)
 Caused by: java.net.ConnectException: Connection timed out
        at sun.nio.ch.Net.connect(Native Method)
        at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:464)
        at java.nio.channels.SocketChannel.open(SocketChannel.java:146)
        at com.sun.corba.se.impl.transport.DefaultSocketFactoryImpl.createSocket(DefaultSocketFactoryImpl.java:60)
        at com.sun.corba.se.impl.transport.SocketOrChannelConnectionImpl.<init>(SocketOrChannelConnectionImpl.java:188)
        ... 8 more


 After some repetitions of the warning, the application sees the following error:

Mar 30 13:30:56 
 org.omg.CORBA.COMM_FAILURE:   vmcid: SUN  minor code: 203  completed: No
 at com.sun.corba.se.impl.logging.ORBUtilSystemException.writeErrorSend(ORBUtilSystemException.java:2231)
 at com.sun.corba.se.impl.logging.ORBUtilSystemException.writeErrorSend(ORBUtilSystemException.java:2249)
 at com.sun.corba.se.impl.transport.SocketOrChannelConnectionImpl.sendWithoutLock(SocketOrChannelConnectionImpl.java:1042)
 at com.sun.corba.se.impl.encoding.BufferManagerWriteStream.sendFragment(BufferManagerWriteStream.java:82)
 at com.sun.corba.se.impl.encoding.BufferManagerWriteStream.sendMessage(BufferManagerWriteStream.java:96)
 at com.sun.corba.se.impl.encoding.CDROutputObject.finishSendingMessage(CDROutputObject.java:144)
 at com.sun.corba.se.impl.protocol.CorbaMessageMediatorImpl.finishSendingRequest(CorbaMessageMediatorImpl.java:247)
 at com.sun.corba.se.impl.protocol.CorbaClientRequestDispatcherImpl.marshalingComplete1(CorbaClientRequestDispatcherImpl.java:342)
 at com.sun.corba.se.impl.protocol.CorbaClientRequestDispatcherImpl.marshalingComplete(CorbaClientRequestDispatcherImpl.java:323)
 at com.sun.corba.se.impl.protocol.CorbaClientDelegateImpl.invoke(CorbaClientDelegateImpl.java:129)
 at org.omg.CORBA.portable.ObjectImpl._invoke(ObjectImpl.java:457)
 at com.nortelnetworks.fwmgmt_if.common.SnmpIf._AccessorStub.get(_AccessorStub.java:218)
 at com.nortelnetworks.fwmgmt.server.adaptation.SNMP.SNMPSession$1.sendRequest(SNMPSession.java:544)
 at com.nortelnetworks.fwmgmt.server.adaptation.SNMP.SNMPSession.sendRequest(SNMPSession.java:1269)
 at com.nortelnetworks.fwmgmt.server.adaptation.SNMP.SNMPSession.get(SNMPSession.java:550)
 at com.nortelnetworks.fwmgmt.server.adaptation.SNMP.SNMPSession.get(SNMPSession.java:410)
 at com.nortelnetworks.fwmgmt.server.adaptation.SNMP.SNMPAdaptor.get(SNMPAdaptor.java:206)
 at com.nortelnetworks.mg9kem.server.adaptation.base.Mg5000Adaptor.get(Mg5000Adaptor.java:1326)
 at com.nortelnetworks.fwmgmt.server.adaptation.SNMP.SNMPAdaptor.get(SNMPAdaptor.java:188)
 at com.nortelnetworks.mg9kem.server.adaptation.base.Mg5000Adaptor.get(Mg5000Adaptor.java:1311)
 at com.nortelnetworks.fwmgmt.server.adaptation.SNMP.SNMPAdaptor.get(SNMPAdaptor.java:167)
 at com.nortelnetworks.mg9kem.server.adaptation.base.Mg5000Adaptor.get(Mg5000Adaptor.java:1297)
 at com.nortelnetworks.mg9kem.server.adaptation.base.Mg5000NeAdaptor.getEntPhysicalEntry(Mg5000NeAdaptor.java:2348)
 at com.nortelnetworks.mg9kem.server.em.logicalequipment.Mg5kNe.getEntPhysicalEntry(Mg5kNe.java:4484)
 at com.nortelnetworks.mg9kem.server.em.cards.CardOverloadIface.auditData(CardOverloadIface.java:757)
 at com.nortelnetworks.mg9kem.server.em.physicalequipment.Mg5kIntelligentCard$CardAuditNoRetry.execute(Mg5kIntelligentCard.java:7135)
 at com.nortelnetworks.fwmgmt.server.dataaudit.task.AuditTask$PausableThread.run(AuditTask.java:177)
 Caused by: java.nio.channels.ClosedChannelException
 at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:125)
 at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:294)
 at com.sun.corba.se.impl.transport.SocketOrChannelConnectionImpl.write(SocketOrChannelConnectionImpl.java:715)
 at com.sun.corba.se.impl.encoding.CDROutputObject.writeTo(CDROutputObject.java:174)
 at com.sun.corba.se.impl.transport.SocketOrChannelConnectionImpl.sendWithoutLock(SocketOrChannelConnectionImpl.java:992)
 ... 24 more
 

 The application needs to be restarted in order to cure the problem.

 This behaviour of the application was never seen with 1.4.2.


 The customer found the following:

 The class com.sun.corba.se.impl.transport.SocketOrChannelConnectionImpl
 uses a different send method depending on the type of socket (NIO vs. the
 old type).

 (i) The customer wrote two custom socket factories, one based on NIO,
     the other on the old socket mechanism (i.e. without SocketChannels).
     The customer added a way to get a handle to all the client sockets
     created by these factories, so that they could be closed at will.
 (ii) The customer used each of these factories in their application via the
     -Dcom.sun.CORBA.transport.ORBSocketFactoryClass property and then
     closed the sockets after some time. The customer found the following:
     (a) With nio sockets, the ORB keeps attempting to reuse the old 
         (closed) socket ad infinitum and throws the COMM_FAILURE with 
         minor code 203 -  this is the behavior seen in their application.
     (b) With the old sockets, the COMM_FAILURE happens once after the 
         socket is closed, and then a new one is created to replace it, 
         and subsequent calls are successful.

 In JDK 1.5, the default socket factory is NIO-based, which would
 explain why dropped or closed TCP connections gave us no trouble in
 earlier releases.
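
 Behavior (a) is consistent with how a closed NIO channel behaves: once a
 SocketChannel is closed, every later write fails with
 ClosedChannelException until the caller replaces the channel. The sketch
 below is only an illustration of that symptom using plain java.nio; it is
 not the customer's NortelSocketFactory and does not involve the ORB. The
 class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.SocketChannel;

/** Hypothetical demo class; not part of the JDK or of the attached test case. */
final class ClosedChannelDemo {

    /**
     * Connects a SocketChannel to a local listener, closes the channel, and
     * returns how many of the following write attempts fail with
     * ClosedChannelException.
     */
    static int failedWritesAfterClose(int attempts) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the OS pick a free port for the listener.
        try (ServerSocket listener = new ServerSocket(0, 1, loopback)) {
            SocketChannel channel = SocketChannel.open(
                    new InetSocketAddress(loopback, listener.getLocalPort()));
            channel.close(); // simulate the socket being closed "at will"

            int failures = 0;
            for (int i = 0; i < attempts; i++) {
                try {
                    channel.write(ByteBuffer.wrap(new byte[] {1}));
                } catch (ClosedChannelException e) {
                    failures++; // the closed channel never becomes usable again
                }
            }
            return failures;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("failed writes: " + failedWritesAfterClose(3));
    }
}
```

 A caller that keeps reusing the same closed channel therefore sees the
 failure on every request, which matches the repeated minor code 203 errors.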

 The customer provided the following test case:

 Please find attached the custom socket factory that the customer used to
 demonstrate the behavior: NortelSocketFactory.java

 It can be used with any CORBA client (no third-party software required).
 The customer ran their CORBA client with the following command-line switches:
 (i)
   -Dcom.sun.CORBA.transport.ORBSocketFactoryClass=NortelSocketFactory
   -DUseNio=true
 In (i), the client never recovers.
 (ii)
   -Dcom.sun.CORBA.transport.ORBSocketFactoryClass=NortelSocketFactory
   -DUseNio=false
 In (ii), the client recovers automatically.

Comments
EVALUATION The problem is caused by the lack of a call to purgeCalls in the exception handler of the sendWithoutLock method.
10-07-2007

EVALUATION One part of this problem is clear: if the ORB successfully opens a connection, but some time later the connection is no longer usable, it gets stuck in the cache. In particular, there is no code in the BufferManagerWriteStream class (where all of the client-side writes take place) to remove a connection from the cache if a failure occurs in SocketOrChannelConnectionImpl.sendWithoutLock. This clearly needs to be added. I wonder whether the first observed error has anything to do with the second. The first case is what normally happens when attempting to connect to an address that is not reachable for some reason: we handle the exception, and no connection is stored in the cache. Perhaps the connection fails later in some cases than in others?
29-06-2007
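
The idea in the evaluation above, removing a connection from the cache when a send on it fails, can be sketched generically as follows. This is a minimal illustration under assumptions, not the ORB's code and not the actual fix: FakeConnection, EvictingConnectionCache and EvictionDemo are hypothetical names, and the cache is a plain map keyed by an endpoint string.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for a cached ORB connection; not a JDK class. */
final class FakeConnection {
    private boolean closed;

    void close() {
        closed = true;
    }

    void send(byte[] data) throws IOException {
        if (closed) {
            throw new IOException("connection closed");
        }
        // A real connection would write the bytes to its socket here.
    }
}

/** Hypothetical cache that evicts a connection when a send on it fails. */
final class EvictingConnectionCache {
    private final Map<String, FakeConnection> cache = new HashMap<>();
    private final Supplier<FakeConnection> opener;

    EvictingConnectionCache(Supplier<FakeConnection> opener) {
        this.opener = opener;
    }

    void send(String endpoint, byte[] data) throws IOException {
        FakeConnection conn = cache.computeIfAbsent(endpoint, key -> opener.get());
        try {
            conn.send(data);
        } catch (IOException e) {
            // Without this removal the dead connection would be reused forever.
            cache.remove(endpoint, conn);
            throw e;
        }
    }
}

final class EvictionDemo {
    /** Sends three messages, closing the first connection after the first send. */
    static List<String> run() {
        List<FakeConnection> opened = new ArrayList<>();
        EvictingConnectionCache cache = new EvictingConnectionCache(() -> {
            FakeConnection c = new FakeConnection();
            opened.add(c);
            return c;
        });
        List<String> outcomes = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            try {
                cache.send("host:1234", new byte[] {1});
                outcomes.add("ok");
            } catch (IOException e) {
                outcomes.add("failed");
            }
            if (i == 0) {
                opened.get(0).close(); // simulate the connection being dropped
            }
        }
        outcomes.add("opened=" + opened.size());
        return outcomes;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With eviction in place, run() returns [ok, failed, ok, opened=2]: the failure is reported once and the next request opens a replacement connection, which is the recovery the customer observed with the old socket type. Deleting the cache.remove call makes every later send fail, as in the NIO case.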