Applied Materials has a major customer who experienced an issue in their semiconductor production fab when one of the Java processes had to be restarted because of com.sun.corba.se.impl.encoding.BufferManagerReadStream underflow.

This is on Solaris 10, Java 1.6.0_16. After this happens, communication between those processes does not succeed. After restarting the process that logs the failure, communication resumes normally. It used to work on 1.4 and 1.5.

Having checked the code and CR#6372405, we found that private long FRAGMENT_TIMEOUT = 60000 was introduced from 6u10 onwards. The old implementation waited indefinitely.

Dec 23, 2010 11:38:17 AM com.sun.corba.se.impl.encoding.BufferManagerReadStream underflow
WARNING: "IOP00410217: (COMM_FAILURE) Timeout while reading data in buffer manager"
org.omg.CORBA.COMM_FAILURE: vmcid: SUN minor code: 217 completed: No
    at com.sun.corba.se.impl.logging.ORBUtilSystemException.bufferReadManagerTimeout(Unknown Source)
    at com.sun.corba.se.impl.logging.ORBUtilSystemException.bufferReadManagerTimeout(Unknown Source)

There is a 60 second timeout when waiting for the data.

Dec 23, 2010 11:38:17 AM com.sun.corba.se.impl.interceptors.PIHandlerImpl peekClientRequestInfoImplStack
WARNING: "IOP00710817: (INTERNAL) Assertion failed: client request info stack is null"
org.omg.CORBA.INTERNAL: vmcid: SUN minor code: 817 completed: No
    at com.sun.corba.se.impl.logging.InterceptorsSystemException.clientInfoStackNull(Unknown Source)
    at com.sun.corba.se.impl.logging.InterceptorsSystemException.clientInfoStackNull(Unknown Source)
    at com.sun.corba.se.impl.interceptors.PIHandlerImpl.peekClientRequestInfoImplStack(Unknown Source)

The customer asked: why is the client request info stack null? Is this stack trace from the same thread that just logged the underflow?

We have taken a cursory glance at the code:

    /**
     * Convenience method to get the ClientRequestInfoImpl object off the
     * top of the ThreadLocal stack. Throws an INTERNAL exception if
     * the Info stack is empty.
     */
    private ClientRequestInfoImpl peekClientRequestInfoImplStack() {
        RequestInfoStack infoStack =
            (RequestInfoStack)threadLocalClientRequestInfoStack.get();
        ClientRequestInfoImpl info = null;
        if( !infoStack.empty() ) {
            info = (ClientRequestInfoImpl)infoStack.peek();
        } else {
            throw wrapper.clientInfoStackNull() ;
        }

        return info;
    }

We are not sure whether this is related to the timeout issue. We have suggested that the customer:

1. Add -ORBDebug transport,subcontract,transientObjectManager,invocationTiming to enable further debugging on the CORBA side.
2. Add -verbose:gc -Xloggc:/path/to/gc-`/bin/date +%Y-%m-%d--%H-%M`.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps to check the JVM health.
3. Run snoop to capture the network packets the next time the problem happens.

Please review the attached logs and docs and advise whether this is a bug.

Thanks
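
For reference, the behavior change described above (an indefinite wait for the next fragment replaced by a wait bounded by FRAGMENT_TIMEOUT) can be illustrated with the minimal sketch below. This is not the actual BufferManagerReadStream code: the class FragmentWaitSketch, its methods, and the use of IllegalStateException in place of org.omg.CORBA.COMM_FAILURE are all hypothetical names chosen for illustration only.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical illustration only; not the JDK implementation.
class FragmentWaitSketch {
    private final Queue<byte[]> fragments = new ArrayDeque<>();
    private final long timeoutMillis; // plays the role of FRAGMENT_TIMEOUT

    FragmentWaitSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called by the (hypothetical) reader thread when a fragment arrives.
    synchronized void addFragment(byte[] fragment) {
        fragments.add(fragment);
        notifyAll();
    }

    // Waits for the next fragment, but gives up once the timeout elapses.
    // An unbounded version would simply call wait() in the loop instead.
    synchronized byte[] nextFragment() {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (fragments.isEmpty()) {
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                // Stand-in for COMM_FAILURE, minor code 217.
                throw new IllegalStateException(
                    "Timeout while reading data in buffer manager");
            }
            try {
                wait(remainingMillis); // never 0 here, so never waits forever
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
        return fragments.remove();
    }
}
```

In this sketch, a consumer that calls nextFragment() when no further fragment ever arrives gets an exception after the timeout rather than blocking forever, which matches the difference we see between 1.4/1.5 and 6u10 onwards.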