JDK-6199899 : ClientNotifForwarder can wait infinitely when reconnecting
  • Type: Bug
  • Component: core-svc
  • Sub-Component: javax.management
  • Affected Version: 5.0
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2004-11-24
  • Updated: 2011-02-16
  • Resolved: 2004-12-16
Fixed in
  • Other: 5.0u2, jdmk5.1_03 (Fixed)
  • JDK 6: 6 (Fixed)
  • Other: jdmk5.1_03 (Fixed)
Description
Stack trace of waiting fetcher Thread:
    
[java] "Thread-2" daemon prio=10 tid=0x0055c458 nid=0x19 in Object.wait() [0xb0eff000..0xb0effc10]
     [java] 	at java.lang.Object.wait(Native Method)
     [java] 	- waiting on <0xe2d33930> (a com.sun.jmx.remote.soap.stubs.SOAPJMXConnector$SOAPNotifClient)
     [java] 	at java.lang.Object.wait(Object.java:474)
     [java] 	at com.sun.jmx.remote.internal.ClientNotifForwarder.postReconnection(ClientNotifForwarder.java:249)
     [java] 	- locked <0xe2d33930> (a com.sun.jmx.remote.soap.stubs.SOAPJMXConnector$SOAPNotifClient)
     [java] 	at com.sun.jmx.remote.soap.stubs.SOAPJMXConnector$SOAPNotifClient.postReconnection(SOAPJMXConnector.java:1160)
     [java] 	- locked <0xe2d33930> (a com.sun.jmx.remote.soap.stubs.SOAPJMXConnector$SOAPNotifClient)
     [java] 	at com.sun.jmx.remote.soap.stubs.SOAPJMXConnector$CommunicatorAdmin.reconnectNotificationListeners(SOAPJMXConnector.java:218)
     [java] 	at com.sun.jmx.remote.soap.stubs.SOAPJMXConnector$CommunicatorAdmin.doStart(SOAPJMXConnector.java:275)
     [java] 	at com.sun.jmx.remote.internal.ClientCommunicatorAdmin.restart(ClientCommunicatorAdmin.java:106)
     [java] 	at com.sun.jmx.remote.internal.ClientCommunicatorAdmin.gotIOException(ClientCommunicatorAdmin.java:34)
     [java] 	at com.sun.jmx.remote.soap.stubs.SOAPJMXConnector$CommunicatorAdmin.gotIOException(SOAPJMXConnector.java:117)
     [java] 	at com.sun.jmx.remote.soap.stubs.SOAPJMXConnector$SOAPNotifClient.fetchNotifs(SOAPJMXConnector.java:1187)
     [java] 	at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.fetchNotifs(ClientNotifForwarder.java:420)
     [java] 	at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:318)
     [java] 	at java.lang.Thread.run(Thread.java:595)


Comments
EVALUATION The fix consisted of a code review plus some synchronization improvements, but a problem later reported to the JMX-FORUM showed that this was not sufficient. Analysis of the full stack trace suggests that the flag ClientNotifForwarder.beingReconnected is not reset to false in some cases. This can happen if preReconnection was called but postReconnection was never called afterwards; the Javadoc states explicitly that postReconnection must be called whenever preReconnection has been called.
23-05-2007
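
The pairing requirement described in this evaluation can be illustrated with a try/finally block. This is a minimal sketch, not the JDK source: the class ReconnectPairingSketch, its reconnect method and the reconnectStep parameter are hypothetical, and only the names preReconnection, postReconnection and beingReconnected come from the report.

```java
/** Hypothetical model of the pre/post pairing; not the actual ClientNotifForwarder. */
final class ReconnectPairingSketch {
    // Mirrors the role of the beingReconnected flag named in the evaluation.
    private boolean beingReconnected = false;

    synchronized void preReconnection() {
        beingReconnected = true;
    }

    synchronized void postReconnection() {
        beingReconnected = false;
        notifyAll(); // wake any thread waiting for the reconnection to finish
    }

    synchronized boolean isBeingReconnected() {
        return beingReconnected;
    }

    /** Runs the (assumed) reconnect step and always pairs pre with post. */
    void reconnect(Runnable reconnectStep) {
        preReconnection();
        try {
            reconnectStep.run();
        } finally {
            // Executed even if reconnectStep throws, so the flag cannot stay true.
            postReconnection();
        }
    }

    public static void main(String[] args) {
        ReconnectPairingSketch sketch = new ReconnectPairingSketch();
        try {
            sketch.reconnect(() -> { throw new IllegalStateException("reconnect failed"); });
        } catch (IllegalStateException expected) {
            // The failure still propagates to the caller.
        }
        System.out.println("beingReconnected = " + sketch.isBeingReconnected());
    }
}
```

With this shape a failed reconnection attempt still propagates its exception, but the flag is reset on every path.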

EVALUATION The problem appears during fetchNotifs. The notification-fetching thread gets an IOException and is then itself used to perform the restart. This thread calls preReconnection, which sets the state to STOPPING, and afterwards calls postReconnection, where it blocks forever waiting for the STOPPED state. The thread is waiting for a transition that it would itself have to perform.
2004-11-24 14:06:34 GMT

The situation that provokes this problem does not usually arise. For it to happen, a fetchNotifs call must fail with an IOException, which causes an attempt to reconnect, and the reconnection attempt must succeed. Usually an IOException is permanent, so the reconnection fails too.

The main case where this is not true is the server's idle timeout, which is the reason the reconnection logic exists. If no request is received from a client for a long period, the server assumes the client is dead and closes the connection. If the client is not in fact dead, its next operation gets an IOException and reconnects.

This situation does not happen with the normal configuration, because the fetchNotifs operation itself prevents the server from considering the client idle. If the client has no listeners, it does not call fetchNotifs and the problem does not arise either.
2004-11-24 15:01:59 GMT
24-11-2004
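
The self-wait described in this evaluation can be reproduced with a small model. This is a sketch under stated assumptions, not the JDK code: the class FetcherStateSketch, the markFetcher method and the timeout parameters are hypothetical, and a timed wait stands in for the unbounded Object.wait() of the report so that the example terminates. The guarded variant shows one possible way to avoid the hang and is not necessarily the fix that was applied.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical, simplified model of the fetcher state machine; not the JDK source. */
final class FetcherStateSketch {
    enum State { STARTED, STOPPING, STOPPED }

    private State state = State.STARTED;
    private Thread fetcher; // the thread that runs the fetch loop (assumed bookkeeping)

    synchronized void markFetcher(Thread t) {
        fetcher = t;
    }

    synchronized void preReconnection() {
        if (state == State.STARTED) {
            state = State.STOPPING; // ask the fetcher to stop
        }
    }

    /** Waits for STOPPED; returns false on timeout (the real code would wait forever). */
    synchronized boolean postReconnectionNaive(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (state != State.STOPPED) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                return false;
            }
            try {
                TimeUnit.NANOSECONDS.timedWait(this, remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        state = State.STARTED; // a new fetch cycle may begin
        return true;
    }

    /** Assumed guard: if the caller is the fetcher, it performs the transition itself. */
    synchronized boolean postReconnectionGuarded(long timeoutMillis) {
        if (Thread.currentThread() == fetcher) {
            state = State.STOPPED; // no other thread would ever do this
            notifyAll();
        }
        return postReconnectionNaive(timeoutMillis);
    }

    public static void main(String[] args) {
        FetcherStateSketch naive = new FetcherStateSketch();
        naive.markFetcher(Thread.currentThread());
        naive.preReconnection();
        System.out.println("naive reached STOPPED: " + naive.postReconnectionNaive(100));

        FetcherStateSketch guarded = new FetcherStateSketch();
        guarded.markFetcher(Thread.currentThread());
        guarded.preReconnection();
        System.out.println("guarded reached STOPPED: " + guarded.postReconnectionGuarded(100));
    }
}
```

In the naive path the calling thread plays both roles, so the wait can only end by timing out; in the guarded path the fetcher thread completes the STOPPING to STOPPED transition before waiting.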