JDK-6351373 : (se) VM Socket's never gets CLOSED on Solaris, causing File Descriptors to Leak
  • Type: Bug
  • Component: core-libs
  • Sub-Component: java.nio
  • Affected Version: 5.0
  • Priority: P2
  • Status: Closed
  • Resolution: Duplicate
  • OS: solaris
  • CPU: sparc
  • Submitted: 2005-11-16
  • Updated: 2011-05-25
  • Resolved: 2005-11-18
Related Reports
Duplicate :  
Description
See bug 6321777 for the original bug filled agains SJSAS 8.1

[ from naveen gangadhar ]
Issue : Socket's never gets closed, causing File Descriptors to leak., eventually it leads to IO Exeption and service Hang.

Was able to reproduce it with #### 5.0 ONLY ####

Initial Note :
-------------
SJSAS PE hangs on load test. Upon investigation
found that Appserver eventually get's out of FD, hence cannot process any more connection, throwing an IO Exception - the reasons sockets, never gets closed correctly.

Would be interesting to see if this happens on 9 PE (as it to uses grizzly), so that we can fix the issue there too.

Also this issue has been discussed with Jeanfrancois Arcand, Binod and Scott oaks in the mailing list

Initial Error in Appserver, when the Problem is HIT.
----------------------------------------------------

When using Normal HTTP (port) Connection.

>[#|2005-09-05T10:09:02.112+0800|WARNING|sun-appserver-pe8.1_02|javax.enterprise.system.stream.err|_ThreadID=47;|
>java.io.IOException: Too many open files
> at sun.nio.ch.IOUtil.initPipe(Native Method)

Also seen with pre-configured SSL Connection. 

>[#|2005-09-05T10:09:04.687+0800|SEVERE|sun-appserver-pe8.1_02|javax.enterprise.system.container.web|_ThreadID=48;|WEB0765: http-listener
>[SSL: ServerSocket[addr=/0.0.0.0,port=0,localport=443]] shutdown due to exception: java.io.IOException: Too many open files|#]

My Investigation details
-------------------------

Appserver : SJSAS 8.1 UR2 PE
Configured Http Port: 8007

//Before starting the Appserver

# netstat -an |grep 8007 | wc -l
      0
# netstat -an | grep 8007 | grep BOUND | wc -l
      0
//Once the Appserver is started, the HTTP port 8007 is sucessfully LISTENING

# netstat -an | grep 8007
     *.8007               *.*                0      0 49152      0 LISTEN

//Once I fire an request, and close the coonection, the Socket gets to TIME_WAIT state.

# netstat -an | grep 8007
     *.8007               *.*                0      0 49152      0 LISTEN
129.158.131.60.8007  129.158.131.162.44527 49640      0 49640      0 TIME_WAIT

//Eventually after a delay of sometime(10-20 secs) instead of going to Closing State,
it gets to the BOUND state, as below

# netstat -an | grep 8007
     *.8007               *.*                0      0 49152      0 LISTEN
     *.8007               *.*                0      0 49152      0 BOUND

That means there are 2 Sockets open NOW, one LISTENING and the Other which should be in CLOSED, is in BOUND state.

This can be confirmed by the pfiles output.

bash-2.05# pfiles 5188 | grep 8007
       sockname: AF_INET 0.0.0.0  port: 8007
       sockname: AF_INET 0.0.0.0  port: 8007

//Testing it for another Connection NOW (for confirmation), i.e fire an request and close the connection, again, it goes to TIME_WAIT'state and then to BOUND instead of Close.

# netstat -an | grep 8007
     *.8007               *.*                0      0 49152      0 LISTEN
     *.8007               *.*                0      0 49152      0 BOUND
129.158.131.60.8007  129.158.131.162.44528 49640      0 49640      0 TIME_WAIT

//Checking netstat after some time.

# netstat -an | grep 8007
     *.8007               *.*                0      0 49152      0 LISTEN
     *.8007               *.*                0      0 49152      0 BOUND
     *.8007               *.*                0      0 49152      0 BOUND

Confirms that the Socket is NOT closed properly. Checking the pfiles output

bash-2.05# pfiles 5188 | grep 8007
       sockname: AF_INET 0.0.0.0  port: 8007
       sockname: AF_INET 0.0.0.0  port: 8007
       sockname: AF_INET 0.0.0.0  port: 8007

bash-2.05# pfiles 5188 | grep 8007 |wc -l
3

reveals there is leak i.e Sockets are NOT closed properly.

//truss reveals, that shutdown is called by the Appserver,but close is NOT called for socket to be closed.


5188/58:        682.5216        getsockopt(83, SOL_SOCKET, 0x2000, 0xCE67F420, 0xCE67F41C, 0) = 0
5188/58:        682.5217        setsockopt(83, SOL_SOCKET, 0x2000, 0xCE67F420, 4, 0) = 0
5188/58:        682.5218        fcntl(83, F_SETFL, 0x00000002)                  = 0
5188/58:        682.5220        shutdown(83, 0, 1)                              = 0
5188/58:        682.5221        shutdown(83, 1, 1)                              = 0
5188/8:         682.5600        poll(0xF977FD80, 0, 50)                         = 0

//Examining the pstack output and dbx for thread activity 58, and investigating the Appserver source code.

----------------- t@58 -----------------
0xfef1c760      _so_shutdown + 0x8
0xf980be48      * sun.nio.ch.SocketChannelImpl.shutdown(java.io.FileDescriptor, int) bci:0 methodOop:0xf5a7e738 (Interpreted fra
me)
0xf9805c64      * sun.nio.ch.SocketChannelImpl.shutdownInput() bci:32 line:576 methodOop:0xf5a7de38 (Interpreted frame)
0xf9805c64      * sun.nio.ch.SocketAdaptor.shutdownInput() bci:4 line:350 methodOop:0xf5a81358 (Interpreted frame)
0xf9805c64      * com.sun.enterprise.web.connector.grizzly.ProcessorTask.doTask() bci:219 line:436 methodOop:0xf6ee7770 (Interpr
eted frame)
0xf980612c      * com.sun.enterprise.web.connector.grizzly.WorkerThread.run() bci:21 line:55 methodOop:0xf6ee24f8 (Interpreted f
rame)
0xf9800118      <StubRoutines>
0xfe8c8004      void JavaCalls::call_helper(JavaValue*,methodHandle*,JavaCallArguments*,Thread*) + 0x274
0xfe8d6994      void JavaCalls::call_virtual(JavaValue*,KlassHandle,symbolHandle,symbolHandle,JavaCallArguments*,Thread*) + 0x16
4
0xfe8d6820      void JavaCalls::call_virtual(JavaValue*,Handle,KlassHandle,symbolHandle,symbolHandle,Thread*) + 0x60
0xfe8d67a8      void thread_entry(JavaThread*,Thread*) + 0x128
0xfe8d65d4      void JavaThread::run() + 0x288
0xfe8c6898      _start + 0x134
0xff1e57b4      _lwp_start
----------------- t@59 -----------------

(dbx) where
current thread: t@58
=>[1] _so_shutdown(0x0, 0x1, 0x1, 0xf98146c0, 0xce87f8d4, 0x0), at 0xfef1b7b8
 [2] Java_sun_nio_ch_SocketChannelImpl_shutdown(0x63d514, 0xce87f8e0, 0xce87f964, 0x1, 0xf6c0bab0, 0x0), at 0xfbbb4f28
 [3] 0xf980be48(0x1, 0xb8, 0xce87f968, 0x8, 0xd5d62668, 0xce87f8f8), at 0xf980be47
 [4] 0xf9805c64(0xd5d625d0, 0xb6, 0xce87fa84, 0xf98155d8, 0xf6b48f50, 0xce87f998), at 0xf9805c63
 [5] 0xf9805c64(0xd5d62890, 0xb6, 0xce87fb24, 0xf9815270, 0x0, 0xce87fa18), at 0xf9805c63
 [6] 0xf9805c64(0xe09ddc50, 0xf6b493d8, 0xce87fba4, 0xf9815270, 0xf6b493d8, 0xce87fab8), at 0xf9805c63
 [7] 0xf980612c(0xce87fba8, 0x0, 0x0, 0xf98156f0, 0x33f348, 0xce87fb48), at 0xf980612b
 [8] 0xf9800118(0xce87fc30, 0xce87fe98, 0xa, 0xf6b54ce8, 0xf980aae0, 0xce87fdb8), at 0xf9800117
 [9] JavaCalls::call_helper(0xce87fe90, 0xce87fcf8, 0xce87fdb0, 0x63d480, 0x63d480, 0xce87fd08), at 0xfe8c7b50
 [10] JavaCalls::call_virtual(0xfebfa000, 0x63da28, 0xce87fda4, 0xce87fda0, 0xce87fdb0, 0x63d480), at 0xfe8d6a38
 [11] JavaCalls::call_virtual(0xce87fe90, 0xce87fe8c, 0xce87fe84, 0xce87fe7c, 0xce87fe74, 0x63d480), at 0xfe8d68c4
 [12] thread_entry(0x63d480, 0x63d480, 0x59c628, 0x63da28, 0x323914, 0xfe8d6650), at 0xfe8d684c
 [13] JavaThread::run(0x63d480, 0xffffffe2, 0xfec193b8, 0xffff8000, 0x0, 0x0), at 0xfe8d6678
 [14] _start(0x63d480, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfe8c63e4

Comments
EVALUATION MRs 2131210 & 2131211 were recently putback for 5.0u7 and the submitter has verified that this addresses the problem.
18-11-2005