JDK-6625569 : Sporadic OutOfMemoryError on Web Server with sufficiently large heap on Solaris sparc/x86, HPUX
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 5.0u12
  • Priority: P2
  • Status: Closed
  • Resolution: Not an Issue
  • OS: generic
  • CPU: generic
  • Submitted: 2007-11-02
  • Updated: 2011-05-28
  • Resolved: 2007-11-14
Description
After the web server starts up and requests begin hitting the server,
we initially create 64 native threads. While serving these requests, we
attach them to the JVM using the JNI *AttachCurrentThread* method if the
request has to be served by the servlet container (written in Java).

We use *sticky attach* so that an attached thread remains attached
throughout the life of the server. The code has been working fine for more
than 3 years. With our upcoming release, however, we have recently been
seeing a strange problem very sporadically.
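
For context, the sticky-attach pattern described above looks roughly like
the following (a minimal sketch, assuming a JavaVM* cached at JVM creation
time; the function name, flag handling, and error handling are illustrative
assumptions, not the actual server code):

    #include <jni.h>
    #include <stddef.h>

    /* Cached when the embedded JVM is created (JNI_CreateJavaVM). */
    static JavaVM *jvm;

    /* "Sticky attach": attach the calling native thread once and keep it
     * attached for the life of the server, instead of detaching after
     * every request. */
    static JNIEnv *get_env_sticky(const char *thread_name)
    {
        JNIEnv *env = NULL;

        /* Already attached? Reuse the existing JNIEnv. */
        jint rc = (*jvm)->GetEnv(jvm, (void **)&env, JNI_VERSION_1_4);
        if (rc == JNI_OK)
            return env;

        if (rc == JNI_EDETACHED) {
            JavaVMAttachArgs args;
            args.version = JNI_VERSION_1_4;
            args.name = (char *)thread_name;  /* e.g. "service-j2ee-65" */
            args.group = NULL;

            /* This is the call that fails in the report: attaching creates
             * a java.lang.Thread object in the Java heap, so it throws
             * OutOfMemoryError when the heap is exhausted. */
            if ((*jvm)->AttachCurrentThread(jvm, (void **)&env, &args) != JNI_OK)
                return NULL;  /* caller logs CORE4001 and fails the request */
        }
        return env;
    }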

When we run our webapp tests, we see
   "java.lang.OutOfMemoryError: Java heap space"
within 3-6 minutes, and the next AttachCurrentThread call
(typically while attaching the 52nd or 65th thread to the JVM) fails.

When this problem occurred:
 - We used -Xms128m -Xmx256m.
 - There were about 1200 requests to several webapps.
 - The order in which these requests hit has been the same.
 - It occurred when the server was run in 32-bit mode.

Xms128m is *more than enough* for any normal run.
For example:
 - On mbelshe.red.iplanet.com, I could run the same tests
   more than 300 times repeatedly with *Xms64 Xmx128*, with no more than
   about 30 GC invocations and YoungGen/OldGen/PermGen all < 90% used (max),
   but this problem was observed with Xms128, Xmx256 only once.
The same tests normally run thousands of times successfully with much less
Java heap than Xms128m Xmx256m.

It occurred on 5 different machines (3 Solaris SPARC, 1 Solaris x86, 1 HP-UX)
in the past 4-6 weeks, but *only once* on each of them. We are having a hard
time reproducing it. Since then, we have been trying to capture the GC
details on these machines, but we have not been able to reproduce it yet.

Machines on which the problem occurred:
=======================================
mbelshe.red.iplanet.com
  SunOS mbelshe 5.10 Generic_118833-22 sun4u sparc SUNW,Sun-Fire-880
  8 sparcv9 processors @1050 MHz, MEMORY: 16384MB, SWAP: 45038.4MB

mcfly2.red.iplanet.com
  SunOS mcfly2 5.10 Generic_118855-15 i86pc i386 i86pc Solaris
  2 i386 processors @2200 MHz, MEMORY: 2048MB, SWAP: 4981.2MB

krishna2.red.iplanet.com
  SunOS krishna2 5.9 Generic_112233-12 sun4u sparc SUNW,Sun-Blade-1000
  2 sparcv9 processors @ 900 MHz, MEMORY: 2048MB, SWAP: 2449.6MB

hpneutro.red.iplanet.com
  HP-UX hpneutro B.11.11 U 9000/800 1553720588

wspqex62.india.sun.com
  SunOS wspqex62 5.10 Generic i86pc i386 i86pc
  2 i386 processors @ 2192 MHz, MEMORY: 2048MB, SWAP: 1937.9MB

JDK Version 
===========
 On all Solaris machines: 1.5.0_12

 On the HP-UX machine: 1.5.0.03 (build 1.5.0.03-_13_feb_2006_16_39)
               Java HotSpot(TM) Server VM 
               (build 1.5.0.03 jinteg:02.13.06-15:51 PA2.0 (aCC_AP), mixed mode)
===========

Sample error log
================
bash-2.05b# hostname
hpneutro

bash-2.05b# domainname
red.iplanet.com

bash-2.05b# more /export/tinderbox/PA-RISC/iplanet.prior.core.Oct18/ias/server/work/B1/HP-UXB.11.11_OPT.OBJ/https-test/logs/errors
------
[18/Oct/2007:05:24:05] fine (28072): Attaching to JVM thread service-j2ee-65
[18/Oct/2007:05:24:06] fine (28072): entering high concurrency mode
[18/Oct/2007:05:24:06] failure (28072): CORE4007: Internal error: Unexpected Java exception thrown (java.lang.OutOfMemoryError: Java heap space, Java heap space), stack: java.lang.OutOfMemoryError: Java heap space
[18/Oct/2007:05:24:06] failure (28072): CORE4001: Internal error: Unable to attach to the JVM
------

More data:
- Currently added -XX:+PrintGCDetails, but could not reproduce the problem.
- Also added -XX:+HeapDumpOnOutOfMemoryError.
- If I reduce the heap to Xms64 Xmx64 on one machine, I am able to reproduce it.
  It is quite possible that the heap is simply too small for that machine.
  GC output/jstat output can be seen at
    http://javaweb.sfbay.sun.com/~kmeduri/share-docs/ws_outofmemory/run1/
  This was captured while I was desperately trying to reproduce the problem.
  The OutOfMemoryError here could be legitimate.
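
For reference, this is roughly how the heap sizes and diagnostic flags above
would be passed to the embedded JVM at creation time (a minimal sketch using
JNI_CreateJavaVM; the option strings are the ones listed in this report, while
the function name and error handling are illustrative assumptions):

    #include <jni.h>
    #include <stddef.h>

    /* Create the embedded JVM with the heap settings used in the failing
     * runs plus the diagnostic flags added afterwards. */
    static JavaVM *create_jvm(JNIEnv **env)
    {
        JavaVMOption options[4];
        options[0].optionString = "-Xms128m";
        options[1].optionString = "-Xmx256m";
        options[2].optionString = "-XX:+PrintGCDetails";
        options[3].optionString = "-XX:+HeapDumpOnOutOfMemoryError";

        JavaVMInitArgs vm_args;
        vm_args.version = JNI_VERSION_1_4;
        vm_args.nOptions = 4;
        vm_args.options = options;
        vm_args.ignoreUnrecognized = JNI_FALSE;

        JavaVM *jvm = NULL;
        if (JNI_CreateJavaVM(&jvm, (void **)env, &vm_args) != JNI_OK)
            return NULL;
        return jvm;
    }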

Comments
EVALUATION The Java heap was barely sufficient for the load. Closing as not a defect.
14-11-2007