JDK-6709675 : OutOfMemoryError observed during longevity run of Galactica app with GlassFish V3
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 5.0u15
  • Priority: P4
  • Status: Closed
  • Resolution: Won't Fix
  • OS: solaris_10
  • CPU: sparc
  • Submitted: 2008-06-02
  • Updated: 2011-07-14
  • Resolved: 2011-07-14
Description
When running a longevity test with the web services application Galactica on top of the GlassFish V3 application server with JDK 1.5.0_15, I run into the following OutOfMemoryError about 40 hours into the run. It always happens on the client (faban) side, never on the server side. By this point the client has crashed and only the appserver process is still running.

   [java] Exception java.lang.OutOfMemoryError: requested 32756 bytes for ChunkPool::allocate. Out of swap space?
     [java] Java Result: 1

Both the client and the server run on the same machine, bigapp-niagara-3.red.iplanet.com. The server uses about 20% of the CPU on the system and the client about 50%.

The following heap settings were used on both the client and the server:
-Xms1500m -Xmx1500m

The GC log shows that the above heap setting is sufficient.
At around 40 hours, this is what it shows:
143995.691: [GC 300553K->98045K(937024K), 0.0155019 secs]
143998.441: [GC 298749K->98009K(934784K), 0.0145740 secs]
144001.141: [GC 296409K->98105K(932544K), 0.0184814 secs]
144003.861: [GC 294265K->98441K(930304K), 0.0185634 secs]
144006.515: [GC 292425K->98569K(928128K), 0.0165335 secs]
144009.221: [GC 290377K->98561K(926016K), 0.0187481 secs]
144011.841: [GC 288257K->98729K(923904K), 0.0170061 secs]
144014.461: [GC 286377K->98813K(921856K), 0.0174764 secs]
144017.081: [GC 284413K->98877K(919872K), 0.0373795 secs]
144019.791: [GC 282493K->99481K(917952K), 0.0313093 secs]

At about 50 hours into the run, this is what the GC log shows:
148882.024: [GC 659999K->438967K(954816K), 0.0186480 secs]
148885.031: [GC 657463K->439279K(952064K), 0.0218174 secs]
148888.041: [GC 655087K->439987K(949440K), 0.0137174 secs]
148891.001: [GC 653171K->439779K(946944K), 0.0262797 secs]
148893.972: [GC 650403K->440223K(944384K), 0.0195110 secs]
148896.824: [GC 648351K->440119K(941952K), 0.0183351 secs]
180402.488: [GC 645815K->443243K(939584K), 0.1354868 secs]
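
To make the trend in the two GC log excerpts above easier to read, here is a minimal Python sketch that parses lines in this format and prints the elapsed hours and the heap occupancy after each collection. The regular expression assumes the simple -verbose:gc line layout seen in this report; the function name parse_gc_line and the sample selection are illustrative only. On the lines quoted here, occupancy after GC rises from roughly 96 MB to roughly 433 MB, still well below the 1500m maximum.

```python
import re

# Assumed layout: "<seconds>: [GC <before>K-><after>K(<capacity>K), <pause> secs]"
GC_LINE = re.compile(
    r"^(?P<ts>\d+\.\d+): \[GC (?P<before>\d+)K->(?P<after>\d+)K"
    r"\((?P<capacity>\d+)K\), (?P<pause>\d+\.\d+) secs\]$"
)


def parse_gc_line(line):
    """Return a dict for one GC log line, or None if the line does not match."""
    m = GC_LINE.match(line.strip())
    if m is None:
        return None
    return {
        "hours": float(m.group("ts")) / 3600.0,   # JVM uptime in hours
        "before_kb": int(m.group("before")),      # occupancy before the GC
        "after_kb": int(m.group("after")),        # occupancy after the GC
        "capacity_kb": int(m.group("capacity")),  # heap size printed in parentheses
        "pause_s": float(m.group("pause")),
    }


# A few lines copied verbatim from the excerpts in this report.
SAMPLE = """\
143995.691: [GC 300553K->98045K(937024K), 0.0155019 secs]
144019.791: [GC 282493K->99481K(917952K), 0.0313093 secs]
148882.024: [GC 659999K->438967K(954816K), 0.0186480 secs]
180402.488: [GC 645815K->443243K(939584K), 0.1354868 secs]
"""

records = [r for r in (parse_gc_line(l) for l in SAMPLE.splitlines()) if r]

for r in records:
    print(
        f"{r['hours']:6.1f} h  after GC: {r['after_kb'] / 1024:7.1f} MB  "
        f"heap size: {r['capacity_kb'] / 1024:7.1f} MB"
    )
```

Converting the timestamps to hours also shows that most of the second excerpt falls at about 41 hours of uptime, and only its last line is at about 50 hours.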

The following is data collected on the swap space of the system
*****************************************************************
At the beginning of the run:

last pid:  6659;  load averages: 17.00, 16.01, 10.06    17:40:51
93 processes:  89 sleeping, 4 on cpu

Memory: 8184M real, 3435M swap in use, 5270M swap free


  PID USERNAME THR PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
 6368 root      76  10    0 1759M 1374M cpu21 147:18 50.40% java
 6312 root      59  20    0 1440M  462M cpu12  52:33 18.71% java
 6624 root       1   0    0 3384K 2000K cpu3    0:00  0.14% top
 5076 noaccess  47  59    0  222M  133M sleep   1:02  0.01% java
 6622 root       1  59    0 3352K 3320K sleep   0:00  0.01% prstat

At around 41 hours into the run: 

last pid: 10810;  load averages: 17.18, 17.12, 17.10    10:42:55
93 processes:  90 sleeping, 3 on cpu

Memory: 8184M real, 7119M swap in use, 1578M swap free


  PID USERNAME THR PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
 6368 root      76   0    0 3961M 3617M cpu0  506.9H 51.27% java
 6312 root      59   0    0 2916M 2832M cpu1  187.4H 18.91% java
10777 root       1   0    0 3384K 2000K cpu12   0:00  0.14% top
 6396 root       1  59    0 6992K 5712K sleep   8:02  0.02% netstats.sh
 6506 root       1  59    0 6992K 5720K sleep   8:01  0.01% netstats.sh

Swap in use has grown from 3435M to 7119M, an increase of about 3.6 GB, and only 1578M of swap remains free.
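
As a rough cross-check of the two top snapshots above, the following Python sketch computes how much the swap in use grew and how much the SIZE column grew for the two large java processes (PIDs 6368 and 6312). The numbers are copied from the snapshots; the helper name swap_mb is illustrative only.

```python
import re

MEM = re.compile(r"(\d+)M swap in use, (\d+)M swap free")


def swap_mb(line):
    """Return (swap in use, swap free) in MB from a top 'Memory:' line."""
    m = MEM.search(line)
    if m is None:
        raise ValueError("not a top Memory line: " + line)
    return int(m.group(1)), int(m.group(2))


start_used, start_free = swap_mb(
    "Memory: 8184M real, 3435M swap in use, 5270M swap free"
)
later_used, later_free = swap_mb(
    "Memory: 8184M real, 7119M swap in use, 1578M swap free"
)

# SIZE column in MB for the two java processes, from the two snapshots.
size_start = {6368: 1759, 6312: 1440}
size_later = {6368: 3961, 6312: 2916}

swap_growth = later_used - start_used
proc_growth = {pid: size_later[pid] - size_start[pid] for pid in size_start}

print(f"swap in use grew by {swap_growth} MB")
for pid, grew in proc_growth.items():
    print(f"PID {pid}: SIZE grew by {grew} MB")
print(f"combined SIZE growth: {sum(proc_growth.values())} MB")
```

The combined SIZE growth of the two processes is close to the growth in swap in use, so on these figures both java processes are growing, not only the client.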

Here is the pmap -x output for the client process, shown as a diff between the snapshot taken at steady state and the one taken at around 41 hours:

bigapp-niagara-3(root):meena-scripts/logs ->diff pmap.faban.steadystate pmap.faban.147600 | more
1a2
> 6368: /export/home/jdk/jdk150_15/jre/bin/java -server -Xms1500m -Xmx1500m -X
6,7c7,10
< 00040000    3840    3840    3840       - rwx--    [ heap ]
< 00400000   45056   45056   45056       - rwx--    [ heap ]
---
> 00040000    3840    3840    3648       - rwx--    [ heap ]
> 00400000  258048  258048  258048       - rwx--    [ heap ]
> 10000000 1835008 1835008 1835008       - rwx--    [ heap ]
> 80000000  207424  207424  198984       - rwx--    [ heap ]
12c15,16
< 8D1F2000      56      56      56       - rwx-R    [ stack tid=73 ]
---
> 8D1F2000       8       8       -       - rwx-R    [ anon ]
> 8D1F4000      48      48      48       - rwx-R    [ stack tid=73 ]
23c27,28
< 8DCF2000      56      56      56       - rwx-R    [ stack tid=62 ]
---
> 8DCF2000       8       8       -       - rwx-R    [ anon ]
> 8DCF4000      48      48      48       - rwx-R    [ stack tid=62 ]
26c31,32
< 8DFF4000      48      48      48       - rwx-R    [ stack tid=59 ]
---
> 8DFF4000       8       8       -       - rwx-R    [ anon ]
> 8DFF6000      40      40      40       - rwx-R    [ stack tid=59 ]
44c50,51
< 8F1F4000      48      48      48       - rwx-R    [ stack tid=41 ]
---
> 8F1F4000       8       8       -       - rwx-R    [ anon ]
> 8F1F6000      40      40      40       - rwx-R    [ stack tid=41 ]
48c55,56
< 8F5F2000      56      56      56       - rwx-R    [ stack tid=37 ]
---
> 8F5F2000      16      16       8       - rwx-R    [ anon ]
> 8F5F6000      40      40      40       - rwx-R    [ stack tid=37 ]
50c58,59
< 8F7F8000      32      32      32       - rwx-R    [ stack tid=35 ]
---
> 8F7F8000      16      16       8       - rwx-R    [ anon ]
> 8F7FC000      16      16      16       - rwx-R    [ stack tid=35 ]
55,56c64,65
< 8FC80000    3552    3544       -       - r--s-  dev:32,7 ino:93907
< 90000000    9808    9808       -       - r--s-  dev:32,7 ino:93899
---
> 8FC80000    3552    3520       -       - r--s-  dev:32,7 ino:93907
> 90000000    9808    9800       -       - r--s-  dev:32,7 ino:93899
58,60c67,69
< 90C80000    1616    1616       -       - r--s-  dev:32,7 ino:94024
< 90E80000     672     672       -       - r--s-  dev:32,7 ino:94108
< 90F80000    4064    4056       -       - r--s-  dev:32,7 ino:94055
---
> 90C80000    1616    1496       -       - r--s-  dev:32,7 ino:94024
> 90E80000     672     656       -       - r--s-  dev:32,7 ino:94108
> 90F80000    4064    3552       -       - r--s-  dev:32,7 ino:94055
62,66c71,75
< 91480000     520     520       -       - r--s-  dev:32,7 ino:93965
< 91580000    1288    1288       -       - r--s-  dev:32,7 ino:94056
< 91700000     544     544       -       - r--s-  dev:32,7 ino:93945
< 91800000     512     512       -       - r--s-  dev:32,7 ino:94009
< 91900000     792     328       -       - r--s-  dev:32,7 ino:17476
---
> 91480000     520     480       -       - r--s-  dev:32,7 ino:93965
> 91580000    1288    1168       -       - r--s-  dev:32,7 ino:94056
> 91700000     544     504       -       - r--s-  dev:32,7 ino:93945
> 91800000     512     496       -       - r--s-  dev:32,7 ino:94009
> 91900000     792     304       -       - r--s-  dev:32,7 ino:17476
68,69c77,78
< 91B7E000       8       8       8       - rwx-R    [ stack tid=29 ]
< 91C76000      16      16      16       - rwx-R    [ anon ]
---
> 91B7E000       8       8       -       - rwx-R    [ stack tid=29 ]
> 91C76000      16      16       8       - rwx-R    [ anon ]
84c93
< 95900000    2864    2856    2856       - rwx--    [ anon ]
---
> 95900000    2864    2856    2768       - rwx--    [ anon ]
86c95
< 9A400000 1462272 1150976 1150976       - rwx--    [ anon ]
---
> 9A400000 1462272 1204224 1204224       - rwx--    [ anon ]
96c105
< F7B7A000      32      32      32       - rwx--    [ anon ]
---
> F7B7A000      32      32      24       - rwx--    [ anon ]
102c111
< F7DD0000     128     128     128       - rwx--    [ anon ]
---
> F7DD0000     128     128      64       - rwx--    [ anon ]
112,113c121,122
< F7FA6000       8       8       8       - rwx--  libnio.so
< F7FB0000     176     176       -       - r--s-  dev:32,7 ino:93871
---
> F7FA6000       8       8       -       - rwx--  libnio.so
> F7FB0000     176     152       -       - r--s-  dev:32,7 ino:93871
116c125
< F80A0000     144     144       -       - r--s-  dev:32,7 ino:94102
---
> F80A0000     144     128       -       - r--s-  dev:32,7 ino:94102
132c141
< F83D0000     168     168       -       - r--s-  dev:32,7 ino:94023
---
> F83D0000     168     160       -       - r--s-  dev:32,7 ino:94023
167,168c176,177
< FA9B0000      56      56       -       - r--s-  dev:32,7 ino:94020
< FA9D0000      80      80       -       - r--s-  dev:32,7 ino:93973
---
> FA9B0000      56      48       -       - r--s-  dev:32,7 ino:94020
> FA9D0000      80      72       -       - r--s-  dev:32,7 ino:93973
177,178c186,187
< FABA0000     104     104       -       - r--s-  dev:32,7 ino:93936
< FABC0000      64      64       -       - r--s-  dev:32,7 ino:94030
---
> FABA0000     104      96       -       - r--s-  dev:32,7 ino:93936
> FABC0000      64      56       -       - r--s-  dev:32,7 ino:94030
180c189
< FAC00000    8640    4224       -       - r--s-  dev:32,7 ino:15583
---
> FAC00000    8640    3768       -       - r--s-  dev:32,7 ino:15583
182c191
< FB4EE000      16      16      16       - rwx--  pkcs11_softtoken.so.1
---
> FB4EE000      16      16       8       - rwx--  pkcs11_softtoken.so.1
185c194
< FB5A0000      72      72       -       - r-x--  libnet.so
---
> FB5A0000      72      64       -       - r-x--  libnet.so
201,202c210,211
< FB800000   39080   38544       -       - r--s-  dev:32,7 ino:15584
< FDE30000     192     192       -       - r--s-  dev:32,7 ino:89347
---
> FB800000   39080   36184       -       - r--s-  dev:32,7 ino:15584
> FDE30000     192     168       -       - r--s-  dev:32,7 ino:89347
231c240
< FE000000    4400    4400       -       - r--s-  dev:32,7 ino:104833
---
> FE000000    4400    4240       -       - r--s-  dev:32,7 ino:104833
235c244
< FE47C000      32      32      32       - rwx--    [ anon ]
---
> FE47C000      32      32      24       - rwx--    [ anon ]
239c248
< FE500000     536     520       -       - r--s-  dev:32,7 ino:15569
---
> FE500000     536     448       -       - r--s-  dev:32,7 ino:15569
273c282
< FF0C2000       8       8       8       - rwx--  libdoor.so.1
---
> FF0C2000       8       8       -       - rwx--  libdoor.so.1
275,276c284,285
< FF0F8000       8       8       8       - rwx--  libscf.so.1
< FF100000     584     576       -       - r-x--  libnsl.so.1
---
> FF0F8000       8       8       -       - rwx--  libscf.so.1
> FF100000     584     584       -       - r-x--  libnsl.so.1
284c293
< FF20C000      16       8       8       - rwx--  libCrun.so.1
---
> FF20C000      16       8       -       - rwx--  libCrun.so.1
303c312
< total Kb 1801344 1462024 1373088       -
---
> total Kb 4056768 3766632 3672856       -
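
One way to read the diff above is to ask which kind of mapping accounts for the growth. The Python sketch below sums the Kbytes column of the [ heap ] rows and compares that with the change in the "total Kb" row, using only rows copied from the diff. The function name heap_and_total_kb is illustrative only. On these rows the two differences are equal, which suggests, without proving it, that the growth is in the native process heap (the [ heap ] segments) rather than in the Java heap sized by -Xmx.

```python
def heap_and_total_kb(pmap_text):
    """Sum the Kbytes column of [ heap ] rows and read the 'total Kb' row."""
    heap_kb = 0
    total_kb = None
    for line in pmap_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if line.rstrip().endswith("[ heap ]"):
            heap_kb += int(fields[1])      # column 2 of pmap -x is Kbytes
        elif fields[:2] == ["total", "Kb"]:
            total_kb = int(fields[2])      # first number after "total Kb"
    return heap_kb, total_kb


# Rows from the steady-state snapshot ("<" side of the diff).
STEADY = """\
00040000    3840    3840    3840       - rwx--    [ heap ]
00400000   45056   45056   45056       - rwx--    [ heap ]
total Kb 1801344 1462024 1373088       -
"""

# Rows from the snapshot at around 41 hours (">" side of the diff).
LATER = """\
00040000    3840    3840    3648       - rwx--    [ heap ]
00400000  258048  258048  258048       - rwx--    [ heap ]
10000000 1835008 1835008 1835008       - rwx--    [ heap ]
80000000  207424  207424  198984       - rwx--    [ heap ]
total Kb 4056768 3766632 3672856       -
"""

heap_before, total_before = heap_and_total_kb(STEADY)
heap_after, total_after = heap_and_total_kb(LATER)

heap_growth = heap_after - heap_before
total_growth = total_after - total_before

print(f"[ heap ] growth: {heap_growth} KB")
print(f"total growth:    {total_growth} KB")
```

If this reading is right, the libumem findleaks output is the natural place to look for the allocation sites responsible.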

I have also collected findleaks reports using libumem and redirected the output to files. I am attaching the two files to this bug report: one was taken at the beginning of the run and the other at around 39 hours into the run. I am not sure how to read these files, so please look at them and let me know what they show.

I have also attached the pmap files to this bug report.

Comments
EVALUATION JDK 5.0 is being phased out, so this bug will not be fixed.
2011-07-14