When running a longevity test with the web services application Galactica on top of the GlassFish V3 application server with JDK 1.5.0_15, I run into the following "OutOfMemoryError" about 40 hours into the run. This always happens on the client (Faban) side and not on the server side. At this point the client has crashed and only the appserver process is running.
[java] Exception java.lang.OutOfMemoryError: requested 32756 bytes for ChunkPool::allocate. Out of swap space?
[java] Java Result: 1
Both the client and server are running on the same machine bigapp-niagara-3.red.iplanet.com. The server takes up about 20% cpu usage while the client takes up about 50% cpu usage on the system.
The following heap settings were used on both client and server:
-Xms1500m -Xmx1500m
Turning on GC logging shows that the above heap setting is sufficient.
At around 40 hours, this is what the GC log shows:
143995.691: [GC 300553K->98045K(937024K), 0.0155019 secs]
143998.441: [GC 298749K->98009K(934784K), 0.0145740 secs]
144001.141: [GC 296409K->98105K(932544K), 0.0184814 secs]
144003.861: [GC 294265K->98441K(930304K), 0.0185634 secs]
144006.515: [GC 292425K->98569K(928128K), 0.0165335 secs]
144009.221: [GC 290377K->98561K(926016K), 0.0187481 secs]
144011.841: [GC 288257K->98729K(923904K), 0.0170061 secs]
144014.461: [GC 286377K->98813K(921856K), 0.0174764 secs]
144017.081: [GC 284413K->98877K(919872K), 0.0373795 secs]
144019.791: [GC 282493K->99481K(917952K), 0.0313093 secs]
Later in the run, from about 41 hours up to about 50 hours (the last entry), this is what the GC log shows:
148882.024: [GC 659999K->438967K(954816K), 0.0186480 secs]
148885.031: [GC 657463K->439279K(952064K), 0.0218174 secs]
148888.041: [GC 655087K->439987K(949440K), 0.0137174 secs]
148891.001: [GC 653171K->439779K(946944K), 0.0262797 secs]
148893.972: [GC 650403K->440223K(944384K), 0.0195110 secs]
148896.824: [GC 648351K->440119K(941952K), 0.0183351 secs]
180402.488: [GC 645815K->443243K(939584K), 0.1354868 secs]
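To compare the two GC log excerpts above more easily, the lines can be parsed and the timestamps converted to hours. This is a minimal sketch that assumes the log format is exactly as shown in the excerpts; the names `GC_LINE` and `parse_gc_line` are made up for illustration.

```python
import re

# Matches lines such as:
#   143995.691: [GC 300553K->98045K(937024K), 0.0155019 secs]
GC_LINE = re.compile(
    r"(\d+\.\d+): \[GC (\d+)K->(\d+)K\((\d+)K\), (\d+\.\d+) secs\]"
)

def parse_gc_line(line):
    """Return a dict of the fields in one GC log line, or None if it does not match."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    ts, before, after, capacity, pause = m.groups()
    return {
        "hours": float(ts) / 3600.0,   # timestamp is seconds since JVM start
        "before_kb": int(before),      # heap occupancy before the collection
        "after_kb": int(after),        # heap occupancy after the collection
        "capacity_kb": int(capacity),  # current heap capacity
        "pause_s": float(pause),
    }

# First line of the first excerpt and last line of the second excerpt.
early = parse_gc_line("143995.691: [GC 300553K->98045K(937024K), 0.0155019 secs]")
late = parse_gc_line("180402.488: [GC 645815K->443243K(939584K), 0.1354868 secs]")

growth_kb = late["after_kb"] - early["after_kb"]
print("%.1f h: %d KB live after GC" % (early["hours"], early["after_kb"]))
print("%.1f h: %d KB live after GC" % (late["hours"], late["after_kb"]))
print("growth in post-GC occupancy: %d KB" % growth_kb)
```

In both excerpts the occupancy after GC stays well below the reported heap capacity, which is consistent with the statement that the Java heap setting itself is sufficient.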
The following data was collected on the system's swap space:
*****************************************************************
At the beginning of the run:
last pid: 6659; load averages: 17.00, 16.01, 10.06 17:40:51
93 processes: 89 sleeping, 4 on cpu
Memory: 8184M real, 3435M swap in use, 5270M swap free
  PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 6368 root      76  10    0 1759M 1374M cpu21  147:18 50.40% java
 6312 root      59  20    0 1440M  462M cpu12   52:33 18.71% java
 6624 root       1   0    0 3384K 2000K cpu3     0:00  0.14% top
 5076 noaccess  47  59    0  222M  133M sleep    1:02  0.01% java
 6622 root       1  59    0 3352K 3320K sleep    0:00  0.01% prstat
At around 41 hours into the run:
last pid: 10810; load averages: 17.18, 17.12, 17.10 10:42:55
93 processes: 90 sleeping, 3 on cpu
Memory: 8184M real, 7119M swap in use, 1578M swap free
  PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 6368 root      76   0    0 3961M 3617M cpu0   506.9H 51.27% java
 6312 root      59   0    0 2916M 2832M cpu1   187.4H 18.91% java
10777 root       1   0    0 3384K 2000K cpu12    0:00  0.14% top
 6396 root       1  59    0 6992K 5712K sleep    8:02  0.02% netstats.sh
 6506 root       1  59    0 6992K 5720K sleep    8:01  0.01% netstats.sh
Swap in use grew from 3435M to 7119M between these two snapshots, an increase of roughly 3.6 GB.
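As a quick check on the swap figures, the "Memory:" summary lines from the two top snapshots can be parsed and subtracted. This is a minimal sketch that assumes the line format shown above; `MEMORY_LINE` and `parse_memory_line` are illustrative names only.

```python
import re

# Matches top's summary line, e.g.
#   Memory: 8184M real, 3435M swap in use, 5270M swap free
MEMORY_LINE = re.compile(
    r"Memory: (\d+)M real, (\d+)M swap in use, (\d+)M swap free"
)

def parse_memory_line(line):
    """Return (real_mb, swap_used_mb, swap_free_mb) from a top 'Memory:' line."""
    m = MEMORY_LINE.search(line)
    if m is None:
        raise ValueError("not a top Memory line: %r" % line)
    return tuple(int(x) for x in m.groups())

start = parse_memory_line("Memory: 8184M real, 3435M swap in use, 5270M swap free")
later = parse_memory_line("Memory: 8184M real, 7119M swap in use, 1578M swap free")

swap_growth_mb = later[1] - start[1]   # change in "swap in use"
swap_free_drop_mb = start[2] - later[2]  # change in "swap free"
print("swap in use grew by %dM" % swap_growth_mb)
print("swap free shrank by %dM" % swap_free_drop_mb)
```

With the two snapshots above this gives a growth of 3684M in swap in use and a drop of 3692M in swap free.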
Here is the pmap -x information for the client process, as a diff of the files taken at steady state and at around 41 hours:
bigapp-niagara-3(root):meena-scripts/logs ->diff pmap.faban.steadystate pmap.faban.147600 | more
1a2
> 6368: /export/home/jdk/jdk150_15/jre/bin/java -server -Xms1500m -Xmx1500m -X
6,7c7,10
< 00040000 3840 3840 3840 - rwx-- [ heap ]
< 00400000 45056 45056 45056 - rwx-- [ heap ]
---
> 00040000 3840 3840 3648 - rwx-- [ heap ]
> 00400000 258048 258048 258048 - rwx-- [ heap ]
> 10000000 1835008 1835008 1835008 - rwx-- [ heap ]
> 80000000 207424 207424 198984 - rwx-- [ heap ]
12c15,16
< 8D1F2000 56 56 56 - rwx-R [ stack tid=73 ]
---
> 8D1F2000 8 8 - - rwx-R [ anon ]
> 8D1F4000 48 48 48 - rwx-R [ stack tid=73 ]
23c27,28
< 8DCF2000 56 56 56 - rwx-R [ stack tid=62 ]
---
> 8DCF2000 8 8 - - rwx-R [ anon ]
> 8DCF4000 48 48 48 - rwx-R [ stack tid=62 ]
26c31,32
< 8DFF4000 48 48 48 - rwx-R [ stack tid=59 ]
---
> 8DFF4000 8 8 - - rwx-R [ anon ]
> 8DFF6000 40 40 40 - rwx-R [ stack tid=59 ]
44c50,51
< 8F1F4000 48 48 48 - rwx-R [ stack tid=41 ]
---
> 8F1F4000 8 8 - - rwx-R [ anon ]
> 8F1F6000 40 40 40 - rwx-R [ stack tid=41 ]
48c55,56
< 8F5F2000 56 56 56 - rwx-R [ stack tid=37 ]
---
> 8F5F2000 16 16 8 - rwx-R [ anon ]
> 8F5F6000 40 40 40 - rwx-R [ stack tid=37 ]
50c58,59
< 8F7F8000 32 32 32 - rwx-R [ stack tid=35 ]
---
> 8F7F8000 16 16 8 - rwx-R [ anon ]
> 8F7FC000 16 16 16 - rwx-R [ stack tid=35 ]
55,56c64,65
< 8FC80000 3552 3544 - - r--s- dev:32,7 ino:93907
< 90000000 9808 9808 - - r--s- dev:32,7 ino:93899
---
> 8FC80000 3552 3520 - - r--s- dev:32,7 ino:93907
> 90000000 9808 9800 - - r--s- dev:32,7 ino:93899
58,60c67,69
< 90C80000 1616 1616 - - r--s- dev:32,7 ino:94024
< 90E80000 672 672 - - r--s- dev:32,7 ino:94108
< 90F80000 4064 4056 - - r--s- dev:32,7 ino:94055
---
> 90C80000 1616 1496 - - r--s- dev:32,7 ino:94024
> 90E80000 672 656 - - r--s- dev:32,7 ino:94108
> 90F80000 4064 3552 - - r--s- dev:32,7 ino:94055
62,66c71,75
< 91480000 520 520 - - r--s- dev:32,7 ino:93965
< 91580000 1288 1288 - - r--s- dev:32,7 ino:94056
< 91700000 544 544 - - r--s- dev:32,7 ino:93945
< 91800000 512 512 - - r--s- dev:32,7 ino:94009
< 91900000 792 328 - - r--s- dev:32,7 ino:17476
---
> 91480000 520 480 - - r--s- dev:32,7 ino:93965
> 91580000 1288 1168 - - r--s- dev:32,7 ino:94056
> 91700000 544 504 - - r--s- dev:32,7 ino:93945
> 91800000 512 496 - - r--s- dev:32,7 ino:94009
> 91900000 792 304 - - r--s- dev:32,7 ino:17476
68,69c77,78
< 91B7E000 8 8 8 - rwx-R [ stack tid=29 ]
< 91C76000 16 16 16 - rwx-R [ anon ]
---
> 91B7E000 8 8 - - rwx-R [ stack tid=29 ]
> 91C76000 16 16 8 - rwx-R [ anon ]
84c93
< 95900000 2864 2856 2856 - rwx-- [ anon ]
---
> 95900000 2864 2856 2768 - rwx-- [ anon ]
86c95
< 9A400000 1462272 1150976 1150976 - rwx-- [ anon ]
---
> 9A400000 1462272 1204224 1204224 - rwx-- [ anon ]
96c105
< F7B7A000 32 32 32 - rwx-- [ anon ]
---
> F7B7A000 32 32 24 - rwx-- [ anon ]
102c111
< F7DD0000 128 128 128 - rwx-- [ anon ]
---
> F7DD0000 128 128 64 - rwx-- [ anon ]
112,113c121,122
< F7FA6000 8 8 8 - rwx-- libnio.so
< F7FB0000 176 176 - - r--s- dev:32,7 ino:93871
---
> F7FA6000 8 8 - - rwx-- libnio.so
> F7FB0000 176 152 - - r--s- dev:32,7 ino:93871
116c125
< F80A0000 144 144 - - r--s- dev:32,7 ino:94102
---
> F80A0000 144 128 - - r--s- dev:32,7 ino:94102
132c141
< F83D0000 168 168 - - r--s- dev:32,7 ino:94023
---
> F83D0000 168 160 - - r--s- dev:32,7 ino:94023
167,168c176,177
< FA9B0000 56 56 - - r--s- dev:32,7 ino:94020
< FA9D0000 80 80 - - r--s- dev:32,7 ino:93973
---
> FA9B0000 56 48 - - r--s- dev:32,7 ino:94020
> FA9D0000 80 72 - - r--s- dev:32,7 ino:93973
177,178c186,187
< FABA0000 104 104 - - r--s- dev:32,7 ino:93936
< FABC0000 64 64 - - r--s- dev:32,7 ino:94030
---
> FABA0000 104 96 - - r--s- dev:32,7 ino:93936
> FABC0000 64 56 - - r--s- dev:32,7 ino:94030
180c189
< FAC00000 8640 4224 - - r--s- dev:32,7 ino:15583
---
> FAC00000 8640 3768 - - r--s- dev:32,7 ino:15583
182c191
< FB4EE000 16 16 16 - rwx-- pkcs11_softtoken.so.1
---
> FB4EE000 16 16 8 - rwx-- pkcs11_softtoken.so.1
185c194
< FB5A0000 72 72 - - r-x-- libnet.so
---
> FB5A0000 72 64 - - r-x-- libnet.so
201,202c210,211
< FB800000 39080 38544 - - r--s- dev:32,7 ino:15584
< FDE30000 192 192 - - r--s- dev:32,7 ino:89347
---
> FB800000 39080 36184 - - r--s- dev:32,7 ino:15584
> FDE30000 192 168 - - r--s- dev:32,7 ino:89347
231c240
< FE000000 4400 4400 - - r--s- dev:32,7 ino:104833
---
> FE000000 4400 4240 - - r--s- dev:32,7 ino:104833
235c244
< FE47C000 32 32 32 - rwx-- [ anon ]
---
> FE47C000 32 32 24 - rwx-- [ anon ]
239c248
< FE500000 536 520 - - r--s- dev:32,7 ino:15569
---
> FE500000 536 448 - - r--s- dev:32,7 ino:15569
273c282
< FF0C2000 8 8 8 - rwx-- libdoor.so.1
---
> FF0C2000 8 8 - - rwx-- libdoor.so.1
275,276c284,285
< FF0F8000 8 8 8 - rwx-- libscf.so.1
< FF100000 584 576 - - r-x-- libnsl.so.1
---
> FF0F8000 8 8 - - rwx-- libscf.so.1
> FF100000 584 584 - - r-x-- libnsl.so.1
284c293
< FF20C000 16 8 8 - rwx-- libCrun.so.1
---
> FF20C000 16 8 - - rwx-- libCrun.so.1
303c312
< total Kb 1801344 1462024 1373088 -
---
> total Kb 4056768 3766632 3672856 -
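To see where the growth in the diff above comes from, the Kbytes column of the "[ heap ]" mappings can be summed on each side and compared with the "total Kb" lines. This is a minimal sketch: the excerpt is copied from the diff, the function name `summarize` is made up, and it assumes that "[ heap ]" in Solaris pmap output denotes the brk-based native (malloc) heap rather than the Java heap.

```python
# "[ heap ]" and "total Kb" lines copied from the pmap diff above.
# "<" is the steady-state snapshot, ">" is the snapshot at around 41 hours.
DIFF_EXCERPT = """\
< 00040000 3840 3840 3840 - rwx-- [ heap ]
< 00400000 45056 45056 45056 - rwx-- [ heap ]
> 00040000 3840 3840 3648 - rwx-- [ heap ]
> 00400000 258048 258048 258048 - rwx-- [ heap ]
> 10000000 1835008 1835008 1835008 - rwx-- [ heap ]
> 80000000 207424 207424 198984 - rwx-- [ heap ]
< total Kb 1801344 1462024 1373088 -
> total Kb 4056768 3766632 3672856 -
"""

def summarize(diff_text):
    """Sum the Kbytes of [ heap ] mappings and pick up the totals per diff side."""
    heap = {"<": 0, ">": 0}
    total = {"<": 0, ">": 0}
    for line in diff_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] not in heap:
            continue  # skip diff hunk headers, separators, and other lines
        if fields[-3:] == ["[", "heap", "]"]:
            heap[fields[0]] += int(fields[2])   # fields[2] is the Kbytes column
        elif fields[1:3] == ["total", "Kb"]:
            total[fields[0]] = int(fields[3])   # first number after "total Kb"
    return heap, total

heap, total = summarize(DIFF_EXCERPT)
heap_growth_kb = heap[">"] - heap["<"]
total_growth_kb = total[">"] - total["<"]
print("[ heap ] growth: %d KB" % heap_growth_kb)
print("total growth:    %d KB" % total_growth_kb)
```

For these two snapshots both numbers come out to 2255424 KB (about 2.15 GB), so under the assumption above the whole increase in process size is in the native heap, not in the anonymous mapping that holds the Java heap.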
I have also collected findleaks reports using libumem and redirected the output to files. I am attaching the two files to this bug report: one was taken at the beginning of the run and the other at around 39 hours into the run. I am not sure how to read these files, so please look at them and let me know.
I have also attached the pmap files to this bug report.