Application fails to start with G1 on particular machine with a crash. The machine has 5G of memory and java is run with -Xmx2621440K (2560M).
The failure appears to be caused by combination of two factors:
1. Increase in footprint for G1: although live objects never occupy more than 350M, the heap quickly grows to 2.4G (see attached gc_g1.log). Compared with Serial GC and CMS, it is 5x, compared with Parallel GC, it is ~2x.
The process size as reported by top is 2.6G. Here is the progress:
$ grep '\-\>' gc_g1.log
[ 1023K->470K(80M)]
[ 11M->3016K(80M)]
[ 21M->8512K(80M)]
[ 29M->15M(80M)]
[ 24M->19M(80M)]
[ 42M->29M(160M)]
[ 51M->37M(320M)]
[ 49M->38M(640M)]
[ 64M->40M(1024M)]
[ 89M->44M(1332M)]
[ 107M->46M(1578M)]
[ 125M->54M(1775M)]
[ 122M->65M(1932M)]
[ 149M->78M(2058M)]
[ 151M->79M(2159M)]
[ 209M->93M(2240M)]
[ 157M->105M(2304M)]
[ 202M->121M(2356M)]
[ 209M->128M(2397M)]
[ 273M->152M(2430M)]
[ 214M->158M(2456M)]
[ 314M->161M(2477M)]
See below for other collectors data.
2. the fact that application spawns child processes during startup, essentially doubling the process size and it causes swap exhaustion (see 5049299).
Logs for other GCs for comparison:
Serial GC (max old gen ~220M, process size ~500M)
...
47.138: [GC 47.138: [DefNew: 65528K->3095K(66240K), 0.1228152 secs] 193276K->135631K(212688K), 0.1232398 secs] [Times: user=0.16 sys=0.00, real=0.12 secs]
48.050: [GC 48.051: [DefNew: 61975K->5073K(66240K), 0.1061452 secs] 194511K->137609K(212688K), 0.1066077 secs] [Times: user=0.20 sys=0.00, real=0.11 secs]
48.895: [GC 48.895: [DefNew: 63953K->4899K(66240K), 0.1189718 secs] 196489K->138745K(212688K), 0.1194711 secs] [Times: user=0.24 sys=0.00, real=0.12 secs]
49.697: [GC 49.697: [DefNew: 63779K->4996K(66240K), 0.1143990 secs] 197625K->140077K(212688K), 0.1148809 secs] [Times: user=0.20 sys=0.00, real=0.12 secs]
CMS (max old gen ~220M, process size 500M):
...
47.557: [GC 47.557: [ParNew: 19056K->2112K(19136K), 0.0139841 secs] 177006K->160114K(223248K), 0.0145109 secs] [Times: user=0.11 sys=0.00, real=0.01 secs]
47.761: [GC 47.762: [ParNew: 19136K->1690K(19136K), 0.0359526 secs] 177138K->161568K(223248K), 0.0364501 secs] [Times: user=0.21 sys=0.00, real=0.04 secs]
47.986: [GC 47.986: [ParNew: 18714K->2112K(19136K), 0.0126478 secs] 178592K->161994K(223248K), 0.0131672 secs] [Times: user=0.10 sys=0.01, real=0.01 secs]
48.193: [GC 48.194: [ParNew: 19136K->2112K(19136K), 0.0309732 secs] 179018K->163064K(223248K), 0.0316050 secs] [Times: user=0.21 sys=0.01, real=0.03 secs]
Parallel GC (max old gen ~750M, process size 1G):
...
39.500: [GC [PSYoungGen: 411824K->30731K(501632K)] 501936K->120843K(669568K), 0.4409441 secs] [Times: user=1.42 sys=0.67, real=0.44 secs]
45.469: [GC [PSYoungGen: 480843K->32788K(501440K)] 570955K->122900K(669376K), 0.3268305 secs] [Times: user=1.41 sys=0.51, real=0.33 secs]
50.500: [GC [PSYoungGen: 482900K->36737K(581568K)] 573012K->126849K(749504K), 0.5064412 secs] [Times: user=1.47 sys=0.94, real=0.51 secs]