When running runTheseC (compileThese) we've run into native OOMEs, primarily with 32-bit Windows builds running on large Windows machines.

I left runTheseC running overnight on a Solaris machine in the hope of using libumem's memory leak detection. I couldn't get any useful information from it, but we're definitely leaking something:

    $ pmap 5287 | grep heap
    0000000000411000 4193312K rw--- [ heap ]
    0000000100319000 2350824K rw--- [ heap ]

Metaspace usage is around 11MB with 40MB committed, so it seems we don't have a lot of live classes.

I then used libumem to gather snapshots of all malloc() calls during a run. One thing that shows up is the allocation of ParkEvents, which are leaked (intentionally, it appears). runThese aggressively spawns threads which open JAR files, and these seem to end up in JVM_RawMonitorEnter:

    libumem.so.1`malloc+0x2e
    libjvm.so`__1cCosGmalloc6FLHpC_pv_+0x80
    libjvm.so`__1cJParkEventIAllocate6FpnGThread__p0_+0x116
    libjvm.so`__1cHMonitorMjvm_raw_lock6M_v_+0x248
    libjvm.so`JVM_RawMonitorEnter+0x25
    libzip.so`ZIP_Lock+0xd
    libzip.so`Java_java_util_zip_ZipFile_read+0x43
    0xfffffd7fed812094

ParkEvents on Solaris are 440 bytes each, and there are >10000 of them on the ParkEvent::FreeList after an hour of running the compileThese version of runThese.

I also tried an instrumented build on Windows, where I use HeapCreate to create a separate memory heap for allocating ParkEvents so that they can be tracked from outside the process. After running runTheseC for around 30 minutes, that heap has grown to 256MB.

A theory for the root cause is that ParkEvent::Allocate is not designed to handle the load of 15-16 threads contending on a Monitor* through the JVM_RawMonitor* API. Using the RawMonitor functions prevents the VM from using the JavaThread's ParkEvent and forces all those contending threads to hit ParkEvent::Allocate.
ParkEvents are maintained on a lock-free free list which is designed to avoid ABA problems by doing push-one pop-all. While one thread has popped the whole list and is CASing on the FreeList, other threads can find the list empty and fall back to allocating new ParkEvents, so there is a potential for allocation spikes.

I=H (aggressive memory leak if this problem occurs, can easily lead to a crash due to OOME)
L=L (very unlikely situation)
W=H (no known workaround if this situation arises)
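
To make the suspected allocation spike concrete, here is a minimal sketch of a push-one pop-all free list. It is not the HotSpot code: `Event`, `EventFreeList`, `release`, `allocate` and `fresh_allocations` are hypothetical names, and returning the remainder one node at a time is a simplification chosen for brevity. The point it illustrates is that while one thread holds the detached list, any other thread calling `allocate()` sees an empty list and creates a new object.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for a ParkEvent; only the free-list link matters here.
struct Event {
    Event* next = nullptr;
};

// Simplified push-one pop-all free list (an illustration, not HotSpot's code).
class EventFreeList {
public:
    // Push one: link the node in front of the current head with a CAS loop.
    void release(Event* ev) {
        Event* head = head_.load(std::memory_order_relaxed);
        do {
            ev->next = head;  // 'head' is refreshed by a failed CAS
        } while (!head_.compare_exchange_weak(head, ev,
                                              std::memory_order_release,
                                              std::memory_order_relaxed));
    }

    // Pop all: detach the whole list, keep the first node, return the rest.
    Event* allocate() {
        Event* list = head_.exchange(nullptr, std::memory_order_acquire);
        if (list == nullptr) {
            // The list looked empty, possibly only because another thread
            // has it detached right now. This is where a spike can come from.
            fresh_.fetch_add(1, std::memory_order_relaxed);
            return new Event();
        }
        Event* rest = list->next;
        list->next = nullptr;
        while (rest != nullptr) {  // hand the remainder back, node by node
            Event* following = rest->next;
            release(rest);
            rest = following;
        }
        return list;
    }

    // Number of times allocate() had to create a new Event.
    std::size_t fresh_allocations() const {
        return fresh_.load(std::memory_order_relaxed);
    }

private:
    std::atomic<Event*> head_{nullptr};
    std::atomic<std::size_t> fresh_{0};
};
```

Because `allocate()` never performs a CAS that depends on `head->next`, the classic ABA hazard of a pop-one stack does not arise. The cost is the window between the `exchange` and the last `release`, during which concurrent callers miss the free nodes. With many threads cycling through that window, the number of fresh allocations can grow well beyond the number of objects in use at any one time.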