JDK-6280181 : Concurrently memory allocation and JNI CS provoke OOM
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 6
  • Priority: P3
  • Status: Closed
  • Resolution: Fixed
  • OS: generic
  • CPU: generic
  • Submitted: 2005-06-03
  • Updated: 2012-02-01
  • Resolved: 2005-07-20
Other: 1.4.2_11 Fixed
JDK 6: 6 b44 Fixed
Description
Although the fix for 6186200 was delivered, we still have some problems with memory allocation running concurrently with native threads entering JNI critical sections (JNI CS). Later improvements in the GC locker, as reflected in the Mustang jplan-283 feature, give us hope of increasing the concurrency of JNI CS and Java threads.

I have created two tests to cover this (attached).

Note: gcl001 is a stress test, so you may need additional tuning to reproduce its failures. -Xmx64M is almost always required.

My runs gave the following results (Mustang b37):
1) linux-suse9.2-i586
  1.1) -XX:+UseSerialGC (default for client)
	gcl001 - FAIL, b6186200 - PASS
  1.2) -XX:+UseParallelGC
	gcl001 - PASS, b6186200 - FAIL
  1.3) -XX:+UseConcMarkSweepGC
	gcl001 - FAIL, b6186200 - PASS

2) Solaris-sparc 10
  2.1) -XX:+UseSerialGC (default for client)
	gcl001 - FAIL, b6186200 - PASS
  2.2) -XX:+UseParallelGC
	gcl001 - PASS, b6186200 - PASS
  2.3) -XX:+UseConcMarkSweepGC
	gcl001 - PASS (slow), b6186200 - PASS


PS: A more interesting fact is that b6186200 never fails on Solaris, even with very old builds, so it worked fine even before the 6186200 integration. The Linux version produces the failure as expected.

PPS Full test names (for UTB script):
nsk/stress/jni/gclocker/gcl001
nsk/regression/b6186200
###@###.### 2005-06-03 14:27:25 GMT

These parameters may provide better results with gcl001:

    static private int numJNIWorker = 100;
    static private int numJNIArraySize = 50000;

    static private int numGarbageProducer = 100;
    static private int numGarbageBlockSize = 100000;
    static private int numGarbageProducerSleep = 5; // unused!

    static public int numCS = 2000;
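To illustrate what the garbage-producer parameters above might control, here is a minimal sketch. It is not the gcl001 source (that is only in the attachment); the class GarbageProducerSketch, the method runProducers and the blocksPerThread argument are assumptions made for illustration. Only the names numGarbageProducer and numGarbageBlockSize come from the report.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; the real gcl001 test is in the attachment and may differ. */
final class GarbageProducerSketch {
    // Values quoted from the report; scale them down for a quick local run.
    static int numGarbageProducer = 100;
    static int numGarbageBlockSize = 100000;

    /**
     * Starts `producers` threads, each allocating `blocksPerThread` short-lived
     * byte arrays of `blockSize` bytes. Returns the total number of bytes allocated.
     */
    static long runProducers(int producers, int blockSize, int blocksPerThread) {
        long[] allocated = new long[producers]; // one slot per thread, no sharing
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < producers; i++) {
            final int id = i;
            Thread t = new Thread(() -> {
                for (int n = 0; n < blocksPerThread; n++) {
                    byte[] block = new byte[blockSize]; // unreachable after this iteration
                    allocated[id] += block.length;
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // join makes each thread's writes visible to this thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        long total = 0;
        for (long a : allocated) {
            total += a;
        }
        return total;
    }

    public static void main(String[] args) {
        // 10 blocks per thread is an arbitrary choice for this sketch.
        long total = runProducers(numGarbageProducer, numGarbageBlockSize, 10);
        System.out.println("bytes allocated: " + total);
    }
}
```

Running this with a small heap such as -Xmx64M only generates allocation pressure; reproducing the reported failure also needs native threads holding JNI critical sections, which this sketch does not model.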


###@###.### 2005-06-06 09:14:47 GMT
###@###.### 2005-06-23 19:00:33 GMT

Comments
SUGGESTED FIX

Fix integrated into mustang workspace.
webrev: http://slime.india/~pb131437/webrevs/6280181/index.html

Original workspace: jpsesvr:/net/jpsesvr.sfbay/jpse-int/india/pb131437/6.0/hotspot
Submitter: pb131437
Archived data: /net/prt-archiver.sfbay/data/archived_workspaces/main/gc_baseline/2005/20050712023759.pb131437.hotspot/
Webrev: No webrev was generated

Fixed 6280181: Concurrently memory allocation and JNI CS provoke OOM

Problem: When a JNI critical section is active, the allocating thread throws an OutOfMemory error without stalling and without giving GC a chance to run. When the heap is full, no JNI critical section is active, and the allocating thread enters mem_allocate_work(), the old and young gen allocations fail and this thread does not stall (as no JNI critical section is active at this point). After this we create a VM_GenCollectForAllocation/VM_ParallelGCFailedAllocation operation to free up space. At this point, if a JNI critical section becomes active before the collection operation is run, GC gets skipped and the allocating thread throws an OutOfMemory error.

Fix: The changes close the timing window through which the allocating thread slips without stalling. Now the VM operation (VM_GenCollectForAllocation and VM_ParallelGCFailedAllocation) checks whether a JNI critical section is active and indicates to the caller that GC was locked out. The allocating thread in the caller (mem_allocate) checks for this and stalls if GC was skipped.

Reviewed by: ###@###.###, ###@###.###
Fix verified: Yes
Verification Test: Testcases b6186200 and gcl001 attached with the bug report
Other testing: PRT

Files:
update: src/share/vm/gc_implementation/parallelScavenge/parallelScavengeHeap.cpp
update: src/share/vm/gc_implementation/shared/vmGCOperations.cpp
update: src/share/vm/gc_implementation/shared/vmGCOperations.hpp
update: src/share/vm/memory/collectorPolicy.cpp

###@###.### 2005-07-12 12:45:07 GMT
12-07-2005
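The timing window and the fix described in the comment above can be modelled with a small toy in Java. This is a sketch under stated assumptions, not HotSpot code: ToyGcLocker, ToyHeap, the fixEnabled switch and the beforeCollection hook are all hypothetical names invented here, and a "collection" simply empties the toy heap. The hook forces a critical section to start exactly between the failed allocation and the collection, which is the window the report describes.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy model (not HotSpot code) of the allocation vs. JNI critical section window. */
final class GcLockerSketch {

    /** Hypothetical stand-in for the GC locker: counts active critical sections. */
    static final class ToyGcLocker {
        private int active = 0;

        synchronized void enterCritical() { active++; }

        synchronized void exitCritical() {
            active--;
            notifyAll(); // wake allocators stalled in stallUntilClear()
        }

        synchronized boolean isActive() { return active > 0; }

        synchronized void stallUntilClear() throws InterruptedException {
            while (active > 0) {
                wait();
            }
        }
    }

    /** Hypothetical fixed-size heap used by one allocating thread. */
    static final class ToyHeap {
        private final ToyGcLocker locker;
        private final int capacity;
        private final boolean fixEnabled;
        private final Runnable beforeCollection; // hook that opens the timing window
        private int used = 0;

        ToyHeap(ToyGcLocker locker, int capacity, boolean fixEnabled, Runnable beforeCollection) {
            this.locker = locker;
            this.capacity = capacity;
            this.fixEnabled = fixEnabled;
            this.beforeCollection = beforeCollection;
        }

        private boolean tryAllocate(int size) {
            if (used + size > capacity) {
                return false;
            }
            used += size;
            return true;
        }

        /** Models the VM operation: returns false when GC was locked out. */
        private boolean collect() {
            if (locker.isActive()) {
                return false;
            }
            used = 0;
            return true;
        }

        /** Returns true on success, false where the VM would report OutOfMemory. */
        boolean allocate(int size) throws InterruptedException {
            while (true) {
                if (tryAllocate(size)) {
                    return true;
                }
                if (locker.isActive()) {      // locker already active: stall, then retry
                    locker.stallUntilClear();
                    continue;
                }
                beforeCollection.run();       // a critical section may start here
                boolean gcRan = collect();
                if (!gcRan && fixEnabled) {   // the fix: GC was skipped, so stall and retry
                    locker.stallUntilClear();
                    continue;
                }
                return tryAllocate(size);     // without the fix this fails if GC was skipped
            }
        }
    }

    /** Runs the race once; returns true if the second allocation succeeded. */
    static boolean runScenario(boolean fixEnabled) {
        ToyGcLocker locker = new ToyGcLocker();
        AtomicBoolean fired = new AtomicBoolean(false);
        Runnable openWindow = () -> {
            if (!fired.compareAndSet(false, true)) {
                return; // open the window only once
            }
            locker.enterCritical();
            Thread jni = new Thread(() -> {
                try {
                    Thread.sleep(50); // pretend native code holds the array for a while
                } catch (InterruptedException ignored) {
                    // fall through and leave the critical section
                }
                locker.exitCritical();
            });
            jni.setDaemon(true);
            jni.start();
        };
        ToyHeap heap = new ToyHeap(locker, 100, fixEnabled, openWindow);
        try {
            heap.allocate(100);        // fills the toy heap
            return heap.allocate(100); // needs a collection to succeed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("without fix: " + runScenario(false));
        System.out.println("with fix:    " + runScenario(true));
    }
}
```

In this toy model the scenario without the fix returns false (the allocation fails although a collection would have freed space), while the scenario with the fix stalls until the critical section ends, retries, and returns true.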

EVALUATION Transferred contents to Comments section. The high-level idea (###@###.###) is that there are timing windows through which an allocating thread can slip without stalling when it should. ###@###.### 2005-06-23 18:58:45 GMT
23-06-2005