Bug ID: JDK-6280181 Concurrently memory allocation and JNI CS provoke OOM

Details
Type: Bug
Submit Date: 2005-06-03
Status: Closed
Updated Date: 2012-02-01
Project Name: JDK
Resolved Date: 2005-07-20
Component: hotspot
OS: generic
Sub-Component: gc
CPU: generic
Priority: P3
Resolution: Fixed
Affected Versions: 6
Fixed Versions:

Description
Although the fix for 6186200 was delivered, we still have some problems with memory allocation running concurrently with native threads entering JNI critical sections. Later improvements in the GC locker, as reflected in the Mustang jplan-283 feature, give us hope of increasing the concurrency of JNI CS and Java threads.

I have created two tests to cover this (see attachments).

Note: gcl001 is a stress test, so you may need additional tuning to reproduce its failures. -Xmx64M is almost required.

My runs gave the following results (Mustang b37):
1) linux-suse9.2-i586
  1.1) -XX:+UseSerialGC (default for client)
	gcl001 - FAIL, b6186200 - PASS
  1.2) -XX:+UseParallelGC
	gcl001 - PASS, b6186200 - FAIL
  1.3) -XX:+UseConcMarkSweepGC
	gcl001 - FAIL, b6186200 - PASS

2) Solaris-sparc 10
  2.1) -XX:+UseSerialGC (default for client)
	gcl001 - FAIL, b6186200 - PASS
  2.2) -XX:+UseParallelGC
	gcl001 - PASS, b6186200 - PASS
  2.3) -XX:+UseConcMarkSweepGC
	gcl001 - PASS(slow), b6186200 - PASS


PS A more interesting fact is that b6186200 never fails on Solaris, even with very old builds, so it worked fine even before the 6186200 integration. The Linux version produces failures as expected.

PPS Full test names (for UTB script):
nsk/stress/jni/gclocker/gcl001
nsk/regression/b6186200
###@###.### 2005-06-03 14:27:25 GMT

These parameters may provide better results with gcl001:

    static private int numJNIWorker = 100;
    static private int numJNIArraySize = 50000;

    static private int numGarbageProducer = 100;
    static private int numGarbageBlockSize = 100000;
    static private int numGarbageProducerSleep = 5; // unused!

    static public int numCS = 2000;


###@###.### 2005-06-06 09:14:47 GMT
###@###.### 2005-06-23 19:00:33 GMT

                                    

Comments
EVALUATION

Transferred contents to the Comments section. The high-level idea
(###@###.###) is that there are timing windows through which
an allocating thread can slip without stalling when it should stall.


###@###.### 2005-06-23 18:58:45 GMT
                                     
2005-06-23
SUGGESTED FIX

Fix integrated into mustang workspace.
webrev: http://slime.india/~pb131437/webrevs/6280181/index.html 

Original workspace:     jpsesvr:/net/jpsesvr.sfbay/jpse-int/india/pb131437/6.0/hotspot
Submitter:              pb131437
Archived data:          /net/prt-archiver.sfbay/data/archived_workspaces/main/gc_baseline/2005/20050712023759.pb131437.hotspot/
Webrev:                 No webrev was generated

Fixed 6280181: Concurrently memory allocation and JNI CS provoke OOM

Problem:  When a JNI critical section is active, the allocating thread throws an OutOfMemoryError without stalling and without giving GC a chance to run.

When the heap is full, no JNI critical section is active, and the allocating thread enters mem_allocate_work(), the old and young generation allocations fail and the thread does not stall (because no JNI critical section is active at this point). We then create a VM_GenCollectForAllocation/VM_ParallelGCFailedAllocation operation to free up space. If a JNI critical section becomes active before the collection operation runs, GC is skipped and the allocating thread throws an OutOfMemoryError.

Fix:  The changes close the timing window through which the allocating thread slipped without stalling. The VM operations (VM_GenCollectForAllocation and VM_ParallelGCFailedAllocation) now check whether a JNI critical section is active and indicate to the caller that GC was locked out. The allocating thread in the caller (mem_allocate) checks for this indication and stalls if GC was skipped.
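
The window and the fix can be illustrated with a small toy model in Java. This is only a sketch and not HotSpot code: ToyGcLocker, ToyHeap, allocate, demo, and the beforeVmOp hook are hypothetical names introduced for illustration, and the hook merely simulates a native thread entering a JNI critical section between the failed allocation and the collection operation.

```java
// Toy model only: none of these classes exist in HotSpot.
class GcLockerRetrySketch {

    /** Hypothetical stand-in for the GC locker: counts active JNI critical sections. */
    static final class ToyGcLocker {
        private int active = 0;

        synchronized void enterCritical() { active++; }

        synchronized void exitCritical() {
            active--;
            notifyAll(); // wake any allocating thread stalled in stallUntilClear()
        }

        synchronized boolean isActive() { return active > 0; }

        synchronized void stallUntilClear() {
            try {
                while (active > 0) {
                    wait();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while stalled", e);
            }
        }
    }

    /** Hypothetical heap: a byte budget that a collection resets to empty. */
    static final class ToyHeap {
        private final int capacity;
        private int used = 0;

        ToyHeap(int capacity) { this.capacity = capacity; }

        boolean tryAllocate(int size) {
            if (used + size > capacity) return false;
            used += size;
            return true;
        }

        void collect() { used = 0; } // pretend every object is garbage
    }

    /** Returns true on success, false where the model would throw OutOfMemoryError. */
    static boolean allocate(ToyHeap heap, ToyGcLocker locker, int size,
                            boolean stallWhenGcLocked, Runnable beforeVmOp) {
        while (true) {
            if (heap.tryAllocate(size)) return true;
            beforeVmOp.run();                     // the timing window
            boolean gcLocked = locker.isActive(); // the "VM operation" checks the locker
            if (!gcLocked) {
                heap.collect();
                return heap.tryAllocate(size);
            }
            if (!stallWhenGcLocked) return false; // old path: GC skipped, give up
            locker.stallUntilClear();             // fixed path: stall, then retry
        }
    }

    static boolean demo(boolean stallWhenGcLocked) {
        ToyHeap heap = new ToyHeap(100);
        heap.tryAllocate(100); // fill the heap completely
        ToyGcLocker locker = new ToyGcLocker();
        boolean[] fired = {false};
        Runnable hook = () -> {
            if (fired[0]) return; // simulate the race only once
            fired[0] = true;
            locker.enterCritical();
            Thread releaser = new Thread(() -> {
                try { Thread.sleep(50); } catch (InterruptedException ignored) { }
                locker.exitCritical();
            });
            releaser.setDaemon(true);
            releaser.start();
        };
        return allocate(heap, locker, 10, stallWhenGcLocked, hook);
    }

    public static void main(String[] args) {
        System.out.println("old behaviour succeeds: " + demo(false));
        System.out.println("with fix succeeds:      " + demo(true));
    }
}
```

In this model the first call reports failure, because the collection is skipped while the critical section is held, and the second reports success, because the allocating thread stalls until the section is released and then retries.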

Reviewed by: ###@###.###, ###@###.###

Fix verified: Yes

Verification Test: Test cases b6186200 and gcl001, attached to the bug report

Other testing: PRT

Files:
update: src/share/vm/gc_implementation/parallelScavenge/parallelScavengeHeap.cpp
update: src/share/vm/gc_implementation/shared/vmGCOperations.cpp
update: src/share/vm/gc_implementation/shared/vmGCOperations.hpp
update: src/share/vm/memory/collectorPolicy.cpp

###@###.### 2005-07-12 12:45:07 GMT
                                     
2005-07-12


