JDK-7112034 : Parallel CMS fails to properly mark reference objects
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 7
  • Priority: P4
  • Status: Closed
  • Resolution: Fixed
  • OS: linux
  • CPU: x86
  • Submitted: 2011-11-15
  • Updated: 2012-03-22
  • Resolved: 2012-01-20
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 6 JDK 7 JDK 8 Other
6u32Fixed 7u4Fixed 8Fixed hs20.7Fixed
Description
FULL PRODUCT VERSION :
java version "1.7.0-ea"
Java(TM) SE Runtime Environment (build 1.7.0-ea-b141)
Java HotSpot(TM) 64-Bit Server VM (build 21.0-b11, mixed mode)


FULL OS VERSION :
RHEL 6

A DESCRIPTION OF THE PROBLEM :
Parallel CMS (ie with more than one CMS marking thread) does not enqueue all the dead Reference objects in the old gen.

If you have very large numbers of dead Reference objects (in this case Finalizable objects) then running CMS with 2 marking threads appears to cause only a small fraction of Reference objects to be identified.

The unmarked objects build up in old gen and eventually a STW Full GC is triggered which enqueues the dead Reference objects. Running with -XX:ConcGCThreads=1 also fixes the problem.

THE PROBLEM WAS REPRODUCIBLE WITH -Xint FLAG: Did not try

THE PROBLEM WAS REPRODUCIBLE WITH -server FLAG: Yes

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Compile the attached sample code:

javac Foo.java

Run as indicated in "expected and actual results".



EXPECTED VERSUS ACTUAL BEHAVIOR :
Try running using default Conc GC Threads (but you need at least an 8 core machine in order to get 2 or more Conc GC Threads):

java -Xmx3g -Xms3g -verbose:gc  -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xmn160m -XX:MaxNewSize=160m -XX:+UseConcMarkSweepGC -XX:+PrintReferenceGC  Foo 4 30000 500

Eventually you get the CMS Reference object output:

154.159: [GC[YG occupancy: 60084 K (147456 K)]154.159: [Rescan (parallel) , 0.0205280 secs]154.180: [weak refs processing154.180: [SoftReference, 0 refs, 0.0000090 secs]154.180: [WeakReference, 0 refs, 0.0000030 secs]154.180: [FinalReference, 3735359 refs, 0.8371420 secs]155.017: [PhantomReference, 0 refs, 0.0000060 secs]155.017: [JNI Weak Reference, 0.0000030 secs], 0.8372480 secs]155.017: [scrub string table, 0.0000280 secs] [1 CMS-remark: 1748303K(2981888K)] 1808387K(3129344K), 1.0783830 secs] [Times: user=1.21 sys=0.00, real=1.08 secs]


In this case it marked 3,735,359 Finalizable objects. This is only a small fraction of the total dead Finalizable objects in the old gen.

However if you run again, but this time with only one marking thread then all the Finalizable objects get marked:

java -Xmx3g -Xms3g -verbose:gc  -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xmn160m -XX:MaxNewSize=160m -XX:+UseConcMarkSweepGC -XX:+PrintReferenceGC -XX:ConcGCThreads=1  Foo 4 30000 500

150.942: [GC[YG occupancy: 23011 K (147456 K)]150.942: [Rescan (parallel) , 0.0089810 secs]150.951: [weak refs processing150.951: [SoftReference, 0 refs, 0.0000090 secs]150.951: [WeakReference, 0 refs, 0.0000030 secs]150.951: [FinalReference, 23458371 refs, 5.9956020 secs]156.947: [PhantomReference, 0 refs, 0.0000040 secs]156.947: [JNI Weak Reference, 0.0000030 secs], 5.9957030 secs]156.947: [scrub string table, 0.0000310 secs] [1 CMS-remark: 1744576K(2981888K)] 1767587K(3129344K), 7.5680620 secs] [Times: user=7.61 sys=0.01, real=7.56 secs]

In this case, the single-threaded CMS correctly enqueues 23,458,371 dead Finalizable objects.
ERROR MESSAGES/STACK TRACES THAT OCCUR :
No error message. Eventually you get a STW Full GC.

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;

class Fin {
        static AtomicLong al = new AtomicLong();
        int val;

        public Fin(int val) {
                this.val = val;
        }

        public int val() {
                return val;
        }

        protected void finalize() throws Throwable {
                long expect = al.incrementAndGet();
                if (expect == 100000)  {
                        al.set(0);
                        System.out.println("100K finalizations");
                }
        }
}

public class Foo extends Thread {
        static int sz;
        static int sleep;

        int thrNum;

        public Foo(int thrNum) {
                this.thrNum = thrNum;
        }

        public void run() {
                Random rnd = new Random();
                long last = System.currentTimeMillis();
                int index = 0;
                while (true) {
                        for (int i=0; i<sz; i++) {
                                Fin fin = new Fin(rnd.nextInt());
                                if (fin.val() == 0xdeadbeef) {
                                        System.out.println(fin.val());
                                }
                        }

                        String string = new String(thrNum + " " + index);
                        if (string.length() < 9)
                                index++;
                        else
                                index = 0;
                        try {
                                Thread.sleep(sleep);
                        } catch (Exception e) {}
                        long now = System.currentTimeMillis();
                        System.out.println("Thr = " + thrNum + " " + (now-last) + " msecs");
                        last = now;
                }
        }

        public static void main(String[] args) throws Exception {
                int numThreads = Integer.parseInt(args[0]);
                sz = Integer.parseInt(args[1]);
                sleep = Integer.parseInt(args[2]);

                for (int i=0; i<numThreads-1; i++) {
                        new Foo(i).start();
                }
                new Foo(numThreads-1).run();
        }
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Don't create humungous numbers of Finalizable objects! But this is a general problem with all Reference type objects, it just manifested itself in our environment as a Finalizer problem.

Comments
EVALUATION http://hg.openjdk.java.net/lambda/lambda/hotspot/rev/f1391adc6681
22-03-2012

EVALUATION http://hg.openjdk.java.net/hsx/hotspot-emb/hotspot/rev/f1391adc6681
15-12-2011

EVALUATION http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/f1391adc6681
06-12-2011

EVALUATION The cause of this bug is that no reference processor is registered with the marking closure that is used during work stealing in the concurrent mark phase. If we add the reference processor we find the expected number of Finalizer objects during concurrent marking.
24-11-2011