JDK-8139889 : JEP 278: Additional Tests for Humongous Objects in G1
  • Type: JEP
  • Component: hotspot
  • Sub-Component: gc
  • Priority: P3
  • Status: Closed
  • Resolution: Delivered
  • Fix Versions: 9
  • Submitted: 2015-10-19
  • Updated: 2017-04-10
  • Resolved: 2017-04-10
Description
Summary
-------

Develop additional white-box tests for the Humongous Objects feature of
the G1 Garbage Collector.


Non-Goals
---------

We will not develop tests for G1 Eager Reclamation.


Description
-----------

Garbage First (G1) is a generational garbage collector which divides the
heap into equal-sized regions.  It is multi-threaded, and its concurrent
collection phase runs alongside the application.

G1 treats objects bigger than one-half of a memory region, called
_humongous objects_, differently from other objects:

  - Humongous objects always occupy a whole number of regions. If a
    humongous object is smaller than one region then it takes up the
    whole region. If a humongous object is larger than N regions and
    smaller than (N+1) regions then it takes up (N+1) regions. No
    allocations are allowed in the free space, if any, of the last region.

  - They can only be collected at the end of a concurrent marking cycle,
    during a full GC, or during a young GC in the case of G1 Eager
    Reclaim.

  - They can never be moved from one region to another.
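
The sizing rules above can be sketched as plain arithmetic. This is an
illustration only, not G1 source code: the class name, method names, and
the 1 MiB region size used in `main` are assumptions made for the example.

```java
// Illustrative arithmetic only; names and sizes are assumptions, not G1 code.
final class HumongousRegionMath {
    private HumongousRegionMath() {}

    /** True if the object is bigger than one-half of a region. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    /** Whole regions occupied: the object size rounded up to a region multiple. */
    static long regionsOccupied(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    /** Bytes at the end of the last region that no other allocation may use. */
    static long unusedTailBytes(long objectBytes, long regionBytes) {
        return regionsOccupied(objectBytes, regionBytes) * regionBytes - objectBytes;
    }

    public static void main(String[] args) {
        long region = 1L << 20; // assumed region size of 1 MiB
        long[] sizes = { region / 2, region / 2 + 1, region, region + 1, 3 * region + 100 };
        for (long size : sizes) {
            System.out.printf("size=%d humongous=%b regions=%d tail=%d%n",
                    size,
                    isHumongous(size, region),
                    regionsOccupied(size, region),
                    unusedTailBytes(size, region));
        }
    }
}
```

An object one byte larger than a region occupies two regions in this
model, leaving almost a full region unusable, which is the kind of
boundary case a white-box test would want to check.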

Since G1 is a concurrent and multi-threaded GC, black-box testing of it
is very difficult. With several ways to collect dead objects, several
concurrent threads, the ability to run in parallel with the application,
and generally complex algorithms, it is nearly impossible to infer G1's
internal state from the outside. To address these issues we will extend
the WhiteBox API and implement Java tests that use this API to check G1's
internal state. We will also be able to reuse these newly developed
WhiteBox API methods in stress tests.

To test that the code which handles humongous objects works as expected
we need G1 to provide more details about the internal representation of
humongous objects on the heap. We will add additional debug methods to G1
which will allow us to get information from its internal data structures
and provide control over the initiation of garbage collection. The latter
is important because there are three code paths which can collect
unreachable humongous objects: a full garbage collection, concurrent
marking, and a young GC in the case of G1 Eager Reclamation. To test each
path in isolation we need to prevent the other two from running.

To help with this we will extend the WhiteBox API with:

  - Methods to block and initiate concurrent marking and full GCs.

  - Methods to enumerate G1's regions and access region attributes (e.g.,
    free/occupied/humongous).

  - Methods to access internal G1 variables such as free memory, region
    size, and the number of free regions.

  - Methods to locate regions in the heap, to check that no allocations
    happen in regions that belong to humongous objects.  (This could
    potentially be a first step to a "heap walker" API that allows us to
    fully iterate over the Java heap).
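
A rough sketch of how a test might use such methods follows. Every name
here (`G1Probe`, `FakeHeap`, `RegionState`, and all methods) is
hypothetical and is not the real WhiteBox API; the in-memory `FakeHeap`
stands in for the JVM only so that the example runs on its own.

```java
// Hypothetical sketch: none of these names belong to the real WhiteBox API.
final class HumongousProbeSketch {
    enum RegionState { FREE, OCCUPIED, HUMONGOUS }

    /** Hypothetical view of the region queries listed above. */
    interface G1Probe {
        long regionSize();
        int numRegions();
        RegionState regionState(int index);
        int regionIndexOf(long address);

        default int numFreeRegions() {
            int free = 0;
            for (int i = 0; i < numRegions(); i++) {
                if (regionState(i) == RegionState.FREE) free++;
            }
            return free;
        }
    }

    /** Tiny in-memory stand-in for a heap, used only to make the sketch runnable. */
    static final class FakeHeap implements G1Probe {
        private final long regionSize;
        private final RegionState[] states;

        FakeHeap(long regionSize, int numRegions) {
            this.regionSize = regionSize;
            this.states = new RegionState[numRegions];
            java.util.Arrays.fill(states, RegionState.FREE);
        }

        /** Claims the first run of contiguous free regions; returns the start address. */
        long allocateHumongous(long bytes) {
            int needed = (int) ((bytes + regionSize - 1) / regionSize);
            for (int start = 0; start + needed <= states.length; start++) {
                boolean fits = true;
                for (int i = start; i < start + needed; i++) {
                    if (states[i] != RegionState.FREE) { fits = false; break; }
                }
                if (fits) {
                    for (int i = start; i < start + needed; i++) {
                        states[i] = RegionState.HUMONGOUS;
                    }
                    return start * regionSize;
                }
            }
            throw new IllegalStateException("no contiguous free regions");
        }

        @Override public long regionSize() { return regionSize; }
        @Override public int numRegions() { return states.length; }
        @Override public RegionState regionState(int index) { return states[index]; }
        @Override public int regionIndexOf(long address) { return (int) (address / regionSize); }
    }

    /** The kind of check a white-box test could make after a humongous allocation. */
    static boolean humongousAllocationLooksRight(FakeHeap heap, long bytes) {
        int freeBefore = heap.numFreeRegions();
        long start = heap.allocateHumongous(bytes);
        int expected = (int) ((bytes + heap.regionSize() - 1) / heap.regionSize());

        boolean freeCountDropped = freeBefore - heap.numFreeRegions() == expected;
        boolean firstIsHumongous =
                heap.regionState(heap.regionIndexOf(start)) == RegionState.HUMONGOUS;
        boolean lastIsHumongous =
                heap.regionState(heap.regionIndexOf(start + bytes - 1)) == RegionState.HUMONGOUS;
        return freeCountDropped && firstIsHumongous && lastIsHumongous;
    }
}
```

Keeping the queries behind a small interface is a choice made for this
sketch; it lets the same assertions run against a fake heap here and,
in principle, against whatever debug methods G1 ends up exposing.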


Alternatives
------------

Possible alternatives are:

  - Native built-in JVM tests.  Such tests could be started with a JVM
    flag. They are not suitable since test failures would likely lead to
    crashes. The JVM should be able to continue to work after a failing
    test, which is not guaranteed with this approach.

  - Native tests. This would require adding debug methods to G1 code, in
    effect developing a native WhiteBox API. This approach has two
    drawbacks: We would not be able to use these debug methods for
    stress tests and, more importantly, there is still no native
    testing framework.


Risks and Assumptions
---------------------

New tests may require changes in G1.  This could impact the performance
and stability of G1, though we think that is unlikely.  If G1 is
negatively affected then we could build product binaries without debug
methods.


Comments
Jon, the only things I have added to the WhiteBox API so far are two methods that, in my opinion, don't change anything in G1. The first method checks whether a given address is contained in a humongous region, and the second checks whether a given address is contained in a free region. If I plan to add any other WB methods I will certainly let you know as soon as possible. Thank you.
10-11-2015

Work is under way on G1 to meet the pause time goal more consistently, giving more predictable pauses, and to achieve lower and lower pause times. Any change to G1 for JEP 278 that results in a measurable increase in pause times will likely not be acceptable. Additionally, significant effort is being spent on cleaning up the G1 code to make it more readable (simpler to understand). Any changes will be reviewed with the goal of more readable code in mind. I'd suggest an early design review with the development team for code changes made for this JEP.
09-11-2015