JDK-8213198 : Not triggering concurrent cycle in G1 leaves string table cleanup deferred
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 12
  • Priority: P3
  • Status: Open
  • Resolution: Unresolved
  • Submitted: 2018-10-31
  • Updated: 2024-09-11
  • Fix Version: tbd (Unresolved)
Description
For G1 GC, the timing of (interned) string table cleanup is dictated by when mixed GCs happen. Young GCs are insufficient for string table cleaning because a complete assessment of liveness is a prerequisite for determining whether any given interned string can be disposed of.

It is possible that a program only experiences young collections while creating large numbers of interned strings, occupying extra native memory with unneeded string entries. In extreme cases, this can prematurely exhaust physical memory.

This problem should be very rare in real-world applications, but the behavior can be reproduced by a test program (see attachment). Running this program and observing the GC log as well as NMT, one can see native memory growing continuously without any mixed GC happening. The program interns lots of different strings and cuts them loose periodically. This is similar to a stateless request-handling microservice that uses a (JSON) serialization mechanism which interns strings.
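
The allocation pattern described above can be sketched as follows. This is not the attached reproducer, only a minimal illustration under assumptions: the class name InternChurn, the method handleRequest, the key format, and the batch sizes are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the allocation pattern; NOT the attachment from this report.
final class InternChurn {

    // Simulates one stateless request: interns `count` distinct strings and
    // returns how many were interned. All references are dropped on return,
    // so only the string table still refers to the interned strings.
    static int handleRequest(long requestId, int count) {
        List<String> batch = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            // Each key is unique per request, so every call adds new table entries.
            batch.add(("req-" + requestId + "-key-" + i).intern());
        }
        return batch.size();
    }

    public static void main(String[] args) {
        // Number of simulated requests; the default is an arbitrary assumption.
        long requests = args.length > 0 ? Long.parseLong(args[0]) : 1_000;
        long total = 0;
        for (long r = 0; r < requests; r++) {
            total += handleRequest(r, 1_000);
        }
        System.out.println("interned strings: " + total);
    }
}
```

To observe the effect, such a program could be run with G1 and native memory tracking enabled (for example -XX:+UseG1GC -XX:NativeMemoryTracking=summary plus GC logging) and inspected with jcmd <pid> VM.native_memory summary. Whether growth becomes visible depends on heap sizing and on how long the program runs.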

If interned strings and all elements of their table were completely represented as Java objects inside the Java heap, then GC would automatically trigger their cleanup (and all unused memory would be recovered). Short of that solution, there ought to be an additional mechanism that triggers mixed GCs in a timely fashion, e.g., based on string table growth. Alternatively, periodic mixed collections could also prevent unbounded string table growth.
Comments
Changed fix version to tbd_major after talking about this with CR filer.
12-12-2018

Another workaround is for the application to regularly call System.gc() with -XX:+ExplicitGCInvokesConcurrent.
05-12-2018
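
The System.gc() workaround above could look roughly like the following at the application level. This is a sketch under assumptions: the class name PeriodicGcTrigger and the choice of a single daemon scheduler thread are hypothetical, and the call only starts a concurrent cycle (rather than a full GC) when the JVM runs with -XX:+ExplicitGCInvokesConcurrent.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: requests a GC at a fixed interval from a daemon thread.
final class PeriodicGcTrigger implements AutoCloseable {
    private final ScheduledExecutorService scheduler;

    PeriodicGcTrigger(long period, TimeUnit unit) {
        scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "periodic-gc-trigger");
            t.setDaemon(true); // must not keep the JVM alive on its own
            return t;
        });
        // With -XX:+ExplicitGCInvokesConcurrent, System.gc() asks G1 for a
        // concurrent cycle instead of a stop-the-world full collection.
        scheduler.scheduleAtFixedRate(System::gc, period, period, unit);
    }

    boolean isStopped() {
        return scheduler.isShutdown();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

An application would create one instance at startup, for example new PeriodicGcTrigger(10, TimeUnit.MINUTES), and close it on shutdown. The interval is a tuning decision that trades extra GC work against string table growth; no value is recommended by this report.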

The change mentioned in "JDK-8213229: Investigate treating StringTable as weak in G1 young collections" does not fix the issue in the reproducer: depending on the young gen size, young collections may be triggered too often, i.e., the string objects are promoted too quickly.
05-12-2018

Lowering priority after considering that such applications need to do lots of interning and little other allocation activity; furthermore, this has been an issue since forever but has been reported only now, i.e., it may affect only a tiny number of applications.
21-11-2018

The implementation of JDK-8204089 in JDK 12 may mitigate the problem in some situations due to "periodic" concurrent cycles.
09-11-2018

Because the interned strings are not reclaimed by young collections, they will eventually be promoted to the old gen; enough of that will trigger a concurrent (or full) collection if nothing else does. I don't have an opinion about what to do (if anything) about this in JDK 8. For current JDK I think changing the treatment of the StringTable by young collections is probably a better solution.
01-11-2018

Here is a webrev for a patch that fixes the problem in OpenJDK 8, for perusal: http://cr.openjdk.java.net/~bmathiske/8213198/webrev.00/ This change triggers concurrent marking plus mixed GC every time the string table grows by a "large enough number to amortize a mixed GC, but not too large", since the last shrinking. I am estimating here that 1M should just work. If this is an acceptable approach, I would adapt the patch to the latest release. However, perhaps other approaches might be preferable for later versions of G1 than what we have in OpenJDK 8?
31-10-2018
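
The growth-based trigger described in the comment above can be illustrated with a small model. This is not the code from the webrev (which changes HotSpot itself); it is a Java sketch of the decision logic only, and the class name StringTableGrowthTrigger, its method names, and the use of >= for the comparison are assumptions.

```java
// Hypothetical model of the heuristic: request a concurrent cycle once the
// string table has grown by a fixed number of entries since it last shrank.
final class StringTableGrowthTrigger {
    private final long growthThreshold; // e.g. 1M entries, per the estimate in the comment
    private long sizeAtLastShrink;      // baseline recorded after the last cleanup

    StringTableGrowthTrigger(long growthThreshold, long initialSize) {
        this.growthThreshold = growthThreshold;
        this.sizeAtLastShrink = initialSize;
    }

    // Called after a cleanup has removed dead entries; resets the baseline.
    void onShrink(long sizeAfterCleanup) {
        sizeAtLastShrink = sizeAfterCleanup;
    }

    // True when growth since the last shrink has reached the threshold.
    boolean shouldStartConcurrentCycle(long currentSize) {
        return currentSize - sizeAtLastShrink >= growthThreshold;
    }
}
```

Measuring growth relative to the size after the last shrink, rather than against an absolute size, keeps a large but fully live table from triggering collections repeatedly.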

Note that interned strings *are* Java objects. The native memory involved in the StringTable growth is the table itself and the bookkeeping table entries referring to those Java objects. A different way to address this problem might be to change the string table from being treated as a strong root in G1 young collections to being a weak root. This has been discussed as a possible follow-on to the recent re-implementation of StringTable using ConcurrentHashtable and OopStorage; I thought there was an RFE to investigate that, but can't find one.
31-10-2018