JDK-8180048 : Interned string and symbol table leak memory during parallel unlinking
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 8,9
  • Priority: P1
  • Status: Closed
  • Resolution: Fixed
  • CPU: generic
  • Submitted: 2017-05-09
  • Updated: 2023-09-04
  • Resolved: 2017-05-15
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 10 JDK 7 JDK 8 JDK 9 Other
10Fixed 7u161Fixed 8u144Fixed 9 b170Fixed openjdk7uFixed
Related Reports
Blocks :  
Relates :  
Description
FULL PRODUCT VERSION :
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

java version "9-ea"
Java(TM) SE Runtime Environment (build 9-ea+168)
Java HotSpot(TM) 64-Bit Server VM (build 9-ea+168, mixed mode)

ADDITIONAL OS VERSION INFORMATION :
Linux 2.6.32-642.11.1.el6.x86_64 #1 SMP Wed Oct 26 10:25:23 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux, Red Hat Enterprise Linux Server release 6.8 (Santiago)
Windows 7 Enterprise 64-bit, Service pack 1

A DESCRIPTION OF THE PROBLEM :
Other bugs, for example JDK-8015332, state that G1 only collects unused interned strings when it performs a full GC. In our test case they are not even connected with a full GC, they are not collected at all. Furthermore the maximum metaspace size is not honored, native memory is allocated for the interned strings even after the maximum has been reached.

With the classic concurrent mark sweep the memory is reclaimed correctly, but not with G1.

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Use the Java code below to generate and intern strings and to trigger full garbage collections. Run the program like this:
-------------------
java -XX:NativeMemoryTracking=detail -XX:+UnlockDiagnosticVMOptions -XX:+PrintNMTStatistics -XX:MetaspaceSize=100m -XX:MaxMetaspaceSize=100m -XX:+UseStringDeduplication -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Xloggc:gc-jdk8-marksweep.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xms256m -Xmx256m -cp classes StringInterner
java -XX:NativeMemoryTracking=detail -XX:+UnlockDiagnosticVMOptions -XX:+PrintNMTStatistics -XX:+UseG1GC -XX:MetaspaceSize=100m -XX:MaxMetaspaceSize=100m -Xloggc:gc-jdk8-g1.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xms256m -Xmx256m -cp classes StringInterner
-------------------
Check the output and search for Symbol. We got:

Symbol (reserved=27549KB, committed=27549KB)
Symbol (reserved=424789KB, committed=424789KB)

Note that the memory for symbols (interned strings etc) is 27M for concurrent mark sweep and 424M for G1 when the maximum metaspace size was configured to be 100M.

Just in case, double-check the gc logs to make sure that there were full gc:s performed. We got four lines similar to:

2017-05-09T10:29:24.738+0000: 24.028: [Full GC (System.gc())  124M->386K(256M), 0.0570296 secs]

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
1) The interned strings should be garbage collected when G1 performs a full gc, so we should get a similar size for Symbol with G1 as we got with concurrent mark sweep.

2) The maximum amount of native memory allocated should never be larger than the configured maximum metaspace size, 100M.
ACTUAL -
The native memory occupied by interned strings continued to grow with G1 passed the maximum metaspace size even though full garbage collections were performed.

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
public class StringInterner {
    public static volatile String lastString;

    public static void main(String[] args) {
        for (int iterations = 0; iterations < 40;) {
            String baseName = UUID.randomUUID().toString();
            for (int i = 0; i < 1_000_000; i++) {
                lastString = (baseName + i).intern();
            }
            if (++iterations % 10 == 0) {
                System.gc();
            }
            LockSupport.parkNanos(500_000_000);
        }
    }
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Use another garbage collector, in our case concurrent mark sweep.


Comments
Verified by test manually by Thomas.
26-07-2017

URL: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/d6d7e5caf497 User: lana Date: 2017-05-17 18:48:03 +0000
17-05-2017

URL: http://hg.openjdk.java.net/jdk9/dev/hotspot/rev/d6d7e5caf497 User: tschatzl Date: 2017-05-15 11:17:50 +0000
15-05-2017

Some comment about the second expectation of that test: - NMT tracks native memory allocated by the VM not managed in Metaspace. Which means that they are not limited by MaxMetaspaceSize in general. - the issue is that some C++ helper data structure for managing the interned string list leaks in G1; it is shown in the "Symbol" category in NMT output, but that has nothing to do with Metaspace.
11-05-2017

Deleted comment
10-05-2017

During unlinking of stale entries from the hash table we move old entries to a free list of entries. This move to the free list is not thread safe, so entries are lost (and not reused in the future) when doing string and symbol table unlinking in parallel.
10-05-2017

Impact = H (memory leak with interned strings with g1) Likelihood = M Workaround = H (None / use another GC / use -XX:ParallelGCThreads=1) HMH => P1
10-05-2017

Below are the observations with UseConcMarkSweepGC when MetaspaceSize=100m 8u152 == Symbol (reserved=20206KB, committed=20206KB) (malloc=19686KB #2440) (arena=520KB #1) 9 ea b168 == Symbol (reserved=39187KB, committed=39187KB) (malloc=38284KB #5981) (arena=903KB #1) == With G1GC when MetaspaceSize=100m 8u152 == Symbol (reserved=794118KB, committed=794118KB) (malloc=793598KB #99179) (arena=520KB #1) 9 ea b168 == Symbol (reserved=813491KB, committed=813491KB) (malloc=812588KB #102769) (arena=903KB #1) == Observation with G1GC, native memory occupied by interned strings continued to grow and full GC is unable to reclaim.
10-05-2017