JDK-8242315 : Execute patch_archived_heap_embedded_pointers in a GC thread
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 15
  • Priority: P4
  • Status: Closed
  • Resolution: Duplicate
  • Submitted: 2020-04-07
  • Updated: 2025-01-09
  • Resolved: 2025-01-09
Other
tbd : Resolved
Related Reports
Duplicate :  
Description
FileMapInfo::patch_archived_heap_embedded_pointers() may be called during VM bootstrap. This happens, for example, if the heap size has changed significantly between CDS dump time and run time.

http://hg.openjdk.java.net/jdk/jdk/file/5ac19bd3a1e2/src/hotspot/share/memory/filemap.cpp#l1885

The default CDS archive is created with -Xmx128m to optimize for apps with small heaps (e.g., those used in the cloud). However, if we run with a bigger heap (somewhat larger than 2GB), we're likely to use zero-based oop compression with a 3-bit shift, and thus the embedded pointers in the archived heap must be patched.
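
To illustrate why a change in oop compression forces patching, here is a minimal sketch that re-encodes 32-bit pointer slots from one encoding to another. It is not the HotSpot implementation: NarrowOopEncoding, encode, decode, and patch_embedded_pointers are hypothetical names, and the real code locates the slots through its own metadata rather than a plain vector.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of a compressed-oop encoding: an object address is
// stored in a 32-bit slot as (address - base) >> shift.
struct NarrowOopEncoding {
    std::uint64_t base;   // 0 for a zero-based encoding
    unsigned      shift;  // 0 for unscaled, 3 for 8-byte object alignment
};

inline std::uint32_t encode(std::uint64_t addr, const NarrowOopEncoding& e) {
    return static_cast<std::uint32_t>((addr - e.base) >> e.shift);
}

inline std::uint64_t decode(std::uint32_t narrow, const NarrowOopEncoding& e) {
    return e.base + (static_cast<std::uint64_t>(narrow) << e.shift);
}

// Re-encode every embedded pointer slot. `delta` is the (assumed) distance
// between the run-time and dump-time addresses of the archived objects.
inline void patch_embedded_pointers(std::vector<std::uint32_t>& slots,
                                    const NarrowOopEncoding& dump,
                                    const NarrowOopEncoding& run,
                                    std::int64_t delta) {
    for (std::uint32_t& slot : slots) {
        if (slot == 0) continue;                     // null stays null
        std::uint64_t dump_addr = decode(slot, dump);
        slot = encode(dump_addr + delta, run);       // same object, new encoding
    }
}
```

Every slot has to be visited once, so the cost grows with the number of embedded pointers in the archive; when the dump-time and run-time encodings and addresses match, the loop can be skipped entirely.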

Because the patching is fairly independent, and the patched contents are not needed until the VM loads the first class, it should be safe to do the patching in a GC worker thread while the main VM thread continues with other initialization (such as interpreter generation).
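
The proposed overlap can be sketched as follows, using std::async as a stand-in for a GC worker thread. BackgroundPatcher, wait_until_patched, and the simplified "add a delta" patch step are assumptions made for illustration only; HotSpot has its own worker-thread infrastructure, which this sketch does not use.

```cpp
#include <cstdint>
#include <future>
#include <vector>

// Hypothetical helper: runs the patch loop on another thread so the caller
// can continue with unrelated initialization.
class BackgroundPatcher {
public:
    // Starts patching and returns immediately. The caller must not touch
    // `slots` until wait_until_patched() has returned.
    void start(std::vector<std::uint32_t>& slots, std::uint32_t delta) {
        done_ = std::async(std::launch::async, [&slots, delta] {
            for (std::uint32_t& s : slots) {
                if (s != 0) s += delta;   // simplified patch step; null stays null
            }
        });
    }

    // Barrier before the first use of the patched data (in the terms of this
    // issue: before the VM loads the first class).
    void wait_until_patched() {
        if (done_.valid()) done_.get();
    }

private:
    std::future<void> done_;
};

// Example bootstrap sequence: start patching, do other work, then wait.
inline std::vector<std::uint32_t> bootstrap_with_background_patching() {
    std::vector<std::uint32_t> slots = {0x10, 0, 0x20};
    BackgroundPatcher patcher;
    patcher.start(slots, 0x1000);

    // Stand-in for initialization that does not read `slots`
    // (for example, interpreter generation).
    std::uint64_t other_work = 0;
    for (int i = 0; i < 1000; ++i) other_work += i;
    (void)other_work;

    patcher.wait_until_patched();
    return slots;
}
```

The key design point is the explicit wait before first use: the overlap is only safe if nothing on the main thread reads the archived objects before that barrier.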

============== elapsed time with patching: about 2.64% degradation, or 1.09ms

# No patching (dumptime heap = runtime heap)
$ java -Xshare:dump -Xmx4g
$ perf stat -r 100 java -Xmx4g -version

 Performance counter stats for '/jdk/bld/bench/eva/loader_con0/bin/java -Xmx4g -version' (100 runs):

             56.11 msec task-clock                #    1.360 CPUs utilized            ( +-  0.60% )
               199      context-switches          #    0.004 M/sec                    ( +-  1.04% )
                12      cpu-migrations            #    0.206 K/sec                    ( +-  2.08% )
             2,937      page-faults               #    0.052 M/sec                    ( +-  0.04% )
       137,492,599      cycles                    #    2.451 GHz                      ( +-  0.15% )
        85,290,084      stalled-cycles-frontend   #   62.03% frontend cycles idle     ( +-  0.18% )
        66,524,836      stalled-cycles-backend    #   48.38% backend cycles idle      ( +-  0.21% )
       105,195,769      instructions              #    0.77  insn per cycle         
                                                  #    0.81  stalled cycles per insn  ( +-  0.14% )
        20,009,439      branches                  #  356.625 M/sec                    ( +-  0.15% )
         1,005,784      branch-misses             #    5.03% of all branches          ( +-  0.19% )

          0.041249 +- 0.000282 seconds time elapsed  ( +-  0.68% )

# With patching (dumptime heap != runtime heap)
$ java -Xshare:dump -Xmx128m
$ perf stat -r 100 java -Xmx4g -version

 Performance counter stats for '/jdk/bld/bench/eva/loader_con0/bin/java -Xmx4g -version' (100 runs):

             56.96 msec task-clock                #    1.345 CPUs utilized            ( +-  0.55% )
               202      context-switches          #    0.004 M/sec                    ( +-  0.97% )
                12      cpu-migrations            #    0.212 K/sec                    ( +-  1.90% )
             3,118      page-faults               #    0.055 M/sec                    ( +-  0.04% )
       139,782,303      cycles                    #    2.454 GHz                      ( +-  0.14% )
        86,193,560      stalled-cycles-frontend   #   61.66% frontend cycles idle     ( +-  0.15% )
        67,261,311      stalled-cycles-backend    #   48.12% backend cycles idle      ( +-  0.17% )
       107,975,073      instructions              #    0.77  insn per cycle         
                                                  #    0.80  stalled cycles per insn  ( +-  0.14% )
        20,706,343      branches                  #  363.509 M/sec                    ( +-  0.15% )
         1,023,199      branch-misses             #    4.94% of all branches          ( +-  0.19% )

          0.042338 +- 0.000274 seconds time elapsed  ( +-  0.65% )
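
For reference, the 2.64% and 1.09ms figures follow from the two "seconds time elapsed" means above. A small sketch of the arithmetic (Slowdown and slowdown are hypothetical names used only here):

```cpp
#include <cassert>

struct Slowdown {
    double delta_ms;  // absolute difference in milliseconds
    double percent;   // difference relative to the baseline
};

// Compares two mean elapsed times given in seconds.
inline Slowdown slowdown(double baseline_s, double patched_s) {
    double delta = patched_s - baseline_s;
    return { delta * 1000.0, delta / baseline_s * 100.0 };
}
```

Calling slowdown(0.041249, 0.042338) gives a delta of about 1.09 ms, which is about 2.64% of the baseline. Note that this compares the means only; each run reports a spread of roughly +- 0.7%.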


Comments
Parallel processing of archived heap objects will be addressed in JDK-8326035.
09-01-2025

It's possible to work around this by avoiding patching: we can generate a custom CDS archive with the exact heap object layout needed by the app.
12-01-2023

The performance numbers listed in the Description are misleading: much of the start-up time delta is due to the heap being larger, not due to CDS heap relocation. See the more accurate numbers that compare heaps of very similar sizes: https://bugs.openjdk.java.net/browse/JDK-8251330?focusedCommentId=14366742&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14366742
02-09-2020