JDK-8225797 : OldObjectSample event creates unexpected amount of checkpoint data
  • Type: Enhancement
  • Component: hotspot
  • Sub-Component: jfr
  • Affected Version: 11,12,13
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2019-06-14
  • Updated: 2022-02-10
  • Resolved: 2019-09-14
Release   Version   Status
JDK 11    11.0.7    Fixed
JDK 13    13.0.4    Fixed
JDK 14    14 b15    Fixed
Description
Recordings in which stackTrace is enabled for OldObjectSample events create an unexpectedly large amount of data.

For example, a recording with OldObjectSample disabled is 2 MB, while one with the event enabled and stackTrace=true grows to 10 MB. Most likely excessive (or redundant) information is being written. 'jfr summary' reveals that the space is taken up by the checkpoint event.

The parser log shows a pattern in which a large checkpoint occurs multiple times. See the attached log.txt.

[0.699s][trace][jfr,system,parser] New constant pool: startPosition=12945672, size=370710, deltaToNext=-11695, flush=false, poolCount=6
[0.699s][trace][jfr,system,parser] Constant: java.lang.Class[1613]
[0.700s][trace][jfr,system,parser] Constant: jdk.types.Package[281]
[0.700s][trace][jfr,system,parser] Constant: jdk.types.Module[7]
[0.700s][trace][jfr,system,parser] Constant: jdk.types.ClassLoader[13]
[0.700s][trace][jfr,system,parser] Constant: jdk.types.Method[3775]
[0.701s][trace][jfr,system,parser] Constant: jdk.types.Symbol[5470]
...

Each large checkpoint is about 400 kB.
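
To check how often the large checkpoints recur, the "New constant pool" lines of the trace log can be tallied. The following is a minimal sketch, assuming the log format shown in the excerpt above; the function names and the 100 kB threshold for a "large" checkpoint are hypothetical choices made for illustration.

```python
import re

# Matches "New constant pool" trace lines in the format shown in the log excerpt.
POOL_LINE = re.compile(r"New constant pool: startPosition=(\d+), size=(\d+)")


def checkpoint_sizes(log_lines):
    """Return the size in bytes of every constant pool found in the log lines."""
    sizes = []
    for line in log_lines:
        match = POOL_LINE.search(line)
        if match:
            sizes.append(int(match.group(2)))
    return sizes


def summarize(log_lines, large_threshold=100_000):
    """Count all constant pools and those at or above large_threshold bytes."""
    sizes = checkpoint_sizes(log_lines)
    large = [s for s in sizes if s >= large_threshold]
    return {
        "checkpoints": len(sizes),
        "large": len(large),
        "total_bytes": sum(sizes),
        "large_bytes": sum(large),
    }


# Two lines copied from the log excerpt in this report.
sample = [
    "[0.699s][trace][jfr,system,parser] New constant pool: startPosition=12945672, "
    "size=370710, deltaToNext=-11695, flush=false, poolCount=6",
    "[0.699s][trace][jfr,system,parser] Constant: java.lang.Class[1613]",
]
print(summarize(sample))
```

For the full log, the same function can be fed an open file, for example summarize(open("log.txt")). If "large_bytes" accounts for most of "total_bytes", the repeated large checkpoints explain the growth in recording size.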

The log can be reproduced like this:

$ jfr -J-Xlog:jfr+system+parser=trace print --events Dummy recording.jfr

A similar problem may exist when reference chains are recorded (cutoff > 0).
Comments
Fix request (13u): I would like to backport this fix to 13u. The change applies almost cleanly; see the discussion in the thread https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2020-May/003197.html. Tested with the jdk/jfr and tier1 tests.
29-05-2020

Fix Request

Without this fix the OldObjectSample event is not usable in production, because checkpoint data is only ever added and the recording size therefore keeps growing. For memory leak detection, which is the main use case for this event, the process needs to be observed over a longer time span, and that triggers the problem with the recording size.

The risk is low to moderate: while the fix is rather complex, it affects only JFR.

The patch does not apply cleanly and needs a number of prerequisite backports. The prerequisite issues and the public review link are listed in the following block:

# Prerequisites (listed in the order they should be applied)
* https://bugs.openjdk.java.net/browse/JDK-8209850
* https://bugs.openjdk.java.net/browse/JDK-8209976
* https://bugs.openjdk.java.net/browse/JDK-8210024
* https://bugs.openjdk.java.net/browse/JDK-8214850
* https://bugs.openjdk.java.net/browse/JDK-8209802

# Public review
https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2019-November/002115.html

The review thread also deals with all the necessary prerequisites. The fully applied fix was tested using the jdk_tier1, jdk_tier2, jdk_core and jdk_jfr test sets. All tests are passing.
16-11-2019

URL: https://hg.openjdk.java.net/jdk/jdk/rev/caa25ab47aca
User: mgronlun
Date: 2019-09-14 12:45:55 +0000
14-09-2019

The problem can be quantified by checking the size of the "jdk.CheckPoint" event:

$ jfr summary <recording-chunk/file>

This may be more accurate than the file size.
04-09-2019
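
The check suggested in the comment above can be scripted. This is a rough sketch, assuming that each table row printed by 'jfr summary' consists of an event type name, a count, and a size in bytes separated by whitespace. The function names are hypothetical, and the numbers in the sample text are made up for illustration rather than taken from a real recording.

```python
def event_sizes(summary_text):
    """Map event type name to (count, size_in_bytes) from 'jfr summary' text."""
    rows = {}
    for line in summary_text.splitlines():
        parts = line.split()
        # Assumed row layout: <event type> <count> <size in bytes>.
        # Header and separator lines do not match this shape and are skipped.
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            rows[parts[0]] = (int(parts[1]), int(parts[2]))
    return rows


def checkpoint_share(summary_text, event="jdk.CheckPoint"):
    """Return the fraction of total event bytes used by the given event type."""
    rows = event_sizes(summary_text)
    total = sum(size for _, size in rows.values())
    if event not in rows or total == 0:
        return None
    return rows[event][1] / total


# Made-up sample in the assumed layout, not output from a real recording.
sample_summary = """
 Event Type                Count  Size (bytes)
===============================================
 jdk.CheckPoint               25       8000000
 jdk.OldObjectSample         256       2000000
"""
print(checkpoint_share(sample_summary))
```

In practice the text would come from running 'jfr summary' on the recording, for example by capturing the command's standard output with subprocess.run. Comparing the share between a recording with OldObjectSample disabled and one with stackTrace=true gives a measure that does not depend on the file size.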

An increase of 3 MB for the Old Object Sample event sounds like more than I would expect. There is probably still a lot of redundancy, but it may be tricky to fix. That said, allocation sampling is an opt-in profiling feature, so it might be OK.
27-08-2019

Someone from Datadog will be looking at this within the next two weeks.
19-08-2019