JDK-8214542 : JFR: Old Object Sample event slow on a deep heap in debug builds
  • Type: Bug
  • Component: hotspot
  • Sub-Component: jfr
  • Affected Version: 12
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2018-11-30
  • Updated: 2020-04-29
  • Resolved: 2019-07-03
Fixed in:
JDK 11: 11.0.6
JDK 13: 13 b29
JDK 14: 14
Other: openjdk8u-jfr-incubator
Description
The test test\jdk\jdk\jfr\event\oldobject\TestHeapDeep.java tries to locate memory leaks in a small heap that consists of a linked list of 100,000 elements. The test passes, but it executes slowly, taking about 30 seconds instead of a few milliseconds.

This is a product bug. 
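For context, the shape of heap the test exercises can be reproduced with the standard jdk.jfr recording API roughly as follows. This is a minimal sketch, not the actual TestHeapDeep code: the Node and DeepChainSketch names and the exact event settings are assumptions.

    import java.nio.file.Path;
    import jdk.jfr.Recording;

    public class DeepChainSketch {
        // Each Node keeps the next element alive, forming one long reference chain.
        static final class Node {
            final Node next;
            Node(Node next) { this.next = next; }
        }

        // A static root keeps the whole chain reachable, so a sampled object near
        // the tail has a path of roughly 100,000 edges back to a GC root.
        static Node root;

        public static void main(String[] args) throws Exception {
            try (Recording r = new Recording()) {
                // Ask the leak profiler to keep full reference chains at dump time.
                r.enable("jdk.OldObjectSample").withoutStackTrace().with("cutoff", "infinity");
                r.start();

                Node head = null;
                for (int i = 0; i < 100_000; i++) {
                    head = new Node(head);
                }
                root = head;

                // Dumping triggers the path-to-GC-root search inside a safepoint.
                r.dump(Path.of("deep-chain.jfr"));
            }
        }
    }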
Comments
Replacing jdk8u-fix-request with link to JDK-8239140
17-02-2020

RFC: https://mail.openjdk.java.net/pipermail/jdk8u-dev/2020-January/011063.html
30-01-2020

Addition to the 11u fix request by [~hirt]: the patch did not apply cleanly. Review request sent out on the mailing list: https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2019-September/001811.html Testing shows no regressions. I will submit this item immediately followed by JDK-8228834, since the latter fixes a regression introduced by this fix. I will also include JDK-8229437 when pushing this, since it is a regression fix for this item as well.
03-09-2019

Fix request comment: This fix should be backported to make it viable to use the Old Object Sample event in production systems. It depends on fixing and backporting JDK-8225797, without which the event cannot be used in production anyway.
22-07-2019

I cannot see any algorithmic time-complexity issue in the code that would explain what you describe, except for some super-heavy assertions like this one:

    // bfsclosure.cpp
    // if we are processing initial root set, don't add to queue
    if (_current_parent != NULL) {
      assert(_current_parent->distance_to_root() == _current_frontier_level, "invariant");  // line 141
      _edge_queue->add(_current_parent, reference);
    }

There are memory-complexity issues that get worse in fastdebug / slowdebug runs, because the assertion above keeps a large working set alive for a longer period of time. This gets very bad for concurrent test runs (again fastdebug / slowdebug), where a -conc:16 setting leads to a memory footprint of almost 8 GB just from running the OldObject tests. I did not see this with product builds. Is it possible that a concurrent test run with fastdebug / slowdebug could have caused swapping due to the high memory footprint?

There are several improvements that would trim the working sets:

1. The BitSet used for marking is sized from Universe::heap()->reserved_region(), which is almost always 8 GB regardless of actual usage. This leads to a committed bitset of 127 MB, which is rather wasteful when the current occupancy is in the vicinity of 1-2 MB. I have elaborated on building a page-manager plug-in for the virtual-memory backing storage that only commits the pages that are needed. The downside is that we then need to check a bitset for the mapped page before we can check the bitset for the marked address. Then again, 127 MB might be acceptable since we are running exclusively inside a VM operation.

2. Currently we store the entire chain in the EdgeStore, but we only serialize the top 100 edges and the root 100 edges. I have updated the code so that we only store the edges that are actually needed. In addition, we do not need the intermediate Edge[] in the process of storing edges.

3. The BitSet uses par_set_bit, which uses cmpxchg() underneath. We do not need CAS since we are single-threaded (this is more of a performance optimization).
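As a rough illustration of why such an assertion is super-heavy on a deep chain: assuming distance_to_root() is computed by walking the parent chain (an assumption here, but it is what makes the check expensive), a per-edge invariant check turns a linear walk into a quadratic one in debug builds. The sketch below is illustrative Java, not the HotSpot code; Edge, distanceToRoot and visit are made-up names.

    public final class ChainWalkSketch {
        static final class Edge {
            final Edge parent;                     // null for a GC root
            Edge(Edge parent) { this.parent = parent; }

            // O(depth): counts the hops back to the root.
            int distanceToRoot() {
                int d = 0;
                for (Edge e = parent; e != null; e = e.parent) d++;
                return d;
            }
        }

        static void visit(Edge leaf) {
            int expected = leaf.distanceToRoot();
            for (Edge e = leaf; e != null; e = e.parent, expected--) {
                // Analogue of the debug-only invariant in bfsclosure.cpp: with -ea
                // the check re-walks the chain on every step, so a 100,000-element
                // list costs on the order of n^2/2 = 5 * 10^9 hops instead of
                // n = 10^5; without -ea (cf. product builds) the walk stays linear.
                assert e.distanceToRoot() == expected : "invariant";
            }
        }

        public static void main(String[] args) {
            Edge e = new Edge(null);
            for (int i = 0; i < 100_000; i++) {
                e = new Edge(e);
            }
            long t0 = System.nanoTime();
            visit(e);
            System.out.printf("walk took %.1f ms%n", (System.nanoTime() - t0) / 1e6);
        }
    }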
17-12-2018

Impact: High. This is a scalability issue: the time spent searching for a chain of length n with k samples is O(n*k) instead of O(n), which means the algorithm can be up to 256 times as slow. In the test, with a rather small heap, that meant 15 s spent in a safepoint. This is not acceptable in a production environment.
Likelihood: Low/Medium. How big the problem is depends on the heap size and the length of the longest chain, so it is hard to say anything conclusive about the likelihood; it depends on the application. This only happens if you request path to GC root.
Workaround: High. There is no known workaround.
ILW = H (L/M) H => P1/P2
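For scale, using the numbers above: with k = 256 samples and the 100,000-element chain from the test, O(n*k) means on the order of 25.6 million edge visits in a single safepoint pause, versus roughly 100,000 for an O(n) search, which is where the factor of 256 comes from.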
13-12-2018