JDK-8251570 : JDK-8215624 causes assert(worker_id < _n_workers) failed: Invalid worker_id
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 16
  • Priority: P1
  • Status: Closed
  • Resolution: Fixed
  • Submitted: 2020-08-14
  • Updated: 2021-09-08
  • Resolved: 2020-08-17
JDK 11: 11.0.14 (Fixed)
JDK 16: 16 b12 (Fixed)
Related Reports
Relates :  
Description
'JDK-8215624: Add parallel heap iteration for jmap -histo' implemented a way to parallelize heap inspection. There is a mismatch between the number of parallel heap inspection threads requested and the number of threads actually spawned.

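The heap region claimer is initialized with the value requested by the heap inspection system, while the task spawning mechanism uses the current number of active GC threads. When more threads are spawned than were requested, a worker_id at or above the claimer's worker count trips the assert: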

#  assert(worker_id < _n_workers) failed: Invalid worker_id.

V  [libjvm.so+0xcb85dc]  HeapRegionClaimer::offset_for_worker(unsigned int) const+0x4c
V  [libjvm.so+0xaeabfb]  G1ParallelObjectIterator::object_iterate(ObjectClosure*, unsigned int)+0x3b
V  [libjvm.so+0xc9f66e]  ParHeapInspectTask::work(unsigned int)+0x7e
V  [libjvm.so+0x187cfd4]  GangWorker::run_task(WorkData)+0x84
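
The mismatch described above can be modeled outside the JVM. The following is a minimal sketch, not HotSpot code: ToyRegionClaimer and invalid_worker_ids are hypothetical names, and the offset formula is an assumption made for illustration. It shows only that a claimer sized for the requested worker count rejects ids handed out by a larger spawned gang.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a region claimer (not the HotSpot class).
// It is sized once, for a fixed number of workers.
class ToyRegionClaimer {
 public:
  ToyRegionClaimer(unsigned n_workers, unsigned n_regions)
      : _n_workers(n_workers), _n_regions(n_regions) {}

  // The condition checked by the failing assert in this report.
  bool is_valid_worker(unsigned worker_id) const {
    return worker_id < _n_workers;
  }

  // Starting region for a worker. The formula is an assumption for this
  // sketch; the assert mirrors the one in the crash output.
  unsigned offset_for_worker(unsigned worker_id) const {
    assert(worker_id < _n_workers && "Invalid worker_id");
    return _n_regions * worker_id / _n_workers;
  }

 private:
  unsigned _n_workers;
  unsigned _n_regions;
};

// Returns the worker ids that would trip the assert when the claimer is
// sized for `requested` workers but `spawned` workers actually run.
std::vector<unsigned> invalid_worker_ids(unsigned requested,
                                         unsigned spawned,
                                         unsigned n_regions) {
  ToyRegionClaimer claimer(requested, n_regions);
  std::vector<unsigned> bad;
  for (unsigned id = 0; id < spawned; ++id) {
    if (!claimer.is_valid_worker(id)) {
      bad.push_back(id);
    }
  }
  return bad;
}
```

With 4 requested and 6 spawned workers, invalid_worker_ids(4, 6, 64) reports ids 4 and 5; with matching counts it reports none.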

Comments
Fix Request (11u)
Please approve backporting this to OpenJDK 11u. The PR is https://github.com/openjdk/jdk11u-dev/pull/284. This fix is required because we are backporting https://bugs.openjdk.java.net/browse/JDK-8215624. The risk is low, as most of the code is the same as in jdk-master.
Thanks, Lin
08-09-2021

URL: https://hg.openjdk.java.net/jdk/jdk/rev/dd827a012e43
User: stefank
Date: 2020-08-17 09:33:15 +0000
17-08-2020

Initial patch was incomplete and didn't deal with all cases.
See review thread: https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-August/030560.html
Updated webrev: https://cr.openjdk.java.net/~stefank/8251570/webrev.02/
14-08-2020

Bumping from P4 -> P3. This failure mode is reproducing once per Tier4 job set, ~10 times per Tier5 job set and ~6 times per Tier6 job set. Update: ~6 times in a Tier7 job set.
14-08-2020

Spotted in the jdk-16+12-434-tier5 CI job set: https://mach5.us.oracle.com/mdash/jobs/mach5-one-jdk-16+12-434-tier5-20200814-1100-13437418/results?search=status%3Afailed%20AND%20-state%3Ainvalid
Multiple failures of the following tests:
  • sun/tools/jmap/BasicJMapTest.java
  • java/util/logging/TestLoggerWeakRefLeak.java
14-08-2020

https://cr.openjdk.java.net/~stefank/8251570/webrev.01
14-08-2020

Hi Stefan, Sounds great, please add me to the CC list when you post a webrev, so I can learn from it. Thanks! Cheers, Lin
14-08-2020

Hi [~lzang], I'm in the process of testing a patch that does something similar to what you describe. However, when looking at this I realized that we use the wrong workers for ZGC and Shenandoah. The current patch I'm working on rips out the GC-specific run_task functions and instead provides a run_task_at_safepoint function that uses the "safepoint workers", which should be used when non-GC subsystems want to run tasks in parallel.

    void CollectedHeap::run_task_at_safepoint(AbstractGangTask* task, uint num_workers) {
      WorkGang* gang = get_safepoint_workers();
      if (gang == NULL) {
        // GC doesn't support parallel worker threads.
        // Execute in this thread with worker id 0.
        task->work(0);
      } else {
        gang->run_task(task, num_workers);
      }
    }

I'll post a webrev soon.
14-08-2020
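
The dispatch logic quoted in the comment above can be exercised in isolation. This is a hedged sketch using hypothetical stand-ins (ToyTask, ToyGang, RecordingTask) rather than the real AbstractGangTask and WorkGang types, and the toy gang runs worker ids sequentially instead of on separate threads. It illustrates only two behaviors: the caller chooses the worker count explicitly, and a missing gang falls back to a single call with worker id 0.

```cpp
#include <cassert>
#include <vector>

// Hypothetical task interface standing in for AbstractGangTask.
struct ToyTask {
  virtual ~ToyTask() = default;
  virtual void work(unsigned worker_id) = 0;
};

// Hypothetical gang standing in for WorkGang. For simplicity it invokes
// the task once per worker id on the calling thread.
struct ToyGang {
  void run_task(ToyTask* task, unsigned num_workers) {
    assert(task != nullptr);
    for (unsigned id = 0; id < num_workers; ++id) {
      task->work(id);
    }
  }
};

// Same shape as the function in the comment: the worker count is passed
// in by the caller instead of being read from the gang's active workers.
void run_task_at_safepoint(ToyGang* gang, ToyTask* task,
                           unsigned num_workers) {
  if (gang == nullptr) {
    // No parallel workers available: run once with worker id 0.
    task->work(0);
  } else {
    gang->run_task(task, num_workers);
  }
}

// Records which worker ids were handed to the task.
struct RecordingTask : ToyTask {
  std::vector<unsigned> seen;
  void work(unsigned worker_id) override { seen.push_back(worker_id); }
};
```

Because the same num_workers value can be used both to size the claimer and to run the task, the ids a task sees stay below the count the claimer was built for.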

Hi Stefan, I did a quick investigation. It seems one possible reason is that G1CollectedHeap::run_task(AbstractGangTask* task) uses a fixed "workers()->active_workers()", which can be equal to ParallelGCThreads, while the parallel heap inspection thread count is set separately. One solution may be to change run_task(AbstractGangTask* task) to run_task(AbstractGangTask* task, uint workers) and pass the heap iteration parallel thread count to it when doing heap inspection. Do you think that is reasonable? P.S. run_task() is currently used only for parallel heap inspection, and I can provide a fix ASAP if it is OK. BRs, Lin
14-08-2020

Reproducible by bumping up the number of parallel GC threads:

    make -C ../build/fastdebug test TEST="java/util/logging/TestLoggerWeakRefLeak.java" JTREG="JAVA_OPTIONS=-XX:ParallelGCThreads=100"
14-08-2020

Ping [~lzang] [~phh] [~sspitsyn]. The jmap changeset had a problem. I'll try to deal with it ASAP.
14-08-2020