JDK-8173988 : Unsafe usage of ClassLoaderData::_handles with CMS
  • Type: Bug
  • Component: hotspot
  • Sub-Component: gc
  • Affected Version: 9
  • Priority: P3
  • Status: Closed
  • Resolution: Duplicate
  • Submitted: 2017-02-06
  • Updated: 2017-10-06
  • Resolved: 2017-10-06
JDK 10 : 10 (Resolved)
Related Reports
Duplicate :  
Relates :  
Description
ClassLoaderData::_handles was introduced to keep track of the resolved references arrays for the constant pools. The objects were created and immediately registered in the _handles area without any intervening safepoints.

From ConstantPool::initialize_resolved_references:
    objArrayOop stom = oopFactory::new_objArray(SystemDictionary::Object_klass(), map_length, CHECK);
    Handle refs_handle (THREAD, (oop)stom);  // must handleize.
    set_resolved_references(loader_data->add_handle(refs_handle));

There's a comment in CMSCollector::do_remark_non_parallel that describes this:
  // We might have added oops to ClassLoaderData::_handles during the
  // concurrent marking phase. These oops point to newly allocated objects
  // that are guaranteed to be kept alive. Either by the direct allocation
  // code, or when the young collector processes the roots. Hence,
  // we don't have to revisit the _handles block during the remark phase. 

However, the CLD::_handles area has been reused to store other oops that might not adhere to the requirement described in the comment above.

Two code paths that register oops in CLD::_handles are the Modules code and CDS, when protection domain "handles" are created.

Two ways to fix this:

1) Make sure that the object is available in a root from the time it gets created until it has been added to the CLD::_handles list. 

The code might already do this, but it needs to be checked / enforced.

2) Do more work in the CMS remark pause and visit all non-"resolved reference array" oops.

There's currently no separation between the "resolved reference array" oops and the other oops in CLD::_handles, so the code would probably have to visit all oops, unless they were separated somehow.
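
As a rough illustration of the separation idea in option 2, here is a minimal toy model. It is not HotSpot code: ToyObject, HandleKind, ToyHandleArea, remark_visit and visit_all are hypothetical names invented for this sketch, and marking is reduced to setting a flag. It only shows how tagging each registered oop by kind would let a remark pause skip the resolved references arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a heap object; not a HotSpot oop.
struct ToyObject { bool marked = false; };

// Hypothetical tag recording why an oop was registered.
enum class HandleKind { ResolvedReferences, Other };

// Hypothetical handle area that keeps the two kinds of oops apart.
class ToyHandleArea {
 public:
  void add_handle(ToyObject* obj, HandleKind kind) {
    assert(obj != nullptr);
    if (kind == HandleKind::ResolvedReferences) {
      _resolved_refs.push_back(obj);
    } else {
      _other.push_back(obj);
    }
  }

  // Remark-time visit: only the oops that do not carry the
  // "newly allocated and kept alive" guarantee. Returns slots visited.
  template <typename Visitor>
  std::size_t remark_visit(Visitor visit) const {
    for (ToyObject* obj : _other) visit(obj);
    return _other.size();
  }

  // What an unseparated area would have to do: visit every slot.
  template <typename Visitor>
  std::size_t visit_all(Visitor visit) const {
    for (ToyObject* obj : _resolved_refs) visit(obj);
    for (ToyObject* obj : _other) visit(obj);
    return _resolved_refs.size() + _other.size();
  }

 private:
  std::vector<ToyObject*> _resolved_refs;
  std::vector<ToyObject*> _other;
};

// Toy marking closure.
inline void mark(ToyObject* obj) { obj->marked = true; }
```

In this model the cost of the remark visit scales with the number of module and protection domain oops rather than with the number of constant pools, which is the motivation for separating them.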

Comments
This was fixed as a side effect of the bug I marked as duplicate. I should have checked in with this bug synopsis so it would have gotten closed as resolved.
06-10-2017

The objects are roots until they are inserted into the CLD::_handles block, but there may be a race such that:
1) CMS scans the _handles block.
2) The mutator has the module oop in a root (Handle), inserts it into the _handles block, and returns to Java, where the module may become unreferenced (but its pointer is in a live mirror associated with the CLD, so it'll stay referenced while the CLD is live).
3) A CMS safepoint occurs that does not rewalk the CLD::_handles block.

The protection domain handle in the ModuleEntry, whose oop is stored in the CLD::_handles block, may have this issue, although it is also kept live if the klass for the mirror that has a pointer to it is live in the CLD. I don't know if there's an issue here. If there is, I don't see any way other than having CMS rescan CLD::_handles during the safepoint. I'm going to assign this back to GC to decide what to do with it.
09-08-2017
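
The interleaving in the comment above can be modeled with a small toy marker. This is only a sketch under strong assumptions: ToyVM, publish_and_release and survives_cycle are hypothetical names, marking is a flag, and the model deliberately ignores the young collector, allocation-time marking, and the live mirror that the comment says keeps the module referenced. It shows the shape of the race, not whether HotSpot is actually affected.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-in for a heap object; not a HotSpot oop.
struct ToyObject { bool marked = false; };

struct ToyVM {
  std::vector<ToyObject*> thread_roots;  // stands in for Handles / jobject handles
  std::vector<ToyObject*> handle_block;  // stands in for CLD::_handles

  static void mark_all(const std::vector<ToyObject*>& slots) {
    for (ToyObject* obj : slots) obj->marked = true;
  }

  // Concurrent phase: in this toy, both areas are scanned exactly once.
  void concurrent_mark() {
    mark_all(thread_roots);
    mark_all(handle_block);
  }

  // Mutator step: the object is held in a root, published to the
  // handle block, and then the root is released.
  void publish_and_release(ToyObject* obj) {
    thread_roots.push_back(obj);
    handle_block.push_back(obj);
    thread_roots.erase(
        std::remove(thread_roots.begin(), thread_roots.end(), obj),
        thread_roots.end());
  }

  // Remark pause: thread roots are always rescanned; the handle
  // block is rescanned only if requested.
  void remark(bool rescan_handle_block) {
    mark_all(thread_roots);
    if (rescan_handle_block) mark_all(handle_block);
  }
};

// Runs the interleaving and reports whether the object ended up marked.
inline bool survives_cycle(bool rescan_handle_block) {
  ToyVM vm;
  ToyObject module_oop;
  vm.concurrent_mark();                 // scan happens before publication
  vm.publish_and_release(&module_oop);  // mutator publishes, drops root
  vm.remark(rescan_handle_block);
  return module_oop.marked;
}
```

In this model the object is left unmarked when the remark pause skips the handle block and is marked when the block is rescanned, which matches the suggestion of having CMS rescan CLD::_handles during the safepoint.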

The important part is that they are roots all the way from the creation until the insert into the CLD::_handles block. If there is even a short time when there's no root pointing to the object, and the object is moved around in the object graph on the heap, then CMS marking might not find this object during the concurrent marking phase.
08-08-2017

Both module oops and cached protection domain oops are in Handles or jobject handles from the time they become known to the VM (through define_module), otherwise they'd be like any other unhandled oops. So they're always roots of some sort before the entry in _handles is created. I'm not sure what the creation requirement is, but they must have been roots since creation or they'd also cause a crash (?). A protection domain oop comes in as the result of a call to the JVM, is handled, and then the entry in CLD::_handles is created. Why would CMS not mark it? It must be a root also (?)
02-08-2017

Thanks for writing this up. We are currently visiting the API which stores the module oop in the ModuleEntry. Kim and I have found other problems with it on Thursday. I'll describe them here later.
06-02-2017