JDK-8214356 : Verification of class metadata unloading takes a long time
  • Type: Bug
  • Component: hotspot
  • Sub-Component: runtime
  • Affected Version: 11,12
  • Priority: P3
  • Status: Resolved
  • Resolution: Fixed
  • Submitted: 2018-11-27
  • Updated: 2019-05-28
  • Resolved: 2018-11-29
The Version table provides details related to the release that this issue/RFE will be addressed.

Unresolved : Release in which this issue/RFE will be addressed.
Resolved: Release in which this issue/RFE has been resolved.
Fixed : Release in which this issue/RFE has been fixed. The release containing this fix may be available for download as an Early Access Release or a General Availability Release.

To download the current JDK release, click here.
JDK 12
12 b23Fixed
Related Reports
Relates :  
Description
During investigation of JDK-8211211 it has been found that the cause for the long test duration is that class unloading verification takes too long time.

In particular, with CMS where class unloading is delayed due to CMS precleaning, the test accumulates so many classes to unload that it easily exceeds the timeouts.

The relevant result is https://bugs.openjdk.java.net/browse/JDK-8211211?focusedCommentId=14225624&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14225624 (copied here for convenience):

The test spends 95%(!) of its runtime in ClassLoaderData::free_deallocate_list in some verification routine I think (this is a fastdebug build) 

Perf trace (mangled, but readable): 

      - 95.33% _ZN8VMThread4loopEv 
         - 95.31% _ZN20SafepointSynchronize5beginEv 
              _ZN20SafepointSynchronize16do_cleanup_tasksEv 
              _ZN20ClassLoaderDataGraph34walk_metadata_and_clean_metaspacesEv 
              _ZN20ClassLoaderDataGraph22clean_deallocate_listsEb 
            - _ZN15ClassLoaderData20free_deallocate_listEv 
               - 95.31% _ZN15ClassLoaderData20free_deallocate_listEv 
                  - 91.08% _ZN13InstanceKlass19deallocate_contentsEP15ClassLoaderData 
                     - 84.24% _ZN13InstanceKlass18deallocate_methodsEP15ClassLoaderDataP5ArrayIP6MethodE 
                        - _ZN13InstanceKlass18deallocate_methodsEP15ClassLoaderDataP5ArrayIP6MethodE 
                           - 84.22% _ZN20ClassLoaderMetaspace10deallocateEP8MetaWordmb 
                              - _ZN9metaspace12SpaceManager10deallocateEP8MetaWordm 
                                 - _ZN9metaspace13BlockFreelist12return_blockEP8MetaWordm 
                                 - _ZN9metaspace13BlockFreelist12return_blockEP8MetaWordm 
                                    - 84.21% _ZN20BinaryTreeDictionaryIN9metaspace9MetablockE8FreeListIS1_EE20insert_chunk_in_treeEPS1_ 
                                       - _ZN8TreeListIN9metaspace9MetablockE8FreeListIS1_EE20return_chunk_at_tailEP9TreeChunkIS1_S3_E 
                                            84.19% _ZN8TreeListIN9metaspace9MetablockE8FreeListIS1_EE20return_chunk_at_tailEP9TreeChunkIS1_S3_E 
                     - 6.22% _ZN20ClassLoaderMetaspace10deallocateEP8MetaWordmb 
                        - _ZN9metaspace12SpaceManager10deallocateEP8MetaWordm 
                           - _ZN9metaspace13BlockFreelist12return_blockEP8MetaWordm 
                           - _ZN9metaspace13BlockFreelist12return_blockEP8MetaWordm 
                              - 6.22% _ZN20BinaryTreeDictionaryIN9metaspace9MetablockE8FreeListIS1_EE20insert_chunk_in_treeEPS1_ 
                                 - _ZN8TreeListIN9metaspace9MetablockE8FreeListIS1_EE20return_chunk_at_tailEP9TreeChunkIS1_S3_E 
                                   _ZN8TreeListIN9metaspace9MetablockE8FreeListIS1_EE20return_chunk_at_tailEP9TreeChunkIS1_S3_E 
                     - 0.58% _ZN15ClassLoaderData13remove_handleE9OopHandle 
                          0.58% _ZN15ClassLoaderData13remove_handleE9OopHandle 
                  - 4.22% _ZN20ClassLoaderMetaspace10deallocateEP8MetaWordmb 
                     - _ZN9metaspace12SpaceManager10deallocateEP8MetaWordm 
                       _ZN9metaspace13BlockFreelist12return_blockEP8MetaWordm 
                     - _ZN9metaspace13BlockFreelist12return_blockEP8MetaWordm 
                        - _ZN20BinaryTreeDictionaryIN9metaspace9MetablockE8FreeListIS1_EE20insert_chunk_in_treeEPS1_ 
                           - _ZN8TreeListIN9metaspace9MetablockE8FreeListIS1_EE20return_chunk_at_tailEP9TreeChunkIS1_S3_E 
                             _ZN8TreeListIN9metaspace9MetablockE8FreeListIS1_EE20return_chunk_at_tailEP9TreeChunkIS1_S3_E 

Ultimately I think the problem is the loop in FreeList<Chunk>::verify_chunk_in_free_list(), because nop'ing that one out improves the test significantly (it does not "hang" at the first unloading phase any more and within the 30s actually manages a few concurrent cycles), but I do not know how important this check is for integrity purposes (I do not think it is that useful to verify *all* chunks every time you delete a single one by default). 

The specific problem with CMS seems to be related to precleaning - this takes very long time (~5s until it is aborted) in this case, so its concurrent cycle takes very long, as the test accumulates more and more classloader data to unload. Since the algorithm with verification is O(n^2) at least, and n is much larger than with other cases with CMS, the test takes a very long time. 

There seems to be another long verification phase at the end of the run that seems independent of gc.

JDK-8211211 hackfixed this by adding a VM option so that CMS avoids the precleaning. It would be nice to get these verification times down to reasonable levels as this also slows down other programs run in fastdebug mode that do a lot of class unloading.

One item to investigate could be looking whether it is required to verify all metaspace chunks every time there is a single class unloading.
Comments
This test is timed to run for 30 seconds and does before and after this change, but removing the verify_chunk_in_freelist call might help with sporadic timeouts. This test is a stress test that generates a lot of deletes for individual metadata objects in the ClassLoaderData->metaspace() in classes that are *not* unloaded. There's the CMS binaryTreeDictionary to hold inner metaspace free space. It doesn't make sense to verify that the metadata pointer isn't already in this free list for each delete call. This verification has never found a problem. I added a test for if (FLSVerifyDictionary) around the assert(!this->verify_chunk_in_free_list(chunk), "Double entry"); [~tschatzl] thank you for the analysis.
28-11-2018