Users frequently complain about heap dumping performance. Modern JDKs already implement parallel heap dumps from diagnostic commands, which helps. But there are also single-threaded improvements that we can easily make.
If you profile the current heap dumping code, you will see that walking the class data looking for fields takes the overwhelming majority of execution time. It got much worse in JDK 21 with JDK-8292818, after which looking up field data involves re-parsing the class metadata stream every time. See `jdk21-heapdump-png` for a sample profile of the sample reproducer (`HeapDump.java`) on JDK 21.
Since heap dumping code runs sporadically, we can cache some class metadata that heap dumping code needs on hot paths.
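
To illustrate the idea, here is a minimal, self-contained sketch of a per-dump field cache. It is not HotSpot code: `FakeKlass`, `FieldEntry`, `DumpFieldCache`, and `dump_instances` are hypothetical stand-ins, and the "field stream" is a toy byte encoding rather than the real metadata format. The only point is that the expensive decode happens once per class instead of once per dumped object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for what the dumper needs per field.
struct FieldEntry {
  char     type;    // e.g. 'I' for int, 'L' for a reference
  uint32_t offset;  // byte offset of the field within the instance
};

// Hypothetical stand-in for class metadata with an encoded field stream.
struct FakeKlass {
  std::vector<uint8_t> field_stream;  // toy encoding: (type, offset) byte pairs
  mutable size_t parse_count = 0;     // counts decodes, for illustration only

  // Simulates the expensive decode of the field stream.
  std::vector<FieldEntry> parse_fields() const {
    ++parse_count;
    std::vector<FieldEntry> out;
    for (size_t i = 0; i + 1 < field_stream.size(); i += 2) {
      out.push_back(FieldEntry{static_cast<char>(field_stream[i]),
                               field_stream[i + 1]});
    }
    return out;
  }
};

// Cache intended to live only for the duration of a single heap dump.
class DumpFieldCache {
 public:
  // Returns the decoded fields for k, decoding them on first use only.
  const std::vector<FieldEntry>& fields_for(const FakeKlass* k) {
    auto it = _cache.find(k);
    if (it == _cache.end()) {
      it = _cache.emplace(k, k->parse_fields()).first;
    }
    return it->second;
  }

 private:
  std::unordered_map<const FakeKlass*, std::vector<FieldEntry>> _cache;
};

// "Dumps" the given number of instances of one class and returns the
// number of field records visited.
size_t dump_instances(DumpFieldCache& cache, const FakeKlass& k,
                      size_t instances) {
  size_t visited = 0;
  for (size_t i = 0; i < instances; i++) {
    visited += cache.fields_for(&k).size();
  }
  return visited;
}
```

Because the cache object is created for one dump and destroyed afterwards, it costs no memory between dumps, which fits code that runs only sporadically. In this sketch, dumping many instances of the same class decodes its field stream exactly once.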