When encountering call sites that allocate memory, NMT records the call stack together with other information (counters, memory flags) in a record (class `AllocationSite`). It then stores these records in a hash table (`MallocSiteTable`).
Then, when generating reports, NMT creates one or two baselines, which are essentially snapshots of the state of these records. These baselines contain copies of the original records, call stack and all. The records may also need to be sorted, which is done by adding them to a temporary `SortedLinkedList`, again by value.
This makes generating reports somewhat expensive. Not massively so, but enough to impose an artificial threshold on the number of records to baseline. That threshold introduces subtle reporting errors, currently under discussion here [1][2]. In that bug I argue for the removal of the threshold for simplicity and correctness reasons.
Since the threshold exists because of the memory footprint of baselining, let's reduce the cost of baselining.
The call stacks we keep in the records are semantically immortal and of course immutable. When tracing is active we keep those call stacks forever. Separating the `class AllocationSite` into two classes, one containing the immutable, immortal parts and the other containing the mutable parts (counters), would reduce the size of the copied records by a lot. Let's keep the immutable call stacks only once.
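To make the size argument concrete, here is a minimal sketch of the split. It is not HotSpot code: `CallStack`, `CombinedSite`, `SiteCounters`, `snapshot` and the frame depth of 4 are all hypothetical stand-ins chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a captured native call stack (not HotSpot's type).
constexpr int kMaxFrames = 4;  // assumed depth, for illustration only
struct CallStack {
  const void* frames[kMaxFrames];
};

// Simplified picture of the current layout: stack and counters in one record,
// so every by-value copy for a baseline also copies the frames.
struct CombinedSite {
  CallStack    stack;
  std::size_t  count;
  std::size_t  bytes;
  std::uint8_t flags;
};

// Sketch of the proposed layout: the mutable part only points at the
// immutable call stack, which is stored once and shared.
struct SiteCounters {
  const CallStack* stack;  // not owned; assumed to outlive every snapshot
  std::size_t      count;
  std::size_t      bytes;
  std::uint8_t     flags;
};

static_assert(sizeof(SiteCounters) < sizeof(CombinedSite),
              "the split record should be cheaper to copy");

// Taking a snapshot copies the counters only; the stack is shared by pointer.
inline SiteCounters snapshot(const SiteCounters& live) {
  assert(live.stack != nullptr);
  return live;
}
```

With this shape, a baseline or a temporary sorted list holds only the small mutable records, while each call stack exists once no matter how many snapshots refer to it.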
Unfortunately, there is a technical reason why call stacks may be deleted: in the rare case that multiple threads simultaneously try to add the same call stack, only one of them wins. In that case, the loser needs to delete its call stack again. So the immutable part would best be refcounted from the mutable part.
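The race and the refcounting could look roughly like the following sketch. Everything here is hypothetical (`SharedStack`, `publish`, `acquire`, `release`, the single `slot` standing in for a hash bucket, and the `g_live_stacks` counter that exists only to make deletions observable). It assumes both candidates represent the same call stack and that a published stack is never removed from the table while in use.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> g_live_stacks{0};  // illustration only: counts live objects

// Hypothetical refcounted, immutable call stack.
struct SharedStack {
  std::atomic<int> refs{1};         // the creator holds the first reference
  const void* frames[4] = {};       // assumed depth, for illustration only
  SharedStack()  { g_live_stacks.fetch_add(1); }
  ~SharedStack() { g_live_stacks.fetch_sub(1); }
};

// A mutable record takes a reference on the stack it points to.
inline void acquire(SharedStack* s) {
  assert(s != nullptr);
  s->refs.fetch_add(1, std::memory_order_relaxed);
}

// Dropping the last reference deletes the stack.
inline void release(SharedStack* s) {
  assert(s != nullptr);
  if (s->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) {
    delete s;
  }
}

// Try to publish `candidate` in `slot`. The winner's initial reference is
// handed to the slot; a loser drops its duplicate and uses the winner's stack.
inline SharedStack* publish(std::atomic<SharedStack*>& slot,
                            SharedStack* candidate) {
  SharedStack* expected = nullptr;
  if (slot.compare_exchange_strong(expected, candidate)) {
    return candidate;               // we won the race
  }
  release(candidate);               // we lost: our copy is deleted here
  return expected;                  // the stack installed by the winner
}
```

The loser's stack is the only one ever deleted during normal operation; the winner's stack stays alive as long as the table or any mutable record holds a reference to it.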
[1] https://bugs.openjdk.java.net/browse/JDK-8261238
[2] https://github.com/openjdk/jdk/pull/2428