The original nmethod entry barrier supported only concurrent patching of data and was used by ZGC to solve concurrent class unloading problems. It is now starting to see more uses. Notably, Loom uses nmethod entry barriers to figure out which nmethods have been seen on-stack, which is needed to decide when nmethods can be removed. However, the concurrent data patching variation was too slow for Loom, so I brought over a faster nmethod entry barrier that we use in the generational ZGC repo. That barrier handles concurrent patching of both data and instructions, which is needed there.
However, for the uses in loom, the classic GCs don't really patch anything interesting concurrently. This leads to the following possible enhancements:
1. Make a dedicated nmethod entry barrier for GCs that patch neither data nor code concurrently, consisting of essentially just a conditional branch.
2. Move the "guard" word and the call into the VM trampoline to an out-of-line stub, so that instruction caches are not polluted by cold instructions at the nmethod entry. Some machines also optimize the branch-not-taken path of conditional branches better.
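
The two enhancements above can be illustrated with a toy model in portable C++. This is a sketch only, not HotSpot code: `ToyNMethod`, `disarmed_value`, `arm_all`, `slow_path` and `entry_barrier` are hypothetical names, and it assumes that arming only happens while all threads are stopped, so a relaxed load plus one conditional branch is sufficient on the fast path. The real barrier is emitted as machine code by the compilers.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an nmethod: only the per-nmethod "guard" word
// and a counter that records how often the slow path ran.
struct ToyNMethod {
  std::atomic<int32_t> guard{0};
  int slow_path_calls = 0;
};

// Hypothetical global value that a guard must match to count as disarmed.
std::atomic<int32_t> disarmed_value{0};

// Arm every nmethod at once by changing the expected value.
// Assumption: called only while mutator threads are stopped.
void arm_all() {
  disarmed_value.fetch_add(1, std::memory_order_relaxed);
}

// Out-of-line slow path: models the stub that calls into the VM.
// Keeping it in a separate function keeps cold code off the entry path.
void slow_path(ToyNMethod& nm) {
  ++nm.slow_path_calls;
  // Disarm this nmethod so later entries take the fast path again.
  nm.guard.store(disarmed_value.load(std::memory_order_relaxed),
                 std::memory_order_relaxed);
}

// Fast path: one compare and one conditional branch, normally not taken.
inline void entry_barrier(ToyNMethod& nm) {
  assert(nm.slow_path_calls >= 0);
  if (nm.guard.load(std::memory_order_relaxed) !=
      disarmed_value.load(std::memory_order_relaxed)) {
    slow_path(nm);
  }
}
```

In this model the slow path runs at most once per nmethod after each `arm_all()`; every other entry pays only for the load, the compare and the untaken branch.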