The PhaseIterGVN constructor always allocates its _stack structure with an array of unique/2 elements:
PhaseIterGVN::PhaseIterGVN( PhaseGVN *gvn ) : PhaseGVN(gvn),
_worklist(*C->for_igvn()),
_stack(C->unique() >> 1),
_delay_transform(false)
Before incremental inlining was implemented, _stack was allocated only once, which was fine. Now that constructor is also used in the incremental inlining code:
igvn = PhaseIterGVN(gvn);
So each time we execute that assignment (once for each inlined call site) we allocate a new _stack. It is not an actual memory leak, since the memory is in the thread resource area, which is freed after compilation. But what I observed is that with a lot of call sites (compilation of big Nashorn methods) the memory consumption becomes very noticeable, more than 400 MB (JDK-8129847):
[0x00007fa294497cb6] Arena::grow(unsigned long, AllocFailStrategy::AllocFailEnum)+0x46
[0x00007fa294b1681d] PhaseIterGVN::PhaseIterGVN(PhaseGVN*)+0x39d
[0x00007fa29465143d] Compile::inline_incrementally_one(PhaseIterGVN&)+0x13d
[0x00007fa294654beb] Compile::inline_incrementally(PhaseIterGVN&)+0x34b
(malloc=426040KB #325)
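A minimal sketch of why these allocations add up, assuming a toy arena that only releases memory when it is destroyed. CountingArena, bytes_after_inlining and the elem_size parameter are hypothetical names for illustration, not HotSpot code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy arena: every request gets fresh storage, and nothing is released
// until the arena itself is destroyed. Hypothetical stand-in for the
// thread resource area, not HotSpot's Arena class.
class CountingArena {
public:
  void* allocate(std::size_t bytes) {
    assert(bytes > 0);
    _chunks.emplace_back(bytes);   // never reused by later requests
    _used += bytes;
    return _chunks.back().data();
  }
  std::size_t used() const { return _used; }
private:
  std::vector<std::vector<unsigned char>> _chunks;
  std::size_t _used = 0;
};

// Mimics constructing a PhaseIterGVN-like object once per inlined call
// site: each construction allocates unique/2 stack elements.
// elem_size is an assumed element size, passed in by the caller.
std::size_t bytes_after_inlining(CountingArena& arena, std::size_t unique,
                                 std::size_t call_sites, std::size_t elem_size) {
  for (std::size_t i = 0; i < call_sites; ++i) {
    // In the real compiler the node count grows as inlining adds nodes;
    // it is kept constant here for simplicity.
    arena.allocate((unique >> 1) * elem_size);
  }
  return arena.used();
}
```

The total is call_sites * (unique/2) * elem_size, so it grows with both the size of the graph and the number of inlined call sites, and none of it is returned until the arena goes away.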
The _stack is used only in remove_globally_dead_node() to avoid recursion. I cached it in the PhaseIterGVN class to avoid allocating it on every call to that method, which is called very frequently. That worked fine before incremental inlining was implemented.
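As an illustration of the explicit-stack technique (not the actual remove_globally_dead_node() code), here is a minimal sketch. ToyNode and remove_dead_iteratively are hypothetical names, and the node model is heavily simplified:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified graph node; not HotSpot's Node class.
struct ToyNode {
  std::vector<ToyNode*> inputs;  // nodes this node uses
  int use_count = 0;             // number of nodes that use this node
  bool dead = false;
};

// Removes `root` and every input that becomes unused as a result, using an
// explicit stack instead of recursion. The stack is passed in so that its
// storage can be cached and reused across calls.
// Returns the number of nodes removed.
std::size_t remove_dead_iteratively(ToyNode* root, std::vector<ToyNode*>& stack) {
  assert(root != nullptr);
  std::size_t removed = 0;
  stack.clear();                 // keeps capacity, drops old contents
  stack.push_back(root);
  while (!stack.empty()) {
    ToyNode* n = stack.back();
    stack.pop_back();
    if (n->dead) continue;       // already processed
    n->dead = true;
    ++removed;
    for (ToyNode* in : n->inputs) {
      if (in != nullptr && --in->use_count == 0) {
        stack.push_back(in);     // input lost its last user, so it is dead too
      }
    }
    n->inputs.clear();
  }
  return removed;
}
```

Passing the stack in by reference mirrors the idea of caching it: the caller owns one stack and reuses it instead of allocating a new one on every call.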
There are a few solutions I can think of.
One is to create a new PhaseIterGVN constructor for use in the incremental inlining code, which copies _stack from the original igvn.
Another is to move _stack to the Compile class. The problem with that is that we don't know what its size should be; we could go with 32, as I did. It is not used until the first IGVN optimization, so we could just as well create a new stack and copy it to Compile::_stack once we know the number of live nodes.
Or simply do the following and accept a small leak without modifying much code:
PhaseIterGVN::PhaseIterGVN( PhaseGVN *gvn ) : PhaseGVN(gvn),
_worklist(*C->for_igvn()),
- _stack(C->unique() >> 1),
+ _stack(C->comp_arena(), 32),
_delay_transform(false)
Another problem is that _stack uses the thread-local resource area by default. I hit a problem when I set a small initial size, I think because there is a ResourceMark somewhere that unwinds the thread resource area after _stack has grown into that area. That is why I used the compiler arena above, which a default ResourceMark does not unwind.
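A minimal sketch of the failure mode I suspect, using a toy arena with mark/rollback. SlotArena, ScopedMark, ArenaStack and stack_clobbered_after_unwind are hypothetical names; offsets are used instead of pointers so the demonstration stays well defined:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy bump arena over int slots with rollback support; a hypothetical
// stand-in for a resource area, not HotSpot code.
class SlotArena {
public:
  explicit SlotArena(std::size_t slots) : _mem(slots, 0) {}
  std::size_t allocate(std::size_t n) {
    assert(_top + n <= _mem.size());
    std::size_t off = _top;
    _top += n;
    return off;
  }
  std::size_t top() const { return _top; }
  void rollback(std::size_t mark) { _top = mark; }  // frees everything after mark
  int& at(std::size_t off) { return _mem[off]; }
private:
  std::vector<int> _mem;
  std::size_t _top = 0;
};

// RAII mark: on scope exit the arena is unwound to where it was at construction.
class ScopedMark {
public:
  explicit ScopedMark(SlotArena& a) : _arena(a), _saved(a.top()) {}
  ~ScopedMark() { _arena.rollback(_saved); }
private:
  SlotArena& _arena;
  std::size_t _saved;
};

// Growable stack whose backing storage lives in an arena.
class ArenaStack {
public:
  ArenaStack(SlotArena& a, std::size_t cap)
    : _arena(a), _cap(cap), _base(a.allocate(cap)), _size(0) {}
  void push(int v) {
    if (_size == _cap) grow();
    _arena.at(_base + _size++) = v;
  }
  int at(std::size_t i) { return _arena.at(_base + i); }
private:
  void grow() {
    std::size_t new_cap = _cap * 2;
    std::size_t new_base = _arena.allocate(new_cap);  // may land inside a mark scope
    for (std::size_t i = 0; i < _size; ++i) {
      _arena.at(new_base + i) = _arena.at(_base + i);
    }
    _base = new_base;
    _cap = new_cap;
  }
  SlotArena& _arena;
  std::size_t _cap;
  std::size_t _base;
  std::size_t _size;
};

// Returns true if the stack contents were overwritten after a mark on
// `resource_area` unwound. The stack itself allocates from `stack_arena`.
bool stack_clobbered_after_unwind(SlotArena& stack_arena, SlotArena& resource_area) {
  ArenaStack stack(stack_arena, 2);   // small initial size
  stack.push(1);
  stack.push(2);
  {
    ScopedMark mark(resource_area);
    stack.push(3);                    // forces the stack to grow inside the scope
  }                                   // mark unwinds resource_area here
  std::size_t off = resource_area.allocate(4);  // unrelated later allocation
  for (std::size_t i = 0; i < 4; ++i) resource_area.at(off + i) = -1;
  return stack.at(0) != 1 || stack.at(1) != 2 || stack.at(2) != 3;
}
```

When both arguments are the same arena, the grown storage is released by the mark and reused by the next allocation. When the stack allocates from a separate arena that the mark does not touch, its contents survive, which is the reason for passing the compiler arena in the change above.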