Currently, when an array is large (> FastAllocateSizeLimit), C2 calls into the runtime, which performs both the allocation and the zeroing. As a result, when a large array allocation is followed by an arraycopy, the zeroing elimination optimization does not happen.
Add a new runtime call for large arrays that only returns a pointer to the new array without zeroing it. The compiled code can then perform zeroing elimination and use ClearArray to zero the rest of the array. We need to watch out for safepoints/deoptimization on the return from the runtime call, where arrays are expected to be initialized. ClearArray must also be precise, since a large array may be allocated outside a TLAB and be followed by another object.
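
For illustration, here is a minimal sketch of the Java-level pattern this would affect: an allocation immediately followed by an arraycopy that fills a prefix of the new array. The class and method names (LargeArrayCopy, grow) are hypothetical, and the array size is only assumed to be large enough to exceed FastAllocateSizeLimit. With the proposed change, only the tail not covered by the copy would need to be cleared by compiled code. The visible result must stay the same either way: copied elements in the prefix, zeros in the tail.

```java
import java.util.Arrays;

class LargeArrayCopy {
    // Hypothetical helper: allocate a larger array, then copy src into its prefix.
    // The arraycopy overwrites dst[0 .. src.length), so zeroing that range at
    // allocation time is redundant; only dst[src.length .. newLength) needs zeros.
    static int[] grow(int[] src, int newLength) {
        int[] dst = new int[newLength];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    public static void main(String[] args) {
        // Assumption: 1M ints is large enough to take the slow (runtime) allocation path.
        int[] src = new int[1 << 20];
        Arrays.fill(src, 7);

        int[] dst = grow(src, src.length + 16);

        // The copied prefix holds the source values, the tail must still read as zero.
        System.out.println(dst[0] + " " + dst[dst.length - 1]); // prints "7 0"
    }
}
```

A regression test for the change could use the same shape and check every element of the tail, since an imprecise ClearArray would show up either as non-zero tail elements or as corruption of an adjacent object.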