While working on JDK-8299179, I found that ArrayFill is only applied if unrolling is somehow disabled or otherwise circumvented. It would be beneficial to prefer ArrayFill over unrolling.
For example, in this case we unroll instead of using ArrayFill, which is surprising to me and should be fixed:
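static int[] intA; // declaration assumed, not shown in the original snippet

static void test() {
    // Note: currently unrolled, not intrinsified (unless -XX:LoopUnrollLimit=1)
    int arr[] = new int[22];
    for (int i = 6; i < 20; i++) {
        arr[i] = 1;
    }
    intA = arr; // publish the array so the stores cannot be eliminated
}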
Suggestion: fix the bug, and add an IR verification that we indeed get a node like this:
260 CallLeafNoFP === 120 1 59 8 9 (258 40 270 1 ) [[ 262 263 ]] # jint_fill void ( NotNull *+bot, int, long, half )
Do this for byte, short, int (long not yet implemented).
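A minimal sketch of the loop shapes such a test would need to cover, one per element type. The class and method names are hypothetical and not taken from the JDK test suite. The program only checks that the results are correct, so it can be run with and without -XX:LoopUnrollLimit=1 to exercise both code paths; verifying that the fill call is actually emitted would still require an IR rule on top of this.

```java
// Hypothetical harness: names are illustrative, not existing JDK tests.
public class PartialFillShapes {
    static byte[] fillBytes() {
        byte[] arr = new byte[22];
        for (int i = 6; i < 20; i++) {
            arr[i] = 1;
        }
        return arr;
    }

    static short[] fillShorts() {
        short[] arr = new short[22];
        for (int i = 6; i < 20; i++) {
            arr[i] = 1;
        }
        return arr;
    }

    static int[] fillInts() {
        int[] arr = new int[22];
        for (int i = 6; i < 20; i++) {
            arr[i] = 1;
        }
        return arr;
    }

    // Expected element value: 1 inside [6, 20), 0 (default) elsewhere.
    static int expected(int i) {
        return (i >= 6 && i < 20) ? 1 : 0;
    }

    // Runs all three fills and compares every element against expected().
    static void verify() {
        byte[] b = fillBytes();
        short[] s = fillShorts();
        int[] n = fillInts();
        for (int i = 0; i < 22; i++) {
            int e = expected(i);
            if (b[i] != e || s[i] != e || n[i] != e) {
                throw new AssertionError("wrong value at index " + i);
            }
        }
    }

    public static void main(String[] args) {
        // Repeat often enough that C2 is likely to compile the fill methods.
        for (int iter = 0; iter < 20_000; iter++) {
            verify();
        }
        System.out.println("ok");
    }
}
```

Keeping one method per element type means each loop can be matched by its own IR rule once the fix is in place.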
Further, there is some commented-out code in PhaseIdealLoop::intrinsify_fill that was supposed to detect whether the fill overwrites the whole array and, if so, remove the zeroing. It is not clear whether this optimization is now done elsewhere or was simply forgotten. It could also be that, since we currently mostly unroll, the unrolled stores then let us detect that the initialization is overwritten, and the zeroing is dropped. Hence, once we prefer ArrayFill over unrolling, we may see a slowdown when filling the whole array, unless the redundant initialization is removed as well.
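To make the whole-array case concrete, here is a minimal sketch in which every element is overwritten right after allocation, so the implicit zeroing is redundant work. The class and method names are hypothetical. Whether C2 drops the zeroing here, with or without the fill intrinsic, is exactly the open question and is not claimed by this example.

```java
// Hypothetical example: names are illustrative only.
public class WholeArrayFill {
    static int[] fillAll(int n) {
        int[] arr = new int[n];      // allocation zeroes all n elements
        for (int i = 0; i < arr.length; i++) {
            arr[i] = 1;              // the loop then overwrites every one of them
        }
        return arr;
    }

    public static void main(String[] args) {
        int[] a = fillAll(22);
        for (int v : a) {
            if (v != 1) {
                throw new AssertionError("element was not filled");
            }
        }
        System.out.println("ok");
    }
}
```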