Vectorizing reductions can be expensive for simple expressions because it uses several additional instructions per vector. For example, a simple test shows that only int[] multiplication benefits from this optimization (see below). We need to restrict the reduction optimization to cases where it is beneficial.

On SB (AVX1):

java -XX:-SuperWordReductions -XX:-TieredCompilation -Xbatch -XX:CompileCommand=exclude,Reduction::main Reduction
Warmup ...
Warmup is done in 1160 msec
sum int: 121
sum long: 120
sum float: 362
sum double: 353
mul int: 356
mul long: 357
mul float: 593
mul double: 588

java -XX:+SuperWordReductions -XX:-TieredCompilation -Xbatch -XX:CompileCommand=exclude,Reduction::main Reduction
Warmup ...
Warmup is done in 1216 msec
sum int: 123
sum long: 119
sum float: 436
sum double: 407
mul int: 204
mul long: 349
mul float: 669
mul double: 639

On NHM (AVX2):

java -XX:-SuperWordReductions -XX:-TieredCompilation -Xbatch -XX:CompileCommand=exclude,Reduction::main Reduction
Warmup ...
Warmup is done in 1408 msec
sum int: 139
sum long: 139
sum float: 412
sum double: 412
mul int: 412
mul long: 412
mul float: 686
mul double: 686

java -XX:+SuperWordReductions -XX:-TieredCompilation -Xbatch -XX:CompileCommand=exclude,Reduction::main Reduction
Warmup ...
Warmup is done in 1340 msec
sum int: 123
sum long: 139
sum float: 412
sum double: 412
mul int: 209
mul long: 412
mul float: 686
mul double: 687
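
The source of the Reduction test is not included in this report. For readers unfamiliar with the loop shape being measured, here is a minimal sketch of what such reduction kernels might look like. The class name ReductionSketch, the method names, the array size, and the iteration counts are all hypothetical and are not taken from the actual test; whether C2 vectorizes these loops depends on the JVM build, the CPU, and the flags shown above.

```java
// Hypothetical sketch; NOT the Reduction test used for the numbers in this report.
import java.util.Arrays;

public class ReductionSketch {
    // Add reduction over an int array.
    static int sumInt(int[] a) {
        int s = 0;
        for (int i = 0; i < a.length; i++) {
            s += a[i];
        }
        return s;
    }

    // Multiply reduction over an int array; the product wraps on overflow.
    static int mulInt(int[] a) {
        int p = 1;
        for (int i = 0; i < a.length; i++) {
            p *= a[i];
        }
        return p;
    }

    // Add reduction over a double array.
    static double sumDouble(double[] a) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) {
            s += a[i];
        }
        return s;
    }

    public static void main(String[] args) {
        int[] ints = new int[1024];          // assumed size
        double[] doubles = new double[1024]; // assumed size
        Arrays.fill(ints, 3);
        Arrays.fill(doubles, 1.5);

        long sink = 0; // accumulate results so the loops are not optimized away
        for (int i = 0; i < 20_000; i++) {   // warmup so the kernels get compiled
            sink += sumInt(ints) + mulInt(ints) + (long) sumDouble(doubles);
        }

        long t0 = System.nanoTime();
        for (int i = 0; i < 100_000; i++) sink += sumInt(ints);
        long t1 = System.nanoTime();
        for (int i = 0; i < 100_000; i++) sink += mulInt(ints);
        long t2 = System.nanoTime();
        for (int i = 0; i < 100_000; i++) sink += (long) sumDouble(doubles);
        long t3 = System.nanoTime();

        System.out.println("sum int: " + (t1 - t0) / 1_000_000);
        System.out.println("mul int: " + (t2 - t1) / 1_000_000);
        System.out.println("sum double: " + (t3 - t2) / 1_000_000);
        System.out.println("(sink: " + sink + ")");
    }
}
```

Running a sketch like this once with -XX:-SuperWordReductions and once with -XX:+SuperWordReductions would give a per-kernel comparison of the same kind as the tables above, though the absolute numbers would not be comparable to those reported here.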