In low-level benchmarking, we sometimes resort to non-inlineable "sink" methods to escape dead-code elimination, like this:

    @Benchmark
    public void test() {
        doNothing(obj);
    }

    @CompilerControl(CompilerControl.Mode.DONT_INLINE)
    public void doNothing(Object obj) {
        // deliberately do nothing
    }

The performance of this method is very important, since we usually deal with nanosecond-scale benchmarks. Ideally, the generated code should contain a "ret" right away. However, the generated code for doNothing contains a prolog followed immediately by an epilog:

    [Verified Entry Point]
     10.93%   6.25%  0x00007f39f415fd80: mov    %eax,-0x14000(%rsp)
      3.76%   3.03%  0x00007f39f415fd87: push   %rbp
      1.92%   1.97%  0x00007f39f415fd88: sub    $0x30,%rsp
     10.42%  10.64%  0x00007f39f415fd8c: add    $0x30,%rsp
      2.88%   3.03%  0x00007f39f415fd90: pop    %rbp
     25.45%  31.68%  0x00007f39f415fd91: test   %eax,0x15df8369(%rip)  # 0x00007f3a09f58100 ; {poll_return}
      0.57%   0.47%  0x00007f39f415fd97: retq

It seems that at least the RSP operations are redundant, as is saving/restoring RBP. It would be interesting to see if we can remove these redundant ops, e.g.:

 *) Peephole MachPrologNode -> MachEpilogNode out completely;
 *) Macro-expand MachProlog/EpilogNode into the individual ops, and then peephole (sub $const, %reg) -> (add $const, %reg) and (push %reg) -> (pop %reg);
 *) Massage frame_size_in_bytes() so that it is zero for an empty method.

Benchmark: http://cr.openjdk.java.net/~shade/8130398/EmptyMethod.java
Runnable JAR: http://cr.openjdk.java.net/~shade/8130398/benchmarks.jar
Output and disassembly: http://cr.openjdk.java.net/~shade/8130398/perfasm.out
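To make the second option (macro-expand, then peephole the matching pairs) more concrete, here is a toy sketch of the pair-cancelling idea in plain Java. It is not HotSpot code: the class PeepholeSketch, its methods, and the representation of instructions as strings are all hypothetical and chosen only for illustration. It also assumes that nothing reads the flags set by the removed add/sub, which a real pass would have to verify.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model only: instructions are plain strings such as "push %rbp".
// All names here are hypothetical; this is not how C2 represents code.
public class PeepholeSketch {

    // Returns true if 'next' exactly undoes 'prev' when the two are adjacent:
    // "push R" then "pop R", or "sub C,R" then "add C,R" with equal operands.
    // Assumption: the condition flags written by add/sub are dead afterwards.
    static boolean cancels(String prev, String next) {
        String[] p = prev.trim().split("\\s+", 2);
        String[] n = next.trim().split("\\s+", 2);
        if (p.length != 2 || n.length != 2) return false; // e.g. "retq"
        if (!p[1].equals(n[1])) return false;             // operands must match
        return (p[0].equals("push") && n[0].equals("pop"))
            || (p[0].equals("sub") && n[0].equals("add"));
    }

    // Single pass with a stack of emitted instructions: whenever the incoming
    // instruction cancels the last emitted one, drop both. Removing an inner
    // pair can make an outer pair adjacent, so nested pairs collapse too.
    static List<String> peephole(List<String> code) {
        Deque<String> out = new ArrayDeque<>();
        for (String insn : code) {
            if (!out.isEmpty() && cancels(out.peekLast(), insn)) {
                out.removeLast();
            } else {
                out.addLast(insn);
            }
        }
        return new ArrayList<>(out);
    }

    public static void main(String[] args) {
        // The instruction sequence from the disassembly in this report.
        List<String> body = List.of(
            "mov %eax,-0x14000(%rsp)",
            "push %rbp",
            "sub $0x30,%rsp",
            "add $0x30,%rsp",
            "pop %rbp",
            "test %eax,0x15df8369(%rip)",
            "retq");
        peephole(body).forEach(System.out::println);
    }
}
```

Run on the sequence from the disassembly, this sketch drops the sub/add and push/pop pairs and leaves only the stack bang (mov), the safepoint poll (test), and retq. Whether the stack bang could also go for an empty frame is a separate question that this sketch does not address.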